* computer stalls instead of reboot
@ 2011-09-08 23:15 Sven Köhler
2011-09-08 23:17 ` Andrew Cooper
0 siblings, 1 reply; 11+ messages in thread
From: Sven Köhler @ 2011-09-08 23:15 UTC (permalink / raw)
To: xen-devel
Hi,
I'm still seeing a very strange issue here. First, let's clarify that
the issue has never occurred with the good old xen 3.x and the good old
2.6.18 kernel.
So the issue is, that with xen 4.x (including 4.1.1) pretty much any
kernel (including kernel from [1] and vanilla 3.0.0, didn't test the
2.6.18), the machine freezes during a reboot. The machine won't come up
again, not even the BIOS screen will show.
It doesn't happen when running the kernel on bare metal. Also the fact
that it doesn't happen with xen 3.x + 2.6.18 might mean that this is a
regression of some sort.
This issue has prevented my move from xen 3.x to xen 4.x for many years
now. I already asked about this issue, and nobody replied. So I hoped
that the kernel from kernel.org would solve this once pvops dom0-enabled
kernels were available. Well, it didn't. I'm still stuck with
this issue. Every time I want to reboot my machine, I have to call my
hosting company to reboot the server.
It's a MSI X58 Pro-E (MS-7522) motherboard, equipped with Intel Core i7
920 and an nvidia graphics card.
Did anybody ever experience a similar issue?
Does anybody have any suggestions how to continue?
This seems to be a very weird issue, and even pushing the computer's
reset button doesn't seem to help. (It's a remote machine, and I can
remotely push the reset button).
I have already updated the BIOS, and disabled virtualization (only
paravirt domUs). However, no improvement.
Kind Regards,
Sven
[1] http://code.google.com/p/gentoo-xen-kernel/
* Re: computer stalls instead of reboot
2011-09-08 23:15 computer stalls instead of reboot Sven Köhler
@ 2011-09-08 23:17 ` Andrew Cooper
2011-09-08 23:40 ` Sven Köhler
0 siblings, 1 reply; 11+ messages in thread
From: Andrew Cooper @ 2011-09-08 23:17 UTC (permalink / raw)
To: xen-devel
On 09/09/2011 00:15, Sven Köhler wrote:
> Hi,
>
> I'm still seeing a very strange issue here. First, let's clarify that
> the issue has never occurred with the good old xen 3.x and the good old
> 2.6.18 kernel.
>
> So the issue is, that with xen 4.x (including 4.1.1) pretty much any
> kernel (including kernel from [1] and vanilla 3.0.0, didn't test the
> 2.6.18), the machine freezes during a reboot. The machine won't come up
> again, not even the BIOS screen will show.
>
> It doesn't happen when running the kernel on bare metal. Also the fact
> that it doesn't happen with xen 3.x + 2.6.18 might mean that this is a
> regression of some sort.
>
> This issue has prevented my move from xen 3.x to xen 4.x for many years
> now. I already asked about this issue, and nobody replied. So I hoped
> that the kernel from kernel.org would solve this once pvops dom0-enabled
> kernels were available. Well, it didn't. I'm still stuck with
> this issue. Every time I want to reboot my machine, I have to call my
> hosting company to reboot the server.
>
> It's a MSI X58 Pro-E (MS-7522) motherboard, equipped with Intel Core i7
> 920 and an nvidia graphics card.
>
> Did anybody ever experience a similar issue?
> Does anybody have any suggestions how to continue?
>
> This seems to be a very weird issue, and even pushing the computer's
> reset button doesn't seem to help. (It's a remote machine, and I can
> remotely push the reset button).
Does your "remote" method involve actually pushing the reset button, and
does this method actually work under normal circumstances?
As for the problem itself, do you have C states enabled in the BIOS?
This sounds similar to several errata published for the i7 series.
~Andrew
>
> I have already updated the BIOS, and disabled virtualization (only
> paravirt domUs). However, no improvement.
>
>
> Kind Regards,
> Sven
>
>
> [1] http://code.google.com/p/gentoo-xen-kernel/
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
* Re: computer stalls instead of reboot
2011-09-08 23:17 ` Andrew Cooper
@ 2011-09-08 23:40 ` Sven Köhler
2011-09-09 12:01 ` Andrew Cooper
0 siblings, 1 reply; 11+ messages in thread
From: Sven Köhler @ 2011-09-08 23:40 UTC (permalink / raw)
To: xen-devel
On 09.09.2011 01:17, Andrew Cooper wrote:
> Does your "remote" method involve actually pushing the reset button, and
> does this method actually work under normal circumstances?
I think, there is a device connected to the connector on the
motherboard, to which the reset button would normally be attached.
> As for the problem itself, do you have C states enabled in the BIOS?
> This sounds similar to several errata published for the i7 series.
I'm not sure how to tell whether C states are disabled/enabled.
What would those BIOS options typically be called?
Also, should I enable or disable them, in order to workaround those
errata that you mention?
Should those errors have occurred with xen 3.x as well, if those were a
result of the errata you mention?
Regards,
Sven
* Re: Re: computer stalls instead of reboot
2011-09-08 23:40 ` Sven Köhler
@ 2011-09-09 12:01 ` Andrew Cooper
2011-09-09 12:55 ` Sven Köhler
2011-09-11 0:57 ` Sven Köhler
0 siblings, 2 replies; 11+ messages in thread
From: Andrew Cooper @ 2011-09-09 12:01 UTC (permalink / raw)
To: xen-devel
On 09/09/11 00:40, Sven Köhler wrote:
> On 09.09.2011 01:17, Andrew Cooper wrote:
>> Does your "remote" method involve actually pushing the reset button, and
>> does this method actually work under normal circumstances?
> I think, there is a device connected to the connector on the
> motherboard, to which the reset button would normally be attached.
Ok, in which case the state your computer is getting into is a very
broken state, if the reset button is not working.
>> As for the problem itself, do you have C states enabled in the BIOS?
>> This sounds similar to several errata published for the i7 series.
> I'm not sure how to tell whether C states are disabled/enabled.
> What would those BIOS options typically be called?
That is too bios dependent to say for sure, but typically "C states" or
"deep sleep", with some intel ones going for "C1e"
> Also, should I enable or disable them, in order to workaround those
> errata that you mention?
They should be disabled. The errata state that there are several
situations when moving in and out of deep C states which cause
processors to lock up irreparably.
> Should those errors have occurred with xen 3.x as well, if those were a
> result of the errata you mention?
The power management code has changed quite a lot between 3.x and 4.x,
so it is quite possible that xen 3.x just managed to miss these errata.
>
> Regards,
> Sven
>
>
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
* Re: computer stalls instead of reboot
2011-09-09 12:01 ` Andrew Cooper
@ 2011-09-09 12:55 ` Sven Köhler
2011-09-09 13:31 ` Heiko Wundram
2011-09-11 0:57 ` Sven Köhler
1 sibling, 1 reply; 11+ messages in thread
From: Sven Köhler @ 2011-09-09 12:55 UTC (permalink / raw)
To: xen-devel
On 09.09.2011 14:01, Andrew Cooper wrote:
> On 09/09/11 00:40, Sven Köhler wrote:
>> On 09.09.2011 01:17, Andrew Cooper wrote:
>>> Does your "remote" method involve actually pushing the reset button, and
>>> does this method actually work under normal circumstances?
>> I think, there is a device connected to the connector on the
>> motherboard, to which the reset button would normally be attached.
>
> Ok, in which case the state your computer is getting into is a very
> broken state, if the reset button is not working.
>
>>> As for the problem itself, do you have C states enabled in the BIOS?
>>> This sounds similar to several errata published for the i7 series.
>> I'm not sure how to tell whether C states are disabled/enabled.
>> What would those BIOS options typically be called?
>
> That is too bios dependent to say for sure, but typically "C states" or
> "deep sleep", with some intel ones going for "C1e"
>
>> Also, should I enable or disable them, in order to workaround those
>> errata that you mention?
>
> They should be disabled. The errata state that there are several
> situations when moving in and out of deep C states which cause
> processors to lock up irreparably.
Thanks for your help so far. I will try to disable the C-states as soon
as I have the time.
One more thing: are you aware of any way for telling from inside dom0
whether these C-states are enabled/disabled? Is the kernel or the xen
hypervisor able to tell whether they are active?
Also, is there any xen hypervisor command line option to disable the use
of them?
Regards,
Sven
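
On the two questions above (telling from dom0 whether C-states are in
use, and limiting them from the hypervisor command line), a minimal
sketch follows. It assumes the Xen boot options `max_cstate=` and
`cpuidle=` are available in the version in use, and that the hypervisor
command line has already been obtained as a string (for example from the
`xen_commandline` field that `xl info` prints on newer toolstacks).
Neither assumption is confirmed in this thread. The function name
`cstate_limits` and the sample command line are made up for illustration.

```python
# Values treated as "false" for a boolean option. This mirrors the usual
# Xen boolean syntax as far as I know; treat it as an assumption.
FALSE_WORDS = {"0", "no", "off", "false", "disable"}


def cstate_limits(xen_cmdline):
    """Report C-state related limits found on a Xen command line string.

    Returns a dict with:
      cpuidle    -> True/False if the option is present, else None
      max_cstate -> int if the option is present, else None
    """
    limits = {"cpuidle": None, "max_cstate": None}
    for token in xen_cmdline.split():
        key, sep, value = token.partition("=")
        if key == "max_cstate" and value.isdigit():
            limits["max_cstate"] = int(value)
        elif key == "cpuidle":
            # A bare "cpuidle" counts as enabled.
            limits["cpuidle"] = value.lower() not in FALSE_WORDS if sep else True
        elif key == "no-cpuidle":
            limits["cpuidle"] = False
    return limits


if __name__ == "__main__":
    # Hypothetical command line, not taken from the machine in this thread.
    print(cstate_limits("console=vga max_cstate=1 cpuidle=0"))
```

A result of `None` for both keys only means the command line imposes no
limit; it says nothing about what the BIOS allows. If the `xenpm` tool is
installed, `xenpm get-cpuidle-states` should show the run-time view.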
* Re: Re: computer stalls instead of reboot
2011-09-09 12:55 ` Sven Köhler
@ 2011-09-09 13:31 ` Heiko Wundram
2011-09-09 13:45 ` Sven Köhler
2011-09-11 0:55 ` Sven Köhler
0 siblings, 2 replies; 11+ messages in thread
From: Heiko Wundram @ 2011-09-09 13:31 UTC (permalink / raw)
To: xen-devel
On 09.09.2011 14:55, Sven Köhler wrote:
> <snip>
Without knowing much of the previous discussion: is this related to
Hetzner-servers (from seeing the mainboard type, I can only guess that
it's a machine from the "new" Hetzner-series)?
If that's the case, use: "acpi=off" on the Dom0-kernel commandline (I
use a Gentoo-adapted xen-sources-2.6.38 [rebased SuSE-Dom0-kernel]),
that should solve the reboot problems. IIRC somewhere on the
Hetzner-site they mention this, too. No reboot hangs/problems for me
after that.
--
--- Heiko.
* Re: computer stalls instead of reboot
2011-09-09 13:31 ` Heiko Wundram
@ 2011-09-09 13:45 ` Sven Köhler
2011-09-09 13:52 ` Heiko Wundram
2011-09-11 0:55 ` Sven Köhler
1 sibling, 1 reply; 11+ messages in thread
From: Sven Köhler @ 2011-09-09 13:45 UTC (permalink / raw)
To: xen-devel
On 09.09.2011 15:31, Heiko Wundram wrote:
> On 09.09.2011 14:55, Sven Köhler wrote:
>> <snip>
>
> Without knowing much of the previous discussion: is this related to
> Hetzner-servers (from seeing the mainboard type, I can only guess that
> it's a machine from the "new" Hetzner-series)?
Yes it is a Hetzner machine!
> If that's the case, use: "acpi=off" on the Dom0-kernel commandline (I
> use a Gentoo-adapted xen-sources-2.6.38 [rebased SuSE-Dom0-kernel]),
> that should solve the reboot problems. IIRC somewhere on the
> Hetzner-site they mention this, too. No reboot hangs/problems for me
> after that.
I will try acpi=off as soon as possible.
I wonder what the disadvantages are.
The hypervisor will still regulate CPU frequency, will it not?
Also, is the dom0 kernel doing something that it shouldn't do?
(maybe something that collides with the ACPI-related activities of the
hypervisor, if there are any?)
Regards,
Sven
* Re: Re: computer stalls instead of reboot
2011-09-09 13:45 ` Sven Köhler
@ 2011-09-09 13:52 ` Heiko Wundram
2011-09-09 14:13 ` Sven Köhler
0 siblings, 1 reply; 11+ messages in thread
From: Heiko Wundram @ 2011-09-09 13:52 UTC (permalink / raw)
To: xen-devel
On 09.09.2011 15:45, Sven Köhler wrote:
> I wonder what the disadvantages are.
> The hypervisor will still regulate CPU frequency, will it not?
No, it will not.
> Also, is the dom0 kernel doing something that it shouldn't do?
> (maybe something that collides with the ACPI-related activities of the
> hypervisor, if there are any?)
I guess the BIOS is simply reporting broken ACPI tables to the operating
system (the board is a "consumer" board, so you can guess that the
manufacturer only tests the ACPI-tables for compatibility with Windows).
The ACPI tables (AFAIK, someone correct me) also contain a method for
rebooting the system, which simply doesn't work/is broken when Xen is
involved. Forcing acpi=off means that the normal triple-fault or
kbd-controller reset machinery is always used, as ACPI isn't even
initialized.
What struck me as odd, though: you can configure Linux to use "some
other" form of hard reset through a kernel parameter, but setting that
to explicitly use triple-faults didn't work, either (same hangs), so
possibly it's some form of additional interaction between Xen, the board
and the hypervisor. Anyway, the Hetzner "recommended" fix is just what I
sent you, and I can confirm that works.
--
--- Heiko.
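
The reset mechanisms named above can be written down as a small lookup.
This is a sketch that assumes the "kernel parameter" mentioned is Linux's
x86 `reboot=` option, which the message does not name. The single-letter
codes are the ones listed in the kernel's kernel-parameters documentation;
the helper `reboot_param` is hypothetical.

```python
# Reset method -> first letter accepted by the x86 "reboot=" parameter.
REBOOT_CODES = {
    "acpi": "a",    # write to the reset register described by ACPI
    "kbd": "k",     # pulse the reset line via the keyboard controller
    "triple": "t",  # force a CPU triple fault
    "bios": "b",    # go through the BIOS reboot path
    "pci": "p",     # write to the reset control port 0xCF9
    "efi": "e",     # call the EFI runtime reset service
}


def reboot_param(method):
    """Build a "reboot=<code>" token for the dom0 kernel command line."""
    try:
        return "reboot=" + REBOOT_CODES[method]
    except KeyError:
        raise ValueError("unknown reboot method: %r" % (method,))


if __name__ == "__main__":
    # The variant reported above as still hanging on this board.
    print(reboot_param("triple"))
    # The other non-ACPI fallback mentioned in the message.
    print(reboot_param("kbd"))
```

Whether any of these helps under Xen is a separate matter, since the
hypervisor may perform the final reset itself rather than the dom0 kernel.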
* Re: computer stalls instead of reboot
2011-09-09 13:52 ` Heiko Wundram
@ 2011-09-09 14:13 ` Sven Köhler
0 siblings, 0 replies; 11+ messages in thread
From: Sven Köhler @ 2011-09-09 14:13 UTC (permalink / raw)
To: xen-devel
On 09.09.2011 15:52, Heiko Wundram wrote:
> On 09.09.2011 15:45, Sven Köhler wrote:
>> I wonder what the disadvantages are.
>> The hypervisor will still regulate CPU frequency, will it not?
>
> No, it will not.
In xen 3.x, the hypervisor did the cpufreq-like CPU frequency switching.
Has this changed in xen 4.x, so that the dom0 kernel is now responsible?
>> Also, is the dom0 kernel doing something that it shouldn't do?
>> (maybe something that collides with the ACPI-related activities of the
>> hypervisor, if there are any?)
>
> I guess the BIOS is simply reporting broken ACPI tables to the operating
> system (the board is a "consumer" board, so you can guess that the
> manufacturer only tests the ACPI-tables for compatibility with Windows).
>
> The ACPI tables (AFAIK, someone correct me) also contain a method for
> rebooting the system, which simply doesn't work/is broken when Xen is
> involved. Forcing acpi=off means that the normal triple-fault or
> kbd-controller reset machinery is always used, as ACPI isn't even
> initialized.
>
> What struck me as odd, though: you can configure Linux to use "some
> other" form of hard reset through a kernel parameter, but setting that
> to explicitly use triple-faults didn't work, either (same hangs), so
> possibly it's some form of additional interaction between Xen, the board
> and the hypervisor. Anyway, the Hetzner "recommended" fix is just what I
> sent you, and I can confirm that works.
Thanks for the explanation.
Here's another thing: why does rebooting work, if xen is not involved,
i.e. if the same kernel runs without xen? (I'm pretty sure this was true
the last time I tried) I would assume, that broken ACPI tables would
result in no reboot no matter what.
Also, does the dom0 kernel do the reboot, or the hypervisor?
In the past, there were some reboot/poweroff related patches for the xen
part of the kernel. I assumed that the dom0 kernel is not using the
"normal" reboot/poweroff code and instead instructs the hypervisor
to reboot/poweroff the machine.
On the other hand, all the patches that went into linux 3.0 which were
aimed at making poweroff/reboot as similar to windows as possible
sounded promising, but didn't help in the Hetzner case :-(
Regards,
Sven
* Re: computer stalls instead of reboot
2011-09-09 13:31 ` Heiko Wundram
2011-09-09 13:45 ` Sven Köhler
@ 2011-09-11 0:55 ` Sven Köhler
1 sibling, 0 replies; 11+ messages in thread
From: Sven Köhler @ 2011-09-11 0:55 UTC (permalink / raw)
To: xen-devel
On 09.09.2011 15:31, Heiko Wundram wrote:
> On 09.09.2011 14:55, Sven Köhler wrote:
>> <snip>
>
> Without knowing much of the previous discussion: is this related to
> Hetzner-servers (from seeing the mainboard type, I can only guess that
> it's a machine from the "new" Hetzner-series)?
>
> If that's the case, use: "acpi=off" on the Dom0-kernel commandline (I
> use a Gentoo-adapted xen-sources-2.6.38 [rebased SuSE-Dom0-kernel]),
> that should solve the reboot problems.
In the wiki, I found the use of acpi=off in the xen command line.
Well, I have tried acpi=off on the xen command line and/or the dom0
kernel command line. The kernel would refuse to boot (see the other
thread I started). So to anyone who's using upstream kernels:
don't even bother trying acpi=off
Regards,
Sven
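
Since the message above distinguishes the xen command line from the dom0
kernel command line, here is a sketch of where each one lives in a GRUB
legacy (menu.lst) entry: options after the hypervisor image on the
`kernel` line go to Xen, options after the dom0 kernel on the first
`module` line go to dom0. The bootloader, the file paths and the helper
`xen_grub_entry` are assumptions for illustration; the thread does not
say which bootloader is in use.

```python
def xen_grub_entry(title, xen_image, xen_opts, dom0_image, dom0_opts, initrd=None):
    """Render a GRUB legacy menu.lst entry that boots Xen with a dom0 kernel."""
    lines = [
        "title " + title,
        # Everything after the hypervisor image is the xen command line.
        ("kernel %s %s" % (xen_image, xen_opts)).rstrip(),
        # Everything after the dom0 kernel image is the dom0 command line.
        ("module %s %s" % (dom0_image, dom0_opts)).rstrip(),
    ]
    if initrd:
        lines.append("module " + initrd)
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical paths. The option shown is the one tried above, which
    # reportedly kept an upstream dom0 kernel from booting.
    print(xen_grub_entry(
        title="Xen / Linux dom0",
        xen_image="/boot/xen.gz",
        xen_opts="",
        dom0_image="/boot/vmlinuz",
        dom0_opts="root=/dev/sda1 acpi=off",
    ))
```

Keeping the two option strings as separate arguments makes it harder to
put a dom0-only option on the hypervisor line by accident, or the reverse.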
* Re: computer stalls instead of reboot
2011-09-09 12:01 ` Andrew Cooper
2011-09-09 12:55 ` Sven Köhler
@ 2011-09-11 0:57 ` Sven Köhler
1 sibling, 0 replies; 11+ messages in thread
From: Sven Köhler @ 2011-09-11 0:57 UTC (permalink / raw)
To: xen-devel
On 09.09.2011 14:01, Andrew Cooper wrote:
>>> As for the problem itself, do you have C states enabled in the BIOS?
>>> This sounds similar to several errata published for the i7 series.
>> I'm not sure how to tell whether C states are disabled/enabled.
>> What would those BIOS options typically be called?
>
> That is too bios dependent to say for sure, but typically "C states" or
> "deep sleep", with some intel ones going for "C1e"
I found two BIOS options for C1E and C2, C3, etc.
I disabled both options. So C-states should have been disabled. However,
the issue recurred. So it's not C-state related.
Regards,
Sven