From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Atom2 <ariel.atom2@web2web.at>
Cc: Jan Beulich <JBeulich@suse.com>, xen-devel@lists.xen.org
Subject: Re: HVM domains crash after upgrade from XEN 4.5.1 to 4.5.2
Date: Sun, 15 Nov 2015 15:12:50 +0000
Message-ID: <5648A0F2.9000302@citrix.com>
In-Reply-To: <5647CE57.50209@web2web.at>
On 15/11/15 00:14, Atom2 wrote:
>
>> Right - it would appear that the USE flag is definitely not what you
>> wanted, and causes bad compilation for Xen. The do_IRQ disassembly
>> you sent is the result of disassembling a whole block of zeroes.
>> Sorry for leading you on a goose chase - the double faults will be
>> the product of bad compilation, rather than anything to do with your
>> specific problem.
> Hi Andrew,
> there's absolutely no need to apologize as it is me who asked for
> help and you who generously stepped in and provided it. I really do
> appreciate your help and it is for me, as the one seeking help, to
> provide all the information you deem necessary and you ask for.
>> However, the final log you sent (dmesg) is using a debug Xen, which
>> is what I was attempting to get you to do originally.
> Next time I know better how to arrive at a debug XEN. It's all about
> learning.
>> We still observe that the VM ends up in 32bit non-paged mode but with
>> an RIP with bit 32 set, which is an invalid state to be in. However,
>> there was nothing particularly interesting in the extra log information.
>>
>> Please can you rerun with "hvm_debug=0xc3f", which will cause far
>> more logging to occur to the console while the HVM guest is running.
>> That might show some hints.
> I haven't done that yet - but please see my next paragraph. If you are
> still interested in this, for whatever reason, I am clearly more than
> happy to rerun with your suggested option and provide that information
> as well.
>> Also, the fact that this occurs just after starting SeaBIOS is
>> interesting. As you have switched versions of Xen, you have also
>> switched hvmloader, which contains the SeaBIOS binary embedded in
>> it. Would you be able to compile both 4.5.1 and 4.5.2 and switch the
>> hvmloader binaries in use? It would be very interesting to see
>> whether the failure is caused by the hvmloader binary or the
>> hypervisor. (With `xl`, you can use
>> firmware_override="/full/path/to/firmware" to override the default
>> hvmloader).
> Your analysis was absolutely spot on. After re-thinking this for a
> moment, I thought going down that route first would make a lot of
> sense as PV guests still do work and one of the differences to HVM
> domUs is that the former do _not_ require SeaBIOS. Looking at my log
> files of installed packages confirmed an upgrade from SeaBIOS 1.7.5 to
> 1.8.2 in the relevant timeframe which obviously had not made it to the
> hvmloader of xen-4.5.1 as I did not re-compile xen after the upgrade
> of SeaBIOS.
>
> So I re-compiled xen-4.5.1 (obviously now using the installed SeaBIOS
> 1.8.2) and the same error as with xen-4.5.2 popped up - and that
> seemed to strongly indicate that there indeed might be an issue with
> SeaBIOS as this probably was the only variable that had changed from
> the original install of xen-4.5.1.
>
> My next step was to downgrade SeaBIOS to 1.7.5 and to re-compile
> xen-4.5.1. Voila, the system was again up and running. While still
> having SeaBIOS 1.7.5 installed, I also re-compiled xen-4.5.2 and ...
> you probably guessed it ... the problem was gone: The system boots up
> with no issues and everything is fine again.
>
> So in a nutshell: There seems to be a problem with SeaBIOS 1.8.2
> preventing HVM domains from successfully starting up. I don't know
> what triggers this, whether it is specific to my hardware, or
> whether something else in my environment is to blame.
>
> In any case, I am again more than happy to provide data / run a few
> tests should you wish to get to the bottom of this.
>
> I do owe you a beer (or any other drink) should you ever be at my
> location (i.e. Vienna, Austria).
>
> Many thanks again for your analysis and your first class support. Xen
> and their people absolutely rock!
Great - so that confirms the issue as a SeaBIOS interaction issue, rather
than a hypervisor regression.

As I said before, I am still certain that a guest should not be able to
get itself into the crashing state (short of a hardware erratum), so I
still suspect that there is a latent hypervisor emulation bug which has
been tickled by the SeaBIOS update.

Would you mind running the bad hvmloader on Xen 4.5.2 with
hvm_debug=0xc3f? I am still hoping that it will shed some light on
SeaBIOS' actions just leading up to the crash.
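
For reference, hvm_debug is a bitmap of debug logging categories, so
0xc3f turns on several of them at once. A minimal Python sketch for
seeing which bit positions that value sets (the helper name set_bits is
only an illustration, not part of Xen; which logging category each bit
selects is defined in the hypervisor source and is not reproduced here):

```python
# Value suggested in this thread for the hvm_debug option.
HVM_DEBUG = 0xc3f


def set_bits(mask):
    """Return the positions of the bits set in mask, lowest first.

    Hypothetical helper for illustration only; it does plain bit
    arithmetic and knows nothing about Xen's logging categories.
    """
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


if __name__ == "__main__":
    print("hvm_debug=%#x sets bits: %s" % (HVM_DEBUG, set_bits(HVM_DEBUG)))
```

The same helper can be used to compare two candidate masks if a
narrower set of logging turns out to be enough.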

Are you able to experiment with newer versions of Xen? It would be
interesting to see whether the issue is still present in Xen 4.6.

~Andrew