From: Wei Liu <wei.liu2@citrix.com>
To: Sander Eikelenboom <linux@eikelenboom.it>
Cc: Doug Goldstein <cardoe@cardoe.com>, Wei Liu <wei.liu2@citrix.com>,
	Jan Beulich <JBeulich@suse.com>,
	xen-devel@lists.xen.org
Subject: Re: Regression between Xen 4.6.0 and 4.7.0, Direct kernel boot on a qemu-xen and seabios HVM guest doesn't work anymore.
Date: Tue, 25 Oct 2016 18:26:27 +0100
Message-ID: <20161025172627.GF30231@citrix.com>
In-Reply-To: <65a0fbdfb2a2861aa53f8262d7ea1786@eikelenboom.it>

On Tue, Oct 25, 2016 at 07:25:06PM +0200, Sander Eikelenboom wrote:
> On 2016-10-25 16:49, Wei Liu wrote:
> >On Tue, Oct 25, 2016 at 01:37:45PM +0200, Sander Eikelenboom wrote:
> >>
> >>Tuesday, October 25, 2016, 1:24:12 PM, you wrote:
> >>
> >>> On Tue, Oct 18, 2016 at 01:48:23PM +0100, Wei Liu wrote:
> >>>> On Mon, Oct 17, 2016 at 05:28:17PM +0200, Sander Eikelenboom wrote:
> >>>> > Thursday, October 13, 2016, 4:43:31 PM, you wrote:
> >>>> >
> >>>> > > Hi Jan / Wei,
> >>>> >
> >>>> > > Took a while before i had the chance to fiddle some more to find the actual culprit.
> >>>> > > After analyzing the output of xl -vvvvv create somewhat more i came to the
> >>>> > > insight it was probably Qemu and not Xen causing the fault.
> >>>> >
> >>>> > > As a test I just used a qemu-xen binary build with xen-4.6.0 booting up a guest with
> >>>> > > direct kernel boot mode on xen-unstable. And that old qemu binary works fine.
> >>>> >
> >>>> > > After testing i can conclude, Jan was right, the bisection was a red herring,
> >>>> > > the problem is caused by some change in Qemu and not by something in the Xen tree.
> >>>> > > (strange thing is that for as far as i know i did a "make distclean" between
> >>>> > > every build (taking a lot of time), which should have pulled a fresh qemu-xen
> >>>> > > tree and therefore the bisection should have led to a commit with a Config.mk
> >>>> > > hash change for qemu-xen version.)
> >>>> >
> >>>> > > Will see if i can find some more time and bisect qemu and find the culprit.
> >>>> >
> >>>> > > --
> >>>> > > Sander
> >>>> >
> >>>> >
> >>>> > Unfortunately i have to give up on this issue, for me it's impossible to bisect this
> >>>> > issue with my present git-foo.
> >>>> >
> >>>> > The first try with bisection of the whole xen-tree seems to have hit the issue that the
> >>>> > qemu-revision that gets pulled on a fresh build is "master" during the whole
> >>>> > dev period. That creates havoc when trying to bisect, since you are testing
> >>>> > combinations that were never developed (nor auto tested) in that combination
> >>>> > (especially when a xen-tree and qemu-tree change have a dependency like Roger's
> >>>> > "xen: fix usage of xc_domain_create in domain builder")
> >>>> >
> >>>> > While trying to bisect only qemu (keeping xen itself on RELEASE-4.6.0 and
> >>>> > seabios on rel-1.8.2) it gets stuck on issues with that tree.
> >>>> > (Between 4.6.0 and 4.7.0 the qemu tree switched from git://xenbits.xen.org/qemu-upstream-4.6-testing.git
> >>>> > to git://xenbits.xen.org/qemu-xen.git.) After that there seem to have
> >>>> > been a lot of merges going back and forth and to me it seems a mess (but as i
> >>>> > said it could also be a lack of git-foo). I tried by manual bisecting, removing
> >>>> > and cloning trees again etc. but that doesn't suffice, it's all going no-where.
> >>>> > (while the known good build (plain RELEASE-4.6.0) always works, so it doesn't
> >>>> > seem to be some random problem)
> >>>> >
> >>>>
> >>>> Thanks for trying.
> >>>>
> >>>> > So perhaps some dev can at least verify that the issue is there (since 4.7.0)
> >>>> > and put it on the "known broken" list of things.
> >>>> >
> >>>>
> >>>> I will put this into the list of things I need to look at.
> >>>>
> >>
> >>> I investigated this a bit. The root cause is the memory accounting is
> >>> wrong in QEMU. It would try to allocate more ram than allowed. I haven't
> >>> tried to figure out exactly what is wrong, though.
> >>
> >>That confirms what i was thinking in the end, but bisecting the qemu-tree
> >>changes between the xen-4.6.0 and xen-4.7.0 release proved to be pretty
> >>difficult as i explained. So if you have a hunch as to what code it
> >>resides in, debugging instead of bisecting would probably be better.
> >>(so one of the questions is what changes in the memory accounting when you
> >>supply the kernel from the host instead of the guest, since booting a kernel
> >>with grub from within the guest doesn't give any memory accounting issues.)
> >>
> >>Thanks for investigating !
> >
> >I think I hunted down the offending function.
> >
> >Mind trying this patch for me?
> 
> Hi Wei,
> 
> This seems to help :)
> 
> With a linux 4.8 kernel the HVM guest now boots fine with direct kernel boot!
> 
> But there seems to be a gotcha which i think is not in the Xen docs/wiki:
> when trying a linux 4.3 kernel the guest still didn't boot and i got a:
> "qemu: linux kernel too old to load a ram disk" in the qemu log.
> I don't know what qemu regards as "old" in this case.
> 

QEMU checks for a signature / version in the kernel header or whatnot. I
can't tell why that specific number is chosen, though.
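
For what it's worth, my reading of the loader (an assumption, not something
I have double-checked against the tree you built) is that QEMU looks for the
"HdrS" magic at offset 0x202 of the image and the 16-bit boot protocol
version at 0x206, as laid out in the Linux x86 boot protocol, and refuses an
initrd when that version is below 2.00. A minimal Python sketch to see what
a given image advertises; the function names and the threshold constant are
mine, not QEMU's:

```python
import struct

HDRS_OFFSET = 0x202          # "HdrS" magic, per the Linux x86 boot protocol
VERSION_OFFSET = 0x206       # 16-bit little-endian boot protocol version
MIN_INITRD_PROTOCOL = 0x200  # assumed threshold, from my reading of QEMU


def boot_protocol_version(image: bytes) -> int:
    """Return the boot protocol version, or 0 if the "HdrS" magic is absent."""
    if len(image) < VERSION_OFFSET + 2:
        return 0
    if image[HDRS_OFFSET:HDRS_OFFSET + 4] != b"HdrS":
        return 0
    (version,) = struct.unpack_from("<H", image, VERSION_OFFSET)
    return version


def can_load_initrd(image: bytes) -> bool:
    """True if the image advertises a protocol new enough for a ram disk."""
    return boot_protocol_version(image) >= MIN_INITRD_PROTOCOL
```

Feeding it the first 0x400 bytes of the 4.3 image should tell whether the
header is there at all. A result of 0 would suggest the file handed to QEMU
is not a bzImage, rather than the kernel release itself being too old.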

> Another consideration: would it be worthwhile to add an OSStest for direct
> kernel boot?
> (under the assumption that the host kernel that gets built can also boot in an
> HVM guest, it's probably a very cheap test not requiring any additional
> builds.)

Yes, definitely. The more tests, the merrier.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
