From: Paul Durrant <Paul.Durrant@citrix.com>
To: 'Juergen Gross' <jgross@suse.com>, Jan Beulich <JBeulich@suse.com>
Cc: "xen-devel (xen-devel@lists.xenproject.org)"
<xen-devel@lists.xenproject.org>,
"Julien Grall (julien.grall@arm.com)" <julien.grall@arm.com>,
'Boris Ostrovsky' <boris.ostrovsky@oracle.com>
Subject: Re: debian stretch dom0 + xen 4.9 fails to boot
Date: Wed, 7 Jun 2017 09:05:06 +0000
Message-ID: <ad450ab0147147429a46cd7382a17c19@AMSPEX02CL03.citrite.net>
In-Reply-To: <e9772a31-a3c0-6994-2745-219e6b0948f8@suse.com>

> -----Original Message-----
> From: Juergen Gross [mailto:jgross@suse.com]
> Sent: 07 June 2017 10:03
> To: Jan Beulich <JBeulich@suse.com>; Paul Durrant
> <Paul.Durrant@citrix.com>
> Cc: Julien Grall (julien.grall@arm.com) <julien.grall@arm.com>; xen-devel
> (xen-devel@lists.xenproject.org) <xen-devel@lists.xenproject.org>; 'Boris
> Ostrovsky' <boris.ostrovsky@oracle.com>
> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
>
> On 07/06/17 10:27, Jan Beulich wrote:
> >>>> On 07.06.17 at 10:07, <Paul.Durrant@citrix.com> wrote:
> >>> -----Original Message-----
> >>> From: Boris Ostrovsky [mailto:boris.ostrovsky@oracle.com]
> >>> Sent: 06 June 2017 18:00
> >>> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
> >>> <JBeulich@suse.com>
> >>> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
> >>> devel@lists.xenproject.org>
> >>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> >>>
> >>> On 06/06/2017 12:28 PM, Paul Durrant wrote:
> >>>>> -----Original Message-----
> >>>>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf
> Of
> >>>>> Paul Durrant
> >>>>> Sent: 06 June 2017 16:52
> >>>>> To: 'Jan Beulich' <JBeulich@suse.com>
> >>>>> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
> >>>>> devel@lists.xenproject.org>
> >>>>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
> >>>>>> Sent: 06 June 2017 16:11
> >>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
> >>>>>> Cc: xen-devel (xen-devel@lists.xenproject.org) <xen-
> >>>>>> devel@lists.xenproject.org>
> >>>>>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> >>>>>>
> >>>>>>>>> On 06.06.17 at 16:32, <Paul.Durrant@citrix.com> wrote:
> >>>>>>> I've been having fun setting up a new test rig...
> >>>>>>>
> >>>>>>> I have a skull canyon NUC and I put debian stretch (rc4) on it (so
> that's a
> >>>>>>> 4.9 kernel) and then tried building and installing the latest Xen
> staging-
> >>> 4.9
> >>>>>>> code. The system failed to boot... basically it got stuck before even
> >>>>>>> managing to get sufficiently into Xen to spit out anything on the
> >>> console.
> >>>>>>> Xen 4.8 OTOH booted just fine so I started bisecting and after 14
> >>>>> iterations
> >>>>>>> I got down to the following commit as being the problem:
> >>>>>>>
> >>>>>>> commit c0655e492e6b33e26ec9cd33f59725d0db89cdd0
> >>>>>>> Author: Juergen Gross <jgross@suse.com>
> >>>>>>> Date: Fri Mar 24 14:18:54 2017 +0100
> >>>>>>>
> >>>>>>> x86: split boot trampoline into permanent and temporary part
> >>>>>>>
> >>>>>>> The hypervisor needs a trampoline in low memory for early boot
> and
> >>>>>>> later for bringing up cpus and during wakeup from suspend.
> Today
> >>> this
> >>>>>>> trampoline is kept completely even if most of it isn't needed
> later.
> >>>>>>>
> >>>>>>> Split the trampoline into a permanent part and a temporary part
> >>>>> needed
> >>>>>>> at early boot only. Introduce a new entry at the boundary.
> >>>>>>>
> >>>>>>> Reduce the stack for wakeup code in order for the permanent
> >>>>>>> trampoline to fit in a single page. 4k of stack seems excessive,
> about
> >>>>>>> 3k should be more than enough.
> >>>>>>>
> >>>>>>> Add an ASSERT() to the linker script to ensure the wakeup stack is
> >>>>>>> always at least 3k.
> >>>>>>>
> >>>>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
> >>>>>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> >>>>>>>
> >>>>>>> To verify this I checked out master, reverted that commit, and tried
> >>> again.
> >>>>>>> The NUC still booted fine.
> >>>>>> Well, interesting, but I don't think it is very realistic to expect any
> >>>>>> fix with just the information you supply. There must be something
> >>>>>> rather special about that system, and likely it would help if we
> >>>>>> knew what that is. E.g. an unusual E820 map. Worse would be if
> >>>>>> they used memory outside of properly marked E820 regions in a
> >>>>>> way colliding with what we do.
> >>>>>>
> >>>>>> Otherwise I'm afraid we need to hope for you to debug the issue.
> >>>>>>
> >>>>> Yes, I was posting this more as a heads-up for the moment, so that 4.9
> does
> >>> not
> >>>>> go out with this regression.
> >>>>>
> >>>>> I will try to figure out what is going on... My initial thoughts on looking
> >> at
> >>> what
> >>>>> the patch does are that it may be something to do with the fact I am
> using
> >>> a
> >>>>> vga console rather than a serial one. I need to try another 4.9 on
> another
> >>>>> system (gigabyte brix) to see if the problem manifests there too. I'll
> also
> >>> have
> >>>>> to play with the BIOS settings on the skull canyon.
> >>>>>
> >>>> The problem definitely doesn't manifest on the brix, so the next theory
> is
> >>> that it is something to do with the BIOS of the skull canyon.
> >>>>
> >>>
> >>>
> >>> FWIW, one of the machines in our test farm choked on this very patch. I
> >>> don't remember details now but essentially it turned out that syslinux
> >>> (we are pxe-booting) could not handle changes in ELF sections layout
> >>> (the way syslinux calculated how to load the binary into memory
> resulted
> >>> in overlap of some sort).
> >>>
> >>> I hacked it (mboot.c32 specifically) to work around this but never came
> >>> up with a proper solution.
> >>>
> >>
> >> In my case it was grub2... and thinking about it I am running an older
> >> version on the brix so I guess it may still manifest there if I update.
> >> Either way it sounds like it may be better to revert the patch until the
> >> issue is better understood.
> >
> > I'm not sure if we could simply revert this one patch - it's the first of a
> > 3-patch series. At the first glance I can't really see any dependency
> > of the later two patches on it, but then again I seem to recall that the
> > split was a prereq. Adding Jürgen.
>
> I think it could be reverted. It was a prerequisite for another patch I
> prepared but didn't send as it was quite late in the 4.9 cycle and it
> depended on the other patches of Daniel.
>
> TBH: I really can't see what is wrong with that patch. The only change
> which should be able to break something seems to be the reduction of the
> wakeup stack size to 3kB, but this shouldn't affect booting the system
> at all...
>
Yeah, my next test is going to be increasing the size of the wakeup stack
again, but there is really nothing obviously wrong with the patch.

  Paul

>
> Juergen