From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>,
	Lars Kurth <lars.kurth@citrix.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Tim Deegan <tim@xen.org>, David Vrabel <david.vrabel@citrix.com>,
	Jan Beulich <JBeulich@suse.com>,
	xen-devel <xen-devel@lists.xenproject.org>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>
Subject: Re: RFC: making the PVH 64bit ABI as stableo
Date: Mon, 8 Jun 2015 10:26:12 -0400
Message-ID: <20150608142612.GG15682@l.oracle.com>
In-Reply-To: <5572C40B.40500@citrix.com>

On Sat, Jun 06, 2015 at 11:57:31AM +0200, Roger Pau Monné wrote:
> On 05/06/15 at 23.52, Tim Deegan wrote:
> > At 18:21 +0100 on 05 Jun (1433528517), Andrew Cooper wrote:
> >> On 05/06/15 18:16, Stefano Stabellini wrote:
> >>> On Fri, 5 Jun 2015, Andrew Cooper wrote:
> >>>> On 05/06/15 17:43, Boris Ostrovsky wrote:
> >>>>> On 06/05/2015 12:16 PM, Roger Pau Monné wrote:
> >>>>>> On 03/06/15 at 14.08, Jan Beulich wrote:
> >>>>>>>>>> On 03.06.15 at 12:02, <stefano.stabellini@eu.citrix.com> wrote:
> >>>>>>>> On Tue, 2 Jun 2015, Andrew Cooper wrote:
> >>>>>>>>> With my x86 maintainer hat on, the following is an absolute
> >>>>>>>>> minimum set
> >>>>>>>>> of prerequisite for PVH.
> >>>>>>>>>
> >>>>>>>>> * 32bit support
> >>>>>>>> Could you please explain why 32bit is important to get PVH out of tech
> >>>>>>>> preview? I don't see 32 bit OSes as an important use case. Maybe there
> >>>>>>>> is more behind it that I cannot see.
> >>>>>>> The primary reason was named before: 32-bit support will likely
> >>>>>>> end up changing the way 64-bit guests get launched.
> >>>>>> I can work on the new boot ABI, even if it's just a design document now,
> >>>>>> but the actual implementation needs to be done on top of the 32-bit
> >>>>>> support series.
> >>>>>>
> >>>>>> Boris, do you think you could send an early RFC of your 32-bit support
> >>>>>> series in a couple of weeks at most?
> >>>>> That's highly unlikely. For one, I am still unable to boot MP guests.
> >>>>> In addition, it is all held together by rubber bands and matchsticks
> >>>>> so calling it an RFC would be an insult to RFCs. (for example, I
> >>>>> apparently broke HVM somewhere along the way).
> >>>> How about working it the other way around.
> >>>>
> >>>> Start with an HVM guest and start with a sane method of booting.  I
> >>>> highly suggest multiboot1 as it is very easy and we have most of the
> >>>> code already.  Whomever actually gets around to doing this gets leeway,
> >>>> subject to it being sane (which the current method very certainly is not).
> 
> I agree that using a boot ABI similar to multiboot1 is going to solve
> some of the issues that we currently have, while probably simplifying
> the code to build a domain. There are also several multiboot1
> implementations around which can be used as a basis for this for guest
> OSes that don't have native multiboot support.

Multiboot1 requires that the header sit within the first 8K of the kernel
image. Linux already has a PE header and bootparams there, so the multiboot1
header would have to fit in after them.
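
To make the 8K constraint concrete, here is a rough sketch of how a
multiboot1 loader could locate the header. This is illustrative Python,
not code from Xen, Linux or any bootloader, and the function and constant
names are made up for the example. It relies only on what the multiboot1
specification says: the header is 32-bit aligned, lies entirely inside the
first 8192 bytes of the image, starts with the magic 0x1BADB002, and its
magic, flags and checksum fields sum to zero modulo 2^32.

```python
import struct

MULTIBOOT1_MAGIC = 0x1BADB002   # magic value defined by the multiboot1 spec
SEARCH_LIMIT = 8192             # header must lie fully within the first 8K
HEADER_MIN_SIZE = 12            # magic + flags + checksum, 4 bytes each


def find_multiboot1_header(image: bytes):
    """Return the offset of a valid multiboot1 header, or None.

    Hypothetical helper for illustration only.
    """
    window = image[:SEARCH_LIMIT]
    # Step by 4 because the header must be 32-bit aligned.
    for off in range(0, len(window) - HEADER_MIN_SIZE + 1, 4):
        magic, flags, checksum = struct.unpack_from("<III", window, off)
        if magic != MULTIBOOT1_MAGIC:
            continue
        # The three fields must sum to zero as an unsigned 32-bit value.
        if (magic + flags + checksum) & 0xFFFFFFFF == 0:
            return off
    return None
```

Anything Linux already places at the start of the image eats into that
8192-byte window, which is why the header placement is a concern.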

Now from an implementation and political side:
 - In Linux you would need the multiboot1 path to copy all the parameters
   into the bootparams structure. And then if we were to use the generic
   bootup path we would need to track any changes in the early bootup code -
   which means we MUST at startup look like a bootloader (whatever that
   means).
 - Adding in extra bootloader support could be blocked by the Linux x86
   maintainers. As in they would prefer to have all the booting code
   related to this live in arch/x86/xen and just call Linux code the same
   way as it is doing now (set up the pvops, x86_* function tables, etc;
   and then call x86_64_start_reservations (or i386_start_kernel)).
   I can see them asking: "Why two entry points for Xen?!" And eliminating
   the old one is not yet an option, unless we are ready to make Linux
   upstream not boot on Amazon or other clouds that provide PV guest support.

Either way the code from multiboot1 (or the XEN_ELFNOTE_ENTRY mechanism)
needs to do the same thing - fix up function tables such that the platform
is OK booting without much hardware. And then also set up the Linux specific
bits (pass in EFI data, bootparams, x86_init, stack protector, etc).

I think the issue folks see with PVH bootup code is that a lot of 
the setup (GDT, CR registers, etc) is done on the hypervisor side.

And Andrew - I think you would like it to be done as much as possible on
the guest side - and by having the state at the OS entry point be what a
bootloader would leave it in - it can be "done".

I think if you want to go that route it is going to delay PVH by
another year. It will also require the nod/approval from the Linux and
FreeBSD kernel folks.

The quotes around "done" are me being skeptical. I think the Xen hypervisor
would still need to set up the GDT, CR registers, etc - so you would not
change much on the hypervisor side anyhow - except add more code
to deal with multiboot1 headers.

> 
> >>>> Start the domain without qemu, and expose some of the PV hypercalls to
> >>>> HVM guests, and see how far it gets.  One will find suddenly that all
> >>>> 32bit and AMD problems have already been solved.  All the PV(h) kernel
> >>>> needs to know is that there is no real hardware, and not to touch it.
> >>> This seems like a clean and nice way forward, but rather than PVH is
> >>> actually something else.  Am I the only one to think that making this
> >>> drastic change in the design at this stage (3 years in) is too late?
> 
> I don't think the ABI is going to change much, most of this plumbing is
> going to be in Xen internals, so I wouldn't call it a drastic change.
> 
> >> There was no design in the slightest, which is why we have got 3 years
> >> in and are in this position.
> > 
> > Please try to keep things friendly and constructive on this list.  Yes,
> > there was design; it was discussed on this list and at the Xen summit.
> > With hindsight, it turned out that "PV guest that uses a lightweight
> > HVM container" took a lot more work/code than was originally expected.
> > 
> > I suspect that an implementation of "HVM without qemu and some
> > hypercalls" will also turn out more complex than it sounds.  I believe
> > I've made my opinion clear that that's where PVH ought to end up, but
> > I'm unconvinced that starting from scratch will be the fastest way.
> 
> I believe the right way to move forward is to start implementing this
> new boot ABI on top of HVM, without axing out the PVH code. I think most
> of the current PVH code would still be needed for the HVM-without-dm
> kind of guest, and that at some point both will meet.
> 
> I will send a design document for this boot ABI next week, but the plan
> is as follows:
> 
>  - Start the guest in protected mode without paging.
>  - Fill the hypercall page using wrmsr (HVM).
>  - Map the shared info page using XENMEM_add_to_physmap (HVM).
> 
> That means we can get rid of some of the ELFNOTES, the ones that come to
> mind right now are:
> 
>  - XEN_ELFNOTE_VIRT_BASE
>  - XEN_ELFNOTE_HYPERCALL_PAGE
>  - XEN_ELFNOTE_HV_START_LOW
>  - XEN_ELFNOTE_PAE_MODE
>  - XEN_ELFNOTE_L1_MFN_VALID
> 
> And probably some more.
> 
> Roger.
