From mboxrd@z Thu Jan  1 00:00:00 1970
From: =?windows-1252?Q?Roger_Pau_Monn=E9?= <roger.pau@citrix.com>
Subject: Re: RFC: making the PVH 64bit ABI as stable
Date: Sat, 6 Jun 2015 11:57:31 +0200
Message-ID: <5572C40B.40500@citrix.com>
References: <556DEB9A020000780008079A@mail.emea.novell.com>
	<alpine.DEB.2.02.1506021648160.19838@kaball.uk.xensource.com>
	<556DE352.3030703@citrix.com>
	<alpine.DEB.2.02.1506031058090.19838@kaball.uk.xensource.com>
	<556F0A410200007800080E4B@mail.emea.novell.com>
	<5571CB4D.5010403@citrix.com>
	<5571D1BD.1090108@oracle.com> <5571D501.80305@citrix.com>
	<alpine.DEB.2.02.1506051813140.19838@kaball.uk.xensource.com>
	<5571DAB5.9030507@citrix.com>
	<20150605215228.GA28885@deinos.phlegethon.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="windows-1252"
Content-Transfer-Encoding: quoted-printable
Return-path: <xen-devel-bounces@lists.xen.org>
Received: from mail6.bemta14.messagelabs.com ([193.109.254.103])
	by lists.xen.org with esmtp (Exim 4.72)
	(envelope-from <prvs=592900044=roger.pau@citrix.com>)
	id 1Z1ArD-00087a-L0
	for xen-devel@lists.xenproject.org; Sat, 06 Jun 2015 09:57:39 +0000
In-Reply-To: <20150605215228.GA28885@deinos.phlegethon.org>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Tim Deegan <tim@xen.org>, Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>, Lars Kurth <lars.kurth@citrix.com>, Stefano Stabellini <stefano.stabellini@eu.citrix.com>, David Vrabel <david.vrabel@citrix.com>, Jan Beulich <JBeulich@suse.com>, xen-devel <xen-devel@lists.xenproject.org>, Boris Ostrovsky <boris.ostrovsky@oracle.com>
List-Id: xen-devel@lists.xenproject.org

On 05/06/15 at 23:52, Tim Deegan wrote:
> At 18:21 +0100 on 05 Jun (1433528517), Andrew Cooper wrote:
>> On 05/06/15 18:16, Stefano Stabellini wrote:
>>> On Fri, 5 Jun 2015, Andrew Cooper wrote:
>>>> On 05/06/15 17:43, Boris Ostrovsky wrote:
>>>>> On 06/05/2015 12:16 PM, Roger Pau Monné wrote:
>>>>>> On 03/06/15 at 14:08, Jan Beulich wrote:
>>>>>>>>>> On 03.06.15 at 12:02, <stefano.stabellini@eu.citrix.com> wrote:
>>>>>>>> On Tue, 2 Jun 2015, Andrew Cooper wrote:
>>>>>>>>> With my x86 maintainer hat on, the following is an absolute
>>>>>>>>> minimum set of prerequisites for PVH.
>>>>>>>>>
>>>>>>>>> * 32bit support
>>>>>>>> Could you please explain why 32bit is important to get PVH out of tech
>>>>>>>> preview? I don't see 32 bit OSes as an important use case. Maybe there
>>>>>>>> is more behind it that I cannot see.
>>>>>>> The primary reason was named before: 32-bit support will likely
>>>>>>> end up changing the way 64-bit guests get launched.
>>>>>> I can work on the new boot ABI, even if it's just a design document now,
>>>>>> but the actual implementation needs to be done on top of the 32-bit
>>>>>> support series.
>>>>>>
>>>>>> Boris, do you think you could send an early RFC of your 32-bit support
>>>>>> series in a couple of weeks at most?
>>>>> That's highly unlikely. For one, I am still unable to boot MP guests.
>>>>> In addition, it is all held together by rubber bands and matchsticks
>>>>> so calling it an RFC would be an insult to RFCs. (for example, I
>>>>> apparently broke HVM somewhere along the way).
>>>> How about working it the other way around.
>>>>
>>>> Start with an HVM guest and start with a sane method of booting.  I
>>>> highly suggest multiboot1 as it is very easy and we have most of the
>>>> code already.  Whoever actually gets around to doing this gets leeway,
>>>> subject to it being sane (which the current method very certainly is not).

I agree that using a boot ABI similar to multiboot1 is going to solve
some of the issues that we currently have, while probably simplifying
the code to build a domain. There are also several multiboot1
implementations around which can be used as a basis for this for guest
OSes that don't have native multiboot support.
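
For readers unfamiliar with multiboot1, the sketch below shows the small
header a kernel image carries under that specification. The magic values
and flag bits come from the Multiboot 0.6.96 specification. The helper
names (make_multiboot_header, multiboot_header_valid) are made up for
illustration, and nothing here is meant to describe the eventual PVH boot
ABI, only what "similar to multiboot1" could look like from the guest side.

```c
#include <stdint.h>

/* Values from the Multiboot 0.6.96 specification. */
#define MULTIBOOT_HEADER_MAGIC     0x1BADB002u /* placed in the kernel image  */
#define MULTIBOOT_BOOTLOADER_MAGIC 0x2BADB002u /* passed to the kernel in EAX */
#define MULTIBOOT_PAGE_ALIGN       (1u << 0)   /* align modules on 4K pages   */
#define MULTIBOOT_MEMORY_INFO      (1u << 1)   /* ask for a memory map        */

/* Mandatory part of the header. It must be 4-byte aligned and sit within
 * the first 8192 bytes of the image so the loader can find it. */
struct multiboot_header {
    uint32_t magic;
    uint32_t flags;
    uint32_t checksum; /* magic + flags + checksum must be 0 (mod 2^32) */
};

/* Hypothetical helper: build a header for the given feature flags. */
struct multiboot_header make_multiboot_header(uint32_t flags)
{
    struct multiboot_header h;

    h.magic    = MULTIBOOT_HEADER_MAGIC;
    h.flags    = flags;
    h.checksum = 0u - (MULTIBOOT_HEADER_MAGIC + flags);
    return h;
}

/* Hypothetical helper: the check a loader (or a domain builder) would
 * perform before trusting the header. Returns 1 if valid, 0 otherwise. */
int multiboot_header_valid(const struct multiboot_header *h)
{
    return h->magic == MULTIBOOT_HEADER_MAGIC &&
           (uint32_t)(h->magic + h->flags + h->checksum) == 0u;
}
```

On entry a multiboot1 kernel runs in 32-bit protected mode with paging
disabled, with MULTIBOOT_BOOTLOADER_MAGIC in EAX and a pointer to the boot
information structure in EBX.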

>>>> Start the domain without qemu, and expose some of the PV hypercalls to
>>>> HVM guests, and see how far it gets.  One will find suddenly that all
>>>> 32bit and AMD problems have already been solved.  All the PV(h) kernel
>>>> needs to know is that there is no real hardware, and not to touch it.
>>> This seems like a clean and nice way forward, but rather than PVH it is
>>> actually something else.  Am I the only one to think that making this
>>> drastic change in the design at this stage (3 years in) is too late?

I don't think the ABI is going to change much. Most of this plumbing is
going to be in Xen internals, so I wouldn't call it a drastic change.

>> There was no design in the slightest, which is why we have got 3 years
>> in and are in this position.
>
> Please try to keep things friendly and constructive on this list.  Yes,
> there was design; it was discussed on this list and at the Xen summit.
> With hindsight, it turned out that "PV guest that uses a lightweight
> HVM container" took a lot more work/code than was originally expected.
>
> I suspect that an implementation of "HVM without qemu and some
> hypercalls" will also turn out more complex than it sounds.  I believe
> I've made my opinion clear that that's where PVH ought to end up, but
> I'm unconvinced that starting from scratch will be the fastest way.

I believe the right way to move forward is to start implementing this
new boot ABI on top of HVM, without axing out the PVH code. I think most
of the current PVH code would still be needed for the HVM-without-dm
kind of guest, and that at some point both will meet.

I will send a design document for this boot ABI next week, but the plan
is as follows:

 - Start the guest in protected mode without paging.
 - Fill the hypercall page using wrmsr (HVM).
 - Map the shared info page using XENMEM_add_to_physmap (HVM).
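
As a rough illustration of the last two steps, here is a minimal sketch
of what early guest setup could look like. It assumes the guest has
already been entered in protected mode without paging, so physical
addresses can be used directly. The constants and the layout of
struct xen_add_to_physmap are written from memory of Xen's public headers
and should be checked against them. struct pvh_ops, pvh_early_setup and
the fake_* functions are hypothetical: the hooks stand in for the real
CPUID/WRMSR instructions and the memory_op hypercall so that the logic
can be exercised in an ordinary user-space build.

```c
#include <stdint.h>
#include <string.h>

/* Constants as recalled from Xen's public headers; verify before use. */
#define XEN_CPUID_BASE          0x40000000u /* base leaf; may be offset, e.g. with Viridian */
#define XENMEM_add_to_physmap   7
#define XENMAPSPACE_shared_info 0
#define DOMID_SELF              0x7FF0u
#define XEN_PAGE_SHIFT          12

/* Mirrors struct xen_add_to_physmap from xen/include/public/memory.h. */
struct xen_add_to_physmap {
    uint16_t      domid; /* domain whose physmap is changed          */
    uint16_t      size;  /* not used for the shared info page        */
    unsigned int  space; /* XENMAPSPACE_*                            */
    unsigned long idx;   /* index into the source space              */
    unsigned long gpfn;  /* guest frame where the page should appear */
};

/* HYPOTHETICAL hooks wrapping the privileged operations. */
struct pvh_ops {
    void (*cpuid)(uint32_t leaf, uint32_t regs[4]); /* eax, ebx, ecx, edx */
    void (*wrmsr)(uint32_t msr, uint64_t value);
    long (*memory_op)(unsigned int cmd, void *arg);
};

/* Returns 0 on success, -1 on failure. Both addresses are guest physical. */
int pvh_early_setup(const struct pvh_ops *ops,
                    uint64_t hypercall_gpa, uint64_t shared_info_gpa)
{
    uint32_t regs[4];
    struct xen_add_to_physmap xatp;

    /* The base leaf returns the signature "XenVMMXenVMM" in ebx:ecx:edx. */
    ops->cpuid(XEN_CPUID_BASE, regs);
    if (memcmp(&regs[1], "XenVMMXenVMM", 12) != 0)
        return -1;

    /* Leaf base+2: eax = number of hypercall pages, ebx = MSR to write. */
    ops->cpuid(XEN_CPUID_BASE + 2, regs);
    if (regs[0] == 0)
        return -1;
    ops->wrmsr(regs[1], hypercall_gpa); /* Xen fills the page at this address */

    /* Ask Xen to place the shared info page at the chosen guest frame. */
    memset(&xatp, 0, sizeof(xatp));
    xatp.domid = DOMID_SELF;
    xatp.space = XENMAPSPACE_shared_info;
    xatp.idx   = 0;
    xatp.gpfn  = (unsigned long)(shared_info_gpa >> XEN_PAGE_SHIFT);
    return ops->memory_op(XENMEM_add_to_physmap, &xatp) == 0 ? 0 : -1;
}

/* Fake platform for user-space experiments only (all hypothetical). */
uint32_t      fake_msr;
uint64_t      fake_msr_value;
unsigned long fake_gpfn;

static void fake_cpuid(uint32_t leaf, uint32_t regs[4])
{
    memset(regs, 0, 4 * sizeof(regs[0]));
    if (leaf == XEN_CPUID_BASE) {
        regs[0] = XEN_CPUID_BASE + 4;          /* highest leaf (made up)  */
        memcpy(&regs[1], "XenVMMXenVMM", 12);
    } else if (leaf == XEN_CPUID_BASE + 2) {
        regs[0] = 1;                           /* one hypercall page      */
        regs[1] = 0x40000000u;                 /* MSR index to write      */
    }
}

static void fake_wrmsr(uint32_t msr, uint64_t value)
{
    fake_msr = msr;
    fake_msr_value = value;
}

static long fake_memory_op(unsigned int cmd, void *arg)
{
    if (cmd != XENMEM_add_to_physmap)
        return -1;
    fake_gpfn = ((struct xen_add_to_physmap *)arg)->gpfn;
    return 0;
}

const struct pvh_ops fake_ops = { fake_cpuid, fake_wrmsr, fake_memory_op };
```

In a real guest the hooks would be thin wrappers around the instructions
themselves, and memory_op could only be issued after the hypercall page
has been filled, which is why the MSR write comes first.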

That means we can get rid of some of the ELFNOTES. The ones that come to
mind right now are:

 - XEN_ELFNOTE_VIRT_BASE
 - XEN_ELFNOTE_HYPERCALL_PAGE
 - XEN_ELFNOTE_HV_START_LOW
 - XEN_ELFNOTE_PAE_MODE
 - XEN_ELFNOTE_L1_MFN_VALID

And probably some more.

Roger.