From: Roger Pau Monné
Subject: Re: RFC: making the PVH 64bit ABI as stable
Date: Sat, 6 Jun 2015 11:57:31 +0200
Message-ID: <5572C40B.40500@citrix.com>
In-Reply-To: <20150605215228.GA28885@deinos.phlegethon.org>
To: Tim Deegan, Andrew Cooper
Cc: Elena Ufimtseva, Lars Kurth, Stefano Stabellini, David Vrabel, Jan Beulich, xen-devel, Boris Ostrovsky
List-Id: xen-devel@lists.xenproject.org

On 05/06/15 at 23:52, Tim Deegan wrote:
> At 18:21 +0100 on 05 Jun (1433528517), Andrew Cooper wrote:
>> On 05/06/15 18:16, Stefano Stabellini wrote:
>>> On Fri, 5 Jun 2015, Andrew Cooper wrote:
>>>> On 05/06/15 17:43, Boris Ostrovsky wrote:
>>>>> On 06/05/2015 12:16 PM, Roger Pau Monné wrote:
>>>>>> On 03/06/15 at 14:08, Jan Beulich wrote:
>>>>>>>>>> On 03.06.15 at 12:02, wrote:
>>>>>>>> On Tue, 2 Jun 2015, Andrew Cooper wrote:
>>>>>>>>> With my x86 maintainer hat on, the following is an absolute
>>>>>>>>> minimum set
>>>>>>>>> of prerequisites for PVH.
>>>>>>>>>
>>>>>>>>> * 32bit support
>>>>>>>> Could you please explain why 32bit is important to get PVH out of tech
>>>>>>>> preview? I don't see 32 bit OSes as an important use case.
>>>>>>>> Maybe there
>>>>>>>> is more behind it that I cannot see.
>>>>>>> The primary reason was named before: 32-bit support will likely
>>>>>>> end up changing the way 64-bit guests get launched.
>>>>>> I can work on the new boot ABI, even if it's just a design document now,
>>>>>> but the actual implementation needs to be done on top of the 32-bit
>>>>>> support series.
>>>>>>
>>>>>> Boris, do you think you could send an early RFC of your 32-bit support
>>>>>> series in a couple of weeks at most?
>>>>> That's highly unlikely. For one, I am still unable to boot MP guests.
>>>>> In addition, it is all held together by rubber bands and matchsticks
>>>>> so calling it an RFC would be an insult to RFCs. (for example, I
>>>>> apparently broke HVM somewhere along the way).
>>>> How about working it the other way around.
>>>>
>>>> Start with an HVM guest and start with a sane method of booting. I
>>>> highly suggest multiboot1 as it is very easy and we have most of the
>>>> code already. Whomever actually gets around to doing this gets leeway,
>>>> subject to it being sane (which the current method very certainly is not).

I agree that using a boot ABI similar to multiboot1 is going to solve
some of the issues that we currently have, while probably simplifying
the code to build a domain. There are also several multiboot1
implementations around which can be used as a basis for this for guest
OSes that don't have native multiboot support.

>>>> Start the domain without qemu, and expose some of the PV hypercalls to
>>>> HVM guests, and see how far it gets. One will find suddenly that all
>>>> 32bit and AMD problems have already been solved. All the PV(h) kernel
>>>> needs to know is that there is no real hardware, and not to touch it.
>>> This seems like a clean and nice way forward, but rather than PVH is
>>> actually something else. Am I the only one to think that making this
>>> drastic change in the design at this stage (3 years in) is too late?

I don't think the ABI is going to change much; most of this plumbing is
going to be in Xen internals, so I wouldn't call it a drastic change.

>> There was no design in the slightest, which is why we have got 3 years
>> in and are in this position.
>
> Please try to keep things friendly and constructive on this list. Yes,
> there was design; it was discussed on this list and at the Xen summit.
> With hindsight, it turned out that "PV guest that uses a lightweight
> HVM container" took a lot more work/code than was originally expected.
>
> I suspect that an implementation of "HVM without qemu and some
> hypercalls" will also turn out more complex than it sounds. I believe
> I've made my opinion clear that that's where PVH ought to end up, but
> I'm unconvinced that starting from scratch will be the fastest way.

I believe the right way to move forward is to start implementing this
new boot ABI on top of HVM, without axing out the PVH code. I think
most of the current PVH code would still be needed for the
HVM-without-dm kind of guest, and that at some point both will meet.

I will send a design document for this boot ABI next week, but the plan
is as follows:

 - Start the guest in protected mode without paging.
 - Fill the hypercall page using wrmsr (HVM).
 - Map the shared info page using XENMEM_add_to_physmap (HVM).

That means we can get rid of some of the ELFNOTES. The ones that come to
mind right now are:

 - XEN_ELFNOTE_VIRT_BASE
 - XEN_ELFNOTE_HYPERCALL_PAGE
 - XEN_ELFNOTE_HV_START_LOW
 - XEN_ELFNOTE_PAE_MODE
 - XEN_ELFNOTE_L1_MFN_VALID

And probably some more.

Roger.
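
As an illustration of the last two steps of the plan above, here is a
minimal sketch in C of what the guest side might do once it is running
in protected mode without paging. It is not taken from the thread or
from any existing implementation. The stub_* functions and the trace
structure are hypothetical stand-ins for the cpuid and wrmsr
instructions and for a call through the hypercall page, so the flow can
be followed in user space. The constants and the layout of struct
xen_add_to_physmap are re-declared locally on the assumption that they
match Xen's public headers, and should be checked against those headers
before being relied on.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed to match Xen's public headers; verify before real use. */
#define XEN_CPUID_FIRST_LEAF    0x40000000u
#define XENMEM_add_to_physmap   7
#define XENMAPSPACE_shared_info 0
#define DOMID_SELF              0x7FF0

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ull << PAGE_SHIFT)

/* Local copy of the hypercall argument layout (assumption, see above). */
struct xen_add_to_physmap {
    uint16_t domid;      /* domain whose physmap is changed */
    uint16_t size;       /* unused for the shared info space */
    unsigned int space;  /* which resource to map */
    unsigned long idx;   /* index into that resource */
    unsigned long gpfn;  /* guest frame where it should appear */
};

/* Hypothetical record of what the guest asked the platform to do. */
struct boot_trace {
    uint32_t msr_index;
    uint64_t msr_value;
    unsigned int memory_op_cmd;
    struct xen_add_to_physmap xatp;
};
struct boot_trace trace;

/* Stub for the cpuid instruction: answers like a Xen HVM guest would. */
void stub_cpuid(uint32_t leaf, uint32_t regs[4])
{
    memset(regs, 0, 4 * sizeof(regs[0]));
    if (leaf == XEN_CPUID_FIRST_LEAF) {
        regs[0] = XEN_CPUID_FIRST_LEAF + 2;       /* highest Xen leaf */
        memcpy(&regs[1], "XenVMMXenVMM", 12);     /* ebx, ecx, edx */
    } else if (leaf == XEN_CPUID_FIRST_LEAF + 2) {
        regs[0] = 1;                              /* hypercall pages */
        regs[1] = 0x40000000u;                    /* MSR to write */
    }
}

/* Stub for wrmsr: only records the request. */
void stub_wrmsr(uint32_t index, uint64_t value)
{
    trace.msr_index = index;
    trace.msr_value = value;
}

/* Stub for the memory_op hypercall: only records the request. */
long stub_memory_op(unsigned int cmd, const void *arg)
{
    trace.memory_op_cmd = cmd;
    memcpy(&trace.xatp, arg, sizeof(trace.xatp));
    return 0;
}

/* Step 2: ask Xen to fill the page at page_gpa with hypercall stubs. */
int setup_hypercall_page(uint64_t page_gpa)
{
    uint32_t regs[4];

    stub_cpuid(XEN_CPUID_FIRST_LEAF, regs);
    if (memcmp(&regs[1], "XenVMMXenVMM", 12) != 0)
        return -1;                                /* not running on Xen */

    stub_cpuid(XEN_CPUID_FIRST_LEAF + 2, regs);
    if (regs[0] < 1)
        return -1;                                /* no hypercall page */

    /* Without paging, the guest-physical address is used directly. */
    assert((page_gpa & (PAGE_SIZE - 1)) == 0);
    stub_wrmsr(regs[1], page_gpa);
    return 0;
}

/* Step 3: map the shared info page at the frame containing page_gpa. */
long map_shared_info(uint64_t page_gpa)
{
    struct xen_add_to_physmap xatp = {
        .domid = DOMID_SELF,
        .space = XENMAPSPACE_shared_info,
        .idx   = 0,
        .gpfn  = (unsigned long)(page_gpa >> PAGE_SHIFT),
    };

    return stub_memory_op(XENMEM_add_to_physmap, &xatp);
}
```

A caller would reserve two page-aligned pages in guest memory and call
setup_hypercall_page() on the first and map_shared_info() on the
second. In a real guest the stubs would be replaced by the actual
instructions and by a call into the freshly filled hypercall page.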