From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jeremy Fitzhardinge <jeremy@goop.org>
Subject: Re: Re: Xen: Hybrid extension patchset for hypervisor
Date: Thu, 17 Sep 2009 10:34:51 -0700
Message-ID: <4AB2733B.6060804@goop.org>
References: <C6D66994.14D6A%keir.fraser@eu.citrix.com>	
	<C6D6AEEA.14EBF%keir.fraser@eu.citrix.com>	
	<0B53E02A2965CE4F9ADB38B34501A3A1940C78A8@orsmsx505.amr.corp.intel.com>	
	<4AB12C1F.9080502@goop.org>	
	<1253135571.3896.4873.camel@localhost.localdomain>	
	<4AB15707.20305@goop.org>
	<1253178985.16152.26.camel@zakaz.uk.xensource.com>
	<0B53E02A2965CE4F9ADB38B34501A3A1940C840B@orsmsx505.amr.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xensource.com>
In-Reply-To: <0B53E02A2965CE4F9ADB38B34501A3A1940C840B@orsmsx505.amr.corp.intel.com>
List-Unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xensource.com>
List-Help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-Subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: "Nakajima, Jun" <jun.nakajima@intel.com>
Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>, "Yang,
	Sheng" <sheng.yang@intel.com>, "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>, "Dong, Eddie" <eddie.dong@intel.com>, Keir Fraser <Keir.Fraser@eu.citrix.com>
List-Id: xen-devel@lists.xenproject.org

On 09/17/09 08:56, Nakajima, Jun wrote:
>> I very much expect that it'll need fixing/(re)implementing on both the
>> kernel and hypervisor side...
>>     
> To me, leveraging the native MMU code, rather than using existing API/ABI, would simplify both the guest and hypervisor side if hardware MMU virtualization is present. For example:
> - today a 64-bit PV guest builds/switches page tables depending on the kernel/user mode. It's not required anymore.
>   
The two pagetables are largely shared, so it really comes down to
maintaining an additional L4 page.  If the domain is running in a HAP
container, then the "kernel" pagetable would have proper U/S bits in
its pagetable entries (ie, Xen wouldn't strip them off, or set global on
user mappings) and then loading a new pagetable would just mean
reloading cr3 with the kernel pagetable.  In other words, we can still
do an efficient pagetable swap without needing to change the guest or
the ABI at all; the user pagetable would be unused and ignored, but that
isn't a huge burden.
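
As a rough illustration only (a toy model, not Xen or Linux code; every
name here, such as toy_mm, toy_set_l4 and toy_cr3_for, is made up for
the example), the two-L4 arrangement and the HAP shortcut might be
sketched like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define L4_ENTRIES  512
#define PTE_PRESENT (1ULL << 0)
#define PTE_USER    (1ULL << 2)   /* x86 U/S bit */

/* Hypothetical per-address-space state for a 64-bit PV guest. */
struct toy_mm {
    uint64_t kernel_l4[L4_ENTRIES]; /* kernel and user mappings */
    uint64_t user_l4[L4_ENTRIES];   /* user mappings only */
};

/*
 * Install one top-level entry.  The lower-level tables are shared, so
 * the only extra work is mirroring user slots into the second L4 page.
 * l3_addr is assumed to be page aligned.
 */
static void toy_set_l4(struct toy_mm *mm, unsigned idx,
                       uint64_t l3_addr, bool user)
{
    assert(idx < L4_ENTRIES);
    uint64_t entry = l3_addr | PTE_PRESENT | (user ? PTE_USER : 0);

    mm->kernel_l4[idx] = entry;
    if (user)
        mm->user_l4[idx] = entry;
}

/* Pick the L4 page that would be loaded into cr3. */
static const uint64_t *toy_cr3_for(const struct toy_mm *mm,
                                   bool hap, bool in_kernel)
{
    if (hap)
        return mm->kernel_l4;  /* U/S bits do the work; user_l4 unused */
    return in_kernel ? mm->kernel_l4 : mm->user_l4;
}
```

The guest-side bookkeeping in toy_set_l4 is identical in both cases;
only the choice made in toy_cr3_for differs, which is the sense in which
the guest and the ABI would not need to change.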

> - we can automatically get large page support (2MB, 1GB)
>   
Once the requirement to mark pagetable pages RO goes away, then it would
be easy to add large-page support.

> I thought pv_xxx_ps (such as pv_time, pv_cpu_ops, pv_mmu_ops, etc.) was designed to choose the right pv_ops accordingly depending on the features available. 
>   

Sure.  It would be easy to use either new special-purpose or just plain
native versions of those ops if that's the right thing to do; but it
would be nice if a current unmodified PV guest worked within an HVM
container and got at least some benefit from doing so.  Also, pagetable
issues have repercussions beyond just the raw pagetable update functions.
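
To make the trade-off concrete, here is a minimal sketch of
feature-based ops selection.  It is a toy under stated assumptions, not
the kernel's actual pv_ops code: toy_mmu_ops, toy_select_mmu_ops and the
two flags are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cut-down ops table with a single operation. */
struct toy_mmu_ops {
    const char *name;
    void (*write_cr3)(unsigned long val);
};

static unsigned long toy_cr3;        /* stand-in for the real register */
static unsigned long toy_hypercalls; /* counts simulated hypercalls */

/* Native flavour: write the register directly. */
static void toy_native_write_cr3(unsigned long val)
{
    toy_cr3 = val;
}

/* PV flavour: the switch goes through the hypervisor. */
static void toy_pv_write_cr3(unsigned long val)
{
    toy_hypercalls++;
    toy_cr3 = val;
}

static const struct toy_mmu_ops toy_native_ops = { "native", toy_native_write_cr3 };
static const struct toy_mmu_ops toy_pv_ops     = { "pv",     toy_pv_write_cr3 };

/*
 * An unmodified PV guest knows nothing about the container, so it keeps
 * the PV ops; only a guest changed to detect HAP would go native.
 */
static const struct toy_mmu_ops *toy_select_mmu_ops(bool have_hap,
                                                    bool guest_is_hap_aware)
{
    if (have_hap && guest_is_hap_aware)
        return &toy_native_ops;
    return &toy_pv_ops;
}

/* Switch pagetables through whichever ops table was selected. */
static void toy_switch_pagetable(const struct toy_mmu_ops *ops,
                                 unsigned long val)
{
    assert(ops != NULL && ops->write_cr3 != NULL);
    ops->write_cr3(val);
}
```

The second case in toy_select_mmu_ops is the one I care about: the
unmodified guest still runs its PV ops, and any benefit it gets has to
come from what the hypervisor does underneath them.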

Of course you can get both these features just by booting the kernel as
an hvm guest.  But if we're talking about giving PV kernels some
benefits from hvm/hap hardware features, I think we should be looking at it
from the perspective of starting with a PV kernel then adding
incremental changes.

    J