From mboxrd@z Thu Jan  1 00:00:00 1970
From: Wei Huang <wei.huang2@amd.com>
Subject: Re: [RFC][Patches] Xen 1GB Page Table Support
Date: Wed, 13 Jan 2010 10:27:59 -0600
Message-ID: <4B4DF48E.9090904@amd.com>
References: <034622152516C547BE5EA19D5EFC86287E0ED0@sausexmb5.amd.com>	<de76405a0903181020i7e44a9d4o762e9045d085a97b@mail.gmail.com>
	<034622152516C547BE5EA19D5EFC86287E106B@sausexmb5.amd.com>
	<6CADD16F56BC954D8E28F3836FA7ED7112A79327BC@shzsmsx501.ccr.corp
Mime-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xensource.com>
In-Reply-To: <6CADD16F56BC954D8E28F3836FA7ED7112A79327BC@shzsmsx501.ccr.corp.intel.com>
List-Unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xensource.com>
List-Help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-Subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: "Xu, Dongxiao" <dongxiao.xu@intel.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>, "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>, "keir.fraser@eu.citrix.com" <keir.fraser@eu.citrix.com>, Tim Deegan <Tim.Deegan@citrix.com>
List-Id: xen-devel@lists.xenproject.org

Dongxiao,

Thanks for re-basing the code. Does the new code work for you? I need to 
test the new code again on my machine and make sure it doesn't break. 
After that, we can ask Keir to push it upstream.

-Wei


Xu, Dongxiao wrote:
> Hi, Wei,
> 	I dug this thread out of the looooong history...
> 	Do you still plan to push your 1gb_p2m and 1gb_tool patches upstream?
> 	I rebased them to fit the latest upstream code (the 1gb_tools.patch is not changed).
> 	Any comments or new ideas? Thanks!
>
> Best Regards,
> -- Dongxiao
>
> Huang2, Wei wrote:
>   
>> Here are patches using the middle approach. It handles 1GB pages in
>> PoD by remapping 1GB with 2MB pages & retry. I also added code for 1GB
>> detection. Please comment.
>>
>> Thanks a lot,
>>
>> -Wei
>>
>> -----Original Message-----
>> From: dunlapg@gmail.com [mailto:dunlapg@gmail.com] On Behalf Of George
>> Dunlap
>> Sent: Wednesday, March 18, 2009 12:20 PM
>> To: Huang2, Wei
>> Cc: xen-devel@lists.xensource.com; keir.fraser@eu.citrix.com; Tim Deegan
>> Subject: Re: [Xen-devel] [RFC][Patches] Xen 1GB Page Table Support
>>
>> Thanks for doing this work, Wei -- especially all the extra effort for
>> the PoD integration.
>>
>> One question: How well would you say you've tested the PoD
>> functionality?  Or to put it the other way, how much do I need to
>> prioritize testing this before the 3.4 release?
>>
>> It wouldn't be a bad idea to do as you suggested, and break things
>> into 2 meg pages for the PoD case.  In order to take the best
>> advantage of this in a PoD scenario, you'd need to have a balloon
>> driver that could allocate 1G of contiguous *guest* p2m space, which
>> seems a bit optimistic at this point...
>>
>>  -George
>>
>> 2009/3/18 Huang2, Wei <Wei.Huang2@amd.com>:
>>     
>>> Current Xen supports 2MB super pages for NPT/EPT. The attached
>>> patches extend this feature to support 1GB pages. The PoD
>>> (populate-on-demand) introduced by George Dunlap made P2M
>>> modification harder. I tried to preserve existing PoD design by
>>> introducing a 1GB PoD cache list. 
>>>
>>>
>>>
>>> Note that 1GB PoD can be dropped if we don't care about 1GB when PoD
>>> is enabled. In this case, we can just split 1GB PDPE into 512x2MB
>>> PDE entries and grab pages from PoD super list. That can pretty much
>>> make 1gb_p2m_pod.patch go away. 
>>>
>>>
>>>
>>> Any comment/suggestion on design idea will be appreciated.
>>>
>>>
>>>
>>> Thanks,
>>>
>>>
>>>
>>> -Wei
>>>
>>>
>>>
>>>
>>>
>>> The following is the description:
>>>
>>> === 1gb_tools.patch ===
>>>
>>> Extend existing setup_guest() function. Basically, it tries to
>>> allocate 1GB pages whenever available. If this request fails, it
>>> falls back to 2MB. If both fail, then 4KB pages will be used.
>>>
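The fallback order described for setup_guest() can be sketched roughly as below. This is only an illustration under stated assumptions: alloc_fn, alloc_with_fallback() and alloc_up_to() are hypothetical names, not functions from the patch or from libxc.

```c
#include <stddef.h>

/* Page orders in units of 4KB pages: 2^18 * 4KB = 1GB, 2^9 * 4KB = 2MB. */
#define ORDER_1GB 18u
#define ORDER_2MB 9u
#define ORDER_4KB 0u

/* Hypothetical allocator callback: returns 0 on success, nonzero on failure. */
typedef int (*alloc_fn)(unsigned int order, void *ctx);

/*
 * Try the largest page size first and fall back to smaller ones.
 * Returns the order that succeeded, or -1 if every attempt failed.
 */
int alloc_with_fallback(alloc_fn alloc, void *ctx)
{
    static const unsigned int orders[] = { ORDER_1GB, ORDER_2MB, ORDER_4KB };

    for (size_t i = 0; i < sizeof(orders) / sizeof(orders[0]); i++)
        if (alloc(orders[i], ctx) == 0)
            return (int)orders[i];
    return -1;
}

/* Example allocator for illustration: succeeds only when order <= *ctx. */
int alloc_up_to(unsigned int order, void *ctx)
{
    unsigned int max_order = *(const unsigned int *)ctx;

    return order <= max_order ? 0 : -1;
}
```

With alloc_up_to() and a limit of 9, the 1GB attempt fails and the 2MB attempt succeeds, so alloc_with_fallback() returns 9.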
>>>
>>>
>>> === 1gb_p2m.patch ===
>>>
>>> * p2m_next_level()
>>>
>>> Check PSE bit of L3 page table entry. If 1GB is found (PSE=1), we
>>> split 1GB into 512 2MB pages. 
>>>
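As a rough sketch of the 1GB split in p2m_next_level(): the 512 new 2MB entries cover the same machine frames as the original 1GB entry, each offset by 512 4KB frames from the previous one. The struct p2m_entry type and split_1gb_entry() below are simplified, hypothetical stand-ins, not the real Xen page table entry format.

```c
#include <stdint.h>

#define ENTRIES_PER_TABLE 512u  /* entries in one x86-64 page table */
#define PAGES_PER_2MB     512u  /* 4KB frames covered by one 2MB entry */

/* Simplified, hypothetical entry: frame number plus a superpage flag. */
struct p2m_entry {
    uint64_t mfn;
    int      pse;
};

/*
 * Fill a new L2 table so that its 512 2MB entries cover the same
 * machine frames as one 1GB L3 entry starting at base_mfn.
 */
void split_1gb_entry(uint64_t base_mfn, struct p2m_entry l2[ENTRIES_PER_TABLE])
{
    for (unsigned int i = 0; i < ENTRIES_PER_TABLE; i++) {
        l2[i].mfn = base_mfn + (uint64_t)i * PAGES_PER_2MB;
        l2[i].pse = 1;  /* each new entry is itself a 2MB superpage */
    }
}
```

After the split, the L3 entry would point at this L2 table instead of mapping the 1GB page directly.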
>>>
>>>
>>> * p2m_set_entry()
>>>
>>> Configure the PSE bit of L3 P2M table if page order == 18 (1GB).
>>>
>>>
>>>
>>> * p2m_gfn_to_mfn()
>>>
>>> Add support for the 1GB case when doing gfn to mfn translation. When the
>>> L3 entry is marked as POPULATE_ON_DEMAND, we call
>>> p2m_pod_demand_populate(). Otherwise, we do the regular address
>>> translation (gfn ==> mfn).
>>>
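For the regular (non-PoD) 1GB case, the translation amounts to adding the low 18 bits of the gfn to the base mfn stored in the L3 entry. A minimal sketch, assuming 4KB base pages and 512-entry tables; l3_index() and gfn_to_mfn_1gb() are hypothetical helper names, not functions from the patch.

```c
#include <stdint.h>

#define ORDER_1GB 18u  /* a 1GB page spans 2^18 4KB frames */

/* Index of the L3 entry that covers gfn (9 bits per table level). */
unsigned int l3_index(uint64_t gfn)
{
    return (unsigned int)((gfn >> ORDER_1GB) & 0x1ff);
}

/*
 * Translate gfn through a 1GB L3 entry whose first machine frame is
 * l3_base_mfn: keep the offset inside the 1GB page, add it to the base.
 */
uint64_t gfn_to_mfn_1gb(uint64_t l3_base_mfn, uint64_t gfn)
{
    uint64_t offset = gfn & ((UINT64_C(1) << ORDER_1GB) - 1);

    return l3_base_mfn + offset;
}
```

For example, gfn 0x40005 selects L3 index 1 and has offset 5 inside its 1GB page, so with a base mfn of 0x80000 it maps to mfn 0x80005.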
>>>
>>>
>>> * p2m_gfn_to_mfn_current()
>>>
>>> This is similar to p2m_gfn_to_mfn(). When the L3 entry is marked as
>>> POPULATE_ON_DEMAND, it demands a populate using
>>> p2m_pod_demand_populate(). Otherwise, it does a normal translation.
>>> The 1GB page case is taken into consideration.
>>>
>>>
>>>
>>> * set_p2m_entry()
>>>
>>> Request 1GB page
>>>
>>>
>>>
>>> * audit_p2m()
>>>
>>> Support 1GB while auditing p2m table.
>>>
>>>
>>>
>>> * p2m_change_type_global()
>>>
>>> Deal with 1GB page when changing global page type.
>>>
>>>
>>>
>>> === 1gb_p2m_pod.patch ===
>>>
>>> * xen/include/asm-x86/p2m.h
>>>
>>> Minor change to deal with PoD. It separates super page cache list
>>> into 2MB and 1GB lists. Similarly, we record last gpfn of sweeping
>>> for both 2MB and 1GB. 
>>>
>>>
>>>
>>> * p2m_pod_cache_add()
>>>
>>> Check page order and add 1GB super page into PoD 1GB cache list.
>>>
>>>
>>>
>>> * p2m_pod_cache_get()
>>>
>>> Grab a page from cache list. It tries to break 1GB page into 512 2MB
>>> pages if 2MB PoD list is empty. Similarly, 4KB can be requested from
>>> super pages. The breaking order is 2MB then 1GB.
>>>
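The breaking order in p2m_pod_cache_get() can be modelled with simple counters. This is a sketch only: the real cache keeps page lists rather than counts, and struct pod_cache, the POD_* constants and pod_cache_get() are hypothetical names chosen for illustration.

```c
/* Hypothetical model of the PoD cache: number of pages on each list. */
struct pod_cache {
    unsigned long n_1gb;
    unsigned long n_2mb;
    unsigned long n_4kb;
};

enum { POD_4KB, POD_2MB, POD_1GB };

/* Break one 1GB page into 512 2MB pages. Returns -1 if none is cached. */
int break_1gb(struct pod_cache *c)
{
    if (c->n_1gb == 0)
        return -1;
    c->n_1gb--;
    c->n_2mb += 512;
    return 0;
}

/*
 * Take one page of the requested size from the cache, breaking larger
 * pages when the matching list is empty (2MB first, then 1GB).
 * Returns 0 on success, -1 if the cache cannot satisfy the request.
 */
int pod_cache_get(struct pod_cache *c, int size)
{
    switch (size) {
    case POD_1GB:
        if (c->n_1gb == 0)
            return -1;
        c->n_1gb--;
        return 0;
    case POD_2MB:
        if (c->n_2mb == 0 && break_1gb(c) != 0)
            return -1;
        c->n_2mb--;
        return 0;
    case POD_4KB:
        if (c->n_4kb == 0) {
            if (c->n_2mb == 0 && break_1gb(c) != 0)
                return -1;
            c->n_2mb--;        /* break one 2MB page into 512 4KB pages */
            c->n_4kb += 512;
        }
        c->n_4kb--;
        return 0;
    default:
        return -1;
    }
}
```

Starting from a cache holding a single 1GB page, a 4KB request breaks the 1GB page, then one of the resulting 2MB pages, leaving 511 pages on each of the 2MB and 4KB lists.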
>>>
>>>
>>> * p2m_pod_cache_target()
>>>
>>> This function is used to set PoD cache size. To increase PoD target,
>>> we try to allocate 1GB from xen domheap. If this fails, we try 2MB.
>>> If both fail, we try 4KB which is guaranteed to work.
>>>
>>>
>>>
>>> To decrease the target, we use a similar approach. We first try to
>>> free 1GB pages from 1GB PoD cache list. If such request fails, we
>>> try 2MB PoD cache list. If both fail, we try 4KB list.
>>>
>>>
>>>
>>> * p2m_pod_zero_check_superpage_1gb()
>>>
>>> This adds a new function to check for 1GB page. This function is
>>> similar to p2m_pod_zero_check_superpage_2mb().
>>>
>>>
>>>
>>> * p2m_pod_zero_check_superpage_1gb()
>>>
>>> We add a new function to sweep 1GB page from guest memory. This is
>>> the same as p2m_pod_zero_check_superpage_2mb().
>>>
>>>
>>>
>>> * p2m_pod_demand_populate()
>>>
>>> The trick of this function is to do remap_and_retry if
>>> p2m_pod_cache_get() fails. When p2m_pod_cache_get() fails, this
>>> function splits the p2m table entry into smaller ones (e.g. 1GB ==> 2MB
>>> or 2MB ==> 4KB). That guarantees that populate demands always succeed.
>>>
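The remap step can be pictured as replacing one PoD entry with 512 PoD entries one level smaller, after which the faulting access is retried and only the smaller entry containing the gfn needs to be populated. The sketch below shows just the address arithmetic; struct pod_entry and remap_pod_entry() are hypothetical and do not reflect the actual p2m data structures.

```c
#include <stdint.h>

#define ORDER_4KB 0u
#define ORDER_2MB 9u
#define ORDER_1GB 18u

/* Hypothetical record of one PoD (not yet populated) p2m entry. */
struct pod_entry {
    uint64_t     gfn;
    unsigned int order;
};

/*
 * Replace the PoD entry of the given order that covers gfn with 512 PoD
 * entries one level smaller (1GB ==> 2MB or 2MB ==> 4KB).
 * Returns the new order, or -1 if the entry cannot be split further.
 */
int remap_pod_entry(uint64_t gfn, unsigned int order, struct pod_entry out[512])
{
    if (order != ORDER_1GB && order != ORDER_2MB)
        return -1;

    unsigned int sub = order - 9;
    uint64_t aligned = gfn & ~((UINT64_C(1) << order) - 1);

    for (unsigned int i = 0; i < 512; i++) {
        out[i].gfn = aligned + ((uint64_t)i << sub);
        out[i].order = sub;
    }
    return (int)sub;
}
```

A 4KB entry cannot be split any further, so at that level the populate has to be served from the cache directly.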
>>>
>>>
>>>
>>>