From mboxrd@z Thu Jan  1 00:00:00 1970
From: David Vrabel <david.vrabel@citrix.com>
Subject: Re: [PATCH V3 1/1] expand x86 arch_shared_info to
 support >3 level p2m tree
Date: Tue, 16 Sep 2014 12:56:17 +0100
Message-ID: <54182561.6060403@citrix.com>
References: <1410256709-25885-1-git-send-email-jgross@suse.com>	<1410256709-25885-2-git-send-email-jgross@suse.com>	<540ED600.3060102@citrix.com>	<540EDB4F.30402@suse.com>	<5412CB80.9030208@suse.com>	<5416A379.5@citrix.com>	<5416A8CE.5020400@suse.com>	<5416B518.8030504@citrix.com>	<5416B6F1.2020102@suse.com>	<5416BFC2.4040900@citrix.com>	<5416C392.1010707@suse.com>	<5416F809.7060509@citrix.com>
	<5417B40B.4000703@suse.com>	<54180D98.8030903@citrix.com>
	<54181329.7030000@suse.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <54181329.7030000@suse.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Juergen Gross <jgross@suse.com>, David Vrabel <david.vrabel@citrix.com>, Andrew Cooper <andrew.cooper3@citrix.com>, ian.campbell@citrix.com, ian.jackson@eu.citrix.com, jbeulich@suse.com, keir@xen.org, tim@xen.org, xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

On 16/09/14 11:38, Juergen Gross wrote:
> On 09/16/2014 12:14 PM, David Vrabel wrote:
>> On 16/09/14 04:52, Juergen Gross wrote:
>>> On 09/15/2014 04:30 PM, David Vrabel wrote:
>>>> On 15/09/14 11:46, Juergen Gross wrote:
>>>>> So you'd prefer:
>>>>>
>>>>> 1) >512GB pv-domains (including Dom0) will be supported only with new
>>>>>      Xen (4.6?), no matter if the user requires migration to be
>>>>> supported
>>>>
>>>> Yes.  >512 GiB and not being able to migrate are not obviously related
>>>> from the point of view of the end user (unlike assigning a PCI device).
>>>>
>>>> Failing at domain save time is most likely too late for the end user.
>>>
>>> What would you think about following compromise:
>>>
>>> We add a flag that indicates support of multi-level p2m. Additionally
>>> the Linux kernel can ignore the flag not being set either if started as
>>> Dom0 or if told so via kernel parameter.
>>
>> This sounds fine but this override should be via the command line
>> parameter only.  Crash dump analysis tools may not understand the 4
>> level p2m.
>>
>>>>> to:
>>>>>
>>>>> 2) >512GB pv-domains (especially Dom0 and VMs with direct hw
>>>>> access) can
>>>>>      be started on current Xen versions, migration is possible only if
>>>>> Xen
>>>>>      is new (4.6?)
>>>>
>>>> There's also my preferred option:
>>>>
>>>> 3) >512 GiB PV domains are not supported.  Large guests must be PVH or
>>>> PVHVM.
>>>
>>> In theory okay, but not right now, I think. PVH Dom0 is not production
>>> ready.
>>
>> I'm not really seeing the need for such a large dom0.
> 
> Okay, then I'd come back to V1 of my patches. This is the minimum
> required to be able to boot up a system with Xen and more than 512GB
> memory without having to reduce the Dom0 memory via Xen boot parameter.
> 
> Otherwise the hypervisor built mfn_list mapped into the initial address
> space will be too large.
> 
> And no, I don't think setting the boot parameter is the solution here.
> Dom0 should be usable on a huge machine without special parameters.

Ok. The case where dom0's p2m format matters is pretty specialized.

>> I also think a flat array for the p2m might be better (less complex).
>> There's plenty of virtual address space in a 64-bit guest to allow for
>> this.
> 
> Hmm, do you think we could reserve an area of many GBs for Xen in
> virtual space? I suspect this would be rejected as another "Xen-ism".

alloc_vm_area() can be used to reserve the virtual range.
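
For scale, here is a rough sketch of the sizes involved. It assumes 4 KiB
pages and 8-byte p2m entries; the macro and helper names are made up for
illustration and are not existing kernel or Xen identifiers.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES   4096ULL   /* assumption: x86-64 base page size */
#define ENTRY_SIZE_BYTES  8ULL      /* assumption: one unsigned long per p2m entry */
#define ENTRIES_PER_PAGE  (PAGE_SIZE_BYTES / ENTRY_SIZE_BYTES)  /* 512 */

/*
 * Hypothetical helper: bytes of guest memory that a p2m tree with the
 * given number of levels can describe, if every level is one page of
 * entries and each leaf entry maps one page.
 */
static uint64_t tree_coverage_bytes(unsigned int levels)
{
    uint64_t pfns = 1;
    unsigned int i;

    for (i = 0; i < levels; i++)
        pfns *= ENTRIES_PER_PAGE;

    return pfns * PAGE_SIZE_BYTES;
}

/*
 * Hypothetical helper: bytes of virtual address space a flat
 * (one entry per pfn) p2m array needs for guest_bytes of memory.
 */
static uint64_t flat_p2m_bytes(uint64_t guest_bytes)
{
    return (guest_bytes / PAGE_SIZE_BYTES) * ENTRY_SIZE_BYTES;
}
```

Under those assumptions a 3-level tree tops out at 512 GiB, and a flat
array needs 1 GiB of virtual space per 512 GiB of guest memory.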

> BTW: the mfn_list_list will still be required to be built as a tree.

The tools could be given the guest virtual address and walk the guest
page tables.

This is probably too much of a difference from the existing ABI to be
worth pursuing at this point.

David