Date: Wed, 12 Jun 2013 11:15:13 +0100
From: George Dunlap
To: Jan Beulich
Cc: Tim Deegan, Yongjie Ren, yanqiangjun@huawei.com, Keir Fraser,
 Ian Campbell, hanweidong@huawei.com, Xudong Hao, Stefano Stabellini,
 luonengjun@huawei.com, qemu-devel@nongnu.org, wangzhenguo@huawei.com,
 xiaowei.yang@huawei.com, arei.gonglei@huawei.com, Paolo Bonzini,
 YongweiX Xu, SongtaoX Liu, xen-devel@lists.xensource.com
Subject: Re: [Qemu-devel] [Xen-devel] [BUG 1747] Guest couldn't find
 bootable device with memory more than 3600M
Message-ID: <51B84A31.8060003@eu.citrix.com>
In-Reply-To: <51B8657302000078000DD7FD@nat28.tlf.novell.com>

On 12/06/13 11:11, Jan Beulich wrote:
>>>> On 12.06.13 at 12:05, George Dunlap wrote:
>> On 12/06/13 08:25, Jan Beulich wrote:
>>>>>> On 11.06.13 at 19:26, Stefano Stabellini wrote:
>>>> I went through the code that maps the PCI MMIO regions in hvmloader
>>>> (tools/firmware/hvmloader/pci.c:pci_setup) and it looks like it
>>>> already maps the PCI region to high memory if the PCI BAR is 64-bit
>>>> and the MMIO region is larger than 512MB.
>>>>
>>>> Maybe we could just relax this condition and map the device memory
>>>> to high memory no matter the size of the MMIO region, as long as
>>>> the PCI BAR is 64-bit?
>>> I can only recommend against that: for one, guests not using PAE or
>>> PSE-36 can't map such space at all (and older OSes may not deal
>>> properly with 64-bit BARs at all). And then one would generally
>>> expect this allocation to be done top-down (to minimize the risk of
>>> running into RAM), and doing so presents further risks of
>>> incompatibility with guest OSes (Linux, for example, learned only in
>>> 2.6.36 that PFNs in ioremap() can exceed 32 bits; even in 3.10-rc5,
>>> ioremap_pte_range(), while using "u64 pfn", passes the PFN to
>>> pfn_pte(), whose corresponding parameter is "unsigned long").
>>>
>>> I think this ought to be done as an iterative process: if all MMIO
>>> regions together don't fit below 4G, the biggest one should be moved
>>> up beyond 4G first, followed by the next biggest, and so on.
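
To make the ioremap() point concrete, here is a minimal user-space
sketch (not actual kernel code) of the narrowing described above;
kernel_ulong and pfn_pte_stub are illustrative stand-ins for the
kernel's "unsigned long" and pfn_pte():

#include <stdint.h>
#include <stdio.h>

typedef uint64_t u64;
typedef uint32_t kernel_ulong;   /* "unsigned long" on a 32-bit kernel */

/* Stand-in for pfn_pte(): the PFN is already narrowed at the call. */
static kernel_ulong pfn_pte_stub(kernel_ulong pfn)
{
    return pfn;                  /* the kernel would build a PTE here */
}

int main(void)
{
    u64 pfn = 0x400000000ULL;    /* a PFN that needs more than 32 bits */

    printf("u64 pfn in ioremap_pte_range(): %#llx\n",
           (unsigned long long)pfn);
    printf("pfn as seen by pfn_pte():       %#x\n",
           (unsigned int)pfn_pte_stub((kernel_ulong)pfn));
    /* The second line prints 0: the high bits are silently dropped,
     * so the resulting PTE would point at the wrong physical page. */
    return 0;
}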
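
And the iterative placement suggested above could look roughly like
the following sketch; bar_t, HOLE_SIZE and the example sizes are
illustrative names and numbers, not hvmloader identifiers:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    const char *name;
    uint64_t    size;      /* BAR size in bytes (power of two) */
    bool        is_64bit;  /* BAR may be programmed above 4G   */
    bool        high;      /* placed above 4G by this pass     */
} bar_t;

#define HOLE_SIZE (512ULL << 20)   /* example 32-bit MMIO hole: 512MB */

static uint64_t low_total(const bar_t *bars, int n)
{
    uint64_t sum = 0;
    for (int i = 0; i < n; i++)
        if (!bars[i].high)
            sum += bars[i].size;
    return sum;
}

int main(void)
{
    bar_t bars[] = {
        { "gfx", 256ULL << 20, true,  false },
        { "nic", 128ULL << 20, true,  false },
        { "hba", 256ULL << 20, false, false },
    };
    int n = sizeof(bars) / sizeof(bars[0]);

    /* While the hole overflows, relocate the biggest 64-bit-capable
     * BAR that is still below 4G, then re-check. */
    while (low_total(bars, n) > HOLE_SIZE) {
        int biggest = -1;
        for (int i = 0; i < n; i++)
            if (bars[i].is_64bit && !bars[i].high &&
                (biggest < 0 || bars[i].size > bars[biggest].size))
                biggest = i;
        if (biggest < 0)
            break;             /* nothing left that we may relocate */
        bars[biggest].high = true;
    }

    for (int i = 0; i < n; i++)
        printf("%s: %s 4G\n", bars[i].name,
               bars[i].high ? "above" : "below");
    return 0;
}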
>> First of all, the proposal to move the PCI BAR up to the 64-bit
>> range is a temporary work-around. It should only be done if a device
>> doesn't fit in the current MMIO range.
>>
>> We have four options here:
>> 1. Don't do anything.
>> 2. Have hvmloader move PCI devices up to the 64-bit MMIO hole if
>> they don't fit.
>> 3. Convince qemu to allow MMIO regions to mask memory (or what it
>> thinks is memory).
>> 4. Add a mechanism to tell qemu that memory is being relocated.
>>
>> Number 4 is definitely the right answer long-term, but we just don't
>> have time to do that before the 4.3 release. We're not sure yet
>> whether #3 is possible; even if it is, it may have unpredictable
>> knock-on effects.
>>
>> With #2, it is true that many guests will be unable to access the
>> device because of 32-bit limitations. However, with #1, *no* guests
>> will be able to access the device; at least with #2, *many* guests
>> will be able to do so. In any case, #2 is apparently what KVM does,
>> so imposing this limitation on guests is not without precedent. It's
>> also likely to be a somewhat tested configuration (unlike #3, for
>> example).
> That's all fine with me. My objection was to Stefano's suggestion of
> assigning high addresses to _all_ 64-bit capable BARs, not just the
> biggest one(s).

Oh right -- I understood him to mean, "*allow* hvmloader to map the
device memory to high memory *if necessary*, if the BAR is 64-bit". I
agree that mapping them all at 64-bit, even when there's room in the
32-bit hole, isn't a good idea.

 -George
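
For reference, the two hvmloader policies discussed at the top of the
thread -- the current "64-bit BAR and larger than 512MB" check and the
proposed relaxation -- can be rendered schematically like this (a
paraphrase with illustrative names, not the actual
tools/firmware/hvmloader/pci.c code):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Current behaviour as described: place high only when the BAR is
 * 64-bit capable AND its region is larger than 512MB. */
static bool place_high_current(bool bar_is_64bit, uint64_t bar_size)
{
    return bar_is_64bit && bar_size > (512ULL << 20);
}

/* Proposed relaxation: any 64-bit capable BAR may be placed high,
 * regardless of size -- the idea argued against above. */
static bool place_high_relaxed(bool bar_is_64bit, uint64_t bar_size)
{
    (void)bar_size;
    return bar_is_64bit;
}

int main(void)
{
    uint64_t size = 256ULL << 20;   /* a 256MB 64-bit BAR */

    printf("current policy: %s\n",
           place_high_current(true, size) ? "high" : "low");
    printf("relaxed policy: %s\n",
           place_high_relaxed(true, size) ? "high" : "low");
    return 0;
}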