From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:50592)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <marcel@redhat.com>) id 1bEfLl-00054e-KF
	for qemu-devel@nongnu.org; Sun, 19 Jun 2016 12:13:31 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <marcel@redhat.com>) id 1bEfLh-0000NV-CI
	for qemu-devel@nongnu.org; Sun, 19 Jun 2016 12:13:28 -0400
Received: from mx1.redhat.com ([209.132.183.28]:40012)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <marcel@redhat.com>) id 1bEfLh-0000Mn-49
	for qemu-devel@nongnu.org; Sun, 19 Jun 2016 12:13:25 -0400
Received: from int-mx09.intmail.prod.int.phx2.redhat.com
	(int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mx1.redhat.com (Postfix) with ESMTPS id 05AC690E5A
	for <qemu-devel@nongnu.org>; Sun, 19 Jun 2016 16:13:22 +0000 (UTC)
References: <1466097133-5489-1-git-send-email-dgilbert@redhat.com>
	<1466097133-5489-5-git-send-email-dgilbert@redhat.com>
	<20160616202449.GY18662@thinpad.lan.raisama.net>
	<20160617081505.GA2273@work-vm>
	<08f8e4e0-781a-d7f2-9008-3274f8a085eb@redhat.com>
	<1466155074.18921.16.camel@redhat.com>
	<20160617115239.035fb544@nial.brq.redhat.com>
	<2cebe3e1-4d22-ef3b-d1d7-734f1b2371df@redhat.com>
From: Marcel Apfelbaum <marcel@redhat.com>
Message-ID: <5766C49D.2000102@redhat.com>
Date: Sun, 19 Jun 2016 19:13:17 +0300
MIME-Version: 1.0
In-Reply-To: <2cebe3e1-4d22-ef3b-d1d7-734f1b2371df@redhat.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 4/5] x86: Allow physical address bits to be
 set
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Laszlo Ersek <lersek@redhat.com>, Igor Mammedov <imammedo@redhat.com>, Gerd Hoffmann <kraxel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>, aarcange@redhat.com, qemu-devel@nongnu.org, "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Eduardo Habkost <ehabkost@redhat.com>

On 06/17/2016 07:07 PM, Laszlo Ersek wrote:
> On 06/17/16 11:52, Igor Mammedov wrote:
>> On Fri, 17 Jun 2016 11:17:54 +0200
>> Gerd Hoffmann <kraxel@redhat.com> wrote:
>>
>>> On Fr, 2016-06-17 at 10:43 +0200, Paolo Bonzini wrote:
>>>>
>>>> On 17/06/2016 10:15, Dr. David Alan Gilbert wrote:
>>>>> Larger is a problem if the guest tries to map something to a high
>>>>> address that's not addressable.
>>>>
>>>> Right.  It's not a problem for most emulated PCI devices (it would be a
>>>> problem for those that have large RAM BARs, but even our emulated video
>>>> cards do not have 64-bit RAM BARs, I think;
>>>
>>> qxl can be configured to have one, try "-device
>>> qxl-vga,vram64_size_mb=1024"
>>>
>>>>>     2) While we have maxmem settings to tell us the top of VM RAM, do
>>>>>        we have anything that tells us the top of IO space? What happens
>>>>>        when we hotplug a PCI card?
>>>
>>>> (arch/x86/kernel/setup.c) but I agree that (2) is a blocker.
>>>
>>> seabios maps stuff right above ram (possibly with a hole due to
>>> alignment requirements).
>>>
>>> ovmf maps stuff into a 32G-aligned 32G hole.  Which lands at 32G and
>>> therefore is addressable with 36 bits, unless you have tons of ram (>
>>> 30G) assigned to your guest.  A physical host machine where you can plug
>>> in enough ram for such a configuration likely has more than 36 physical
>>> address lines too ...
>>>
>>> qemu checks where the firmware mapped 64bit bars, then adds those ranges
>>> to the root bus pci resources in the acpi tables (see /proc/iomem).
>>>
>>>> You don't know how the guest will assign PCI BAR addresses, and as you
>>>> said there's hotplug too.
>>>
>>> Not sure whether qemu adds some extra space for hotplug to the 64bit
>>> hole and if so how it calculates the size then.  But the guest os should
>>> stick to those ranges when configuring hotplugged devices.
>> currently firmware would assign 64-bit BARs after reserved-memory-end
>> (not sure about ovmf though)
>
> OVMF does the same as well. It makes sure that the 64-bit PCI MMIO
> aperture is located above "etc/reserved-memory-end", if the latter exists.
>
>> but QEMU on ACPI side will add 64-bit _CRS only
>> for firmware mapped devices (i.e. no space reserved for hotplug).
>> And if I recall correctly ovmf won't map BARs if it doesn't have
>> a driver for it
>
> Yes, that's correct, generally for all UEFI firmware.
>
> More precisely, BARs will be allocated and programmed, but the MMIO
> space decoding bit will not be set (permanently) in the device's command
> register, if there is no matching driver in the firmware (or in the
> device's own oprom).
>
>> so ACPI tables won't even have a space for not mapped
>> 64-bit BARs.
>
> This used to be true, but that's not the case since
> <https://github.com/tianocore/edk2/commit/8f35eb92c419>.
>
> Namely, specifically for conforming to QEMU's ACPI generator, OVMF
> *temporarily* enables, as a platform quirk, all PCI devices present in
> the system, before triggering QEMU to generate the ACPI payload.
>
> Thus, nowadays 64-bit BARs work fine with OVMF, both for virtio-modern
> devices, and assigned physical devices. (This is very easy to test,
> because, unlike SeaBIOS, the edk2 stuff built into OVMF prefers to
> allocate 64-bit BARs outside of the 32-bit address space.)
>
> Devices behind PXBs are a different story, but Marcel's been looking
> into that, see <https://bugzilla.redhat.com/show_bug.cgi?id=1323976>.
>
>> There was another attempt to reserve more space in _CRS
>>    https://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg00090.html
>
> That's actually Marcel's first own patch set for addressing RHBZ#1323976
> that I mentioned above (see it linked in
> <https://bugzilla.redhat.com/show_bug.cgi?id=1323976#c2>).
>
> It might have wider effects, but it is entirely motivated, to my
> knowledge, by PXB. If you don't have extra root bridges, and/or you plug
> all your devices with 64-bit MMIO BARs into the "main" (default) root
> bridge, then (I believe) that patch set is not supposed to make any
> difference. (I could be wrong, it's been a while since I looked at
> Marcel's work!)
>

Patches 3 and 4 are indeed for PXB only, but the patch 'pci: reserve 64 bit MMIO range for PCI hotplug'
(see https://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg00091.html) tries
to reserve the [above_4g_mem_size, max_addressable_cpu_bits] range for PCI hotplug.

The implementation is not good enough because the number of addressable bits is hard-coded.
However, we now have David's wrapper, which I can use instead.
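
To make the idea concrete, here is a minimal sketch of how such a hotplug
window could be derived once the number of physical address bits is a
parameter rather than a constant. This is not the code from the patch: the
HotplugWindow type, the hotplug_window() function and the ram_end parameter
are hypothetical names used for illustration only, and ram_end is assumed to
be the first guest-physical address above RAM and any reserved memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type for illustration; not a QEMU structure. */
typedef struct HotplugWindow {
    uint64_t base;   /* first address of the window */
    uint64_t limit;  /* last address of the window (inclusive) */
} HotplugWindow;

/*
 * Compute the 64-bit MMIO range left over for PCI hotplug.
 *
 * ram_end:   assumed to be the first address above guest RAM and
 *            reserved memory (hypothetical parameter).
 * phys_bits: guest-visible physical address width, passed in by the
 *            caller instead of being hard-coded.
 *
 * Returns false when nothing addressable is left above ram_end.
 */
static bool hotplug_window(uint64_t ram_end, unsigned phys_bits,
                           HotplugWindow *w)
{
    /* Keep the shift below well defined. */
    assert(phys_bits >= 32 && phys_bits < 64);

    /* Highest address the guest CPU can generate. */
    uint64_t top = (UINT64_C(1) << phys_bits) - 1;

    if (ram_end > top) {
        return false;
    }

    w->base = ram_end;
    w->limit = top;
    return true;
}
```

With phys_bits = 36 and ram_end at 5G, this yields a window from 0x140000000
up to 0xFFFFFFFFF; with a larger phys_bits the window grows accordingly, which
is why the value must not be hard-coded.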


Thanks,
Marcel


> Thanks
> Laszlo
>