From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39066) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z4VTs-00075G-7F for qemu-devel@nongnu.org; Mon, 15 Jun 2015 10:35:21 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Z4VTn-0001cJ-Mx for qemu-devel@nongnu.org; Mon, 15 Jun 2015 10:35:20 -0400 Received: from mx1.redhat.com ([209.132.183.28]:45644) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z4VTn-0001a5-Fg for qemu-devel@nongnu.org; Mon, 15 Jun 2015 10:35:15 -0400 Received: from int-mx14.intmail.prod.int.phx2.redhat.com (int-mx14.intmail.prod.int.phx2.redhat.com [10.5.11.27]) by mx1.redhat.com (Postfix) with ESMTPS id 05715350104 for ; Mon, 15 Jun 2015 14:35:15 +0000 (UTC) Message-ID: <557EE29C.7090000@redhat.com> Date: Mon, 15 Jun 2015 16:35:08 +0200 From: Laszlo Ersek MIME-Version: 1.0 References: <557C32AC.7030903@redhat.com> <1434203538-8075-1-git-send-email-lersek@redhat.com> <1434203538-8075-2-git-send-email-lersek@redhat.com> <87twu9m59d.fsf@blackfin.pond.sub.org> In-Reply-To: <87twu9m59d.fsf@blackfin.pond.sub.org> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v4 1/4] hw/pci-bridge: create interrupt-less, hotplug-less bridge for PXB List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Markus Armbruster Cc: Marcel Apfelbaum , qemu-devel@nongnu.org, "Michael S. Tsirkin" On 06/15/15 16:10, Markus Armbruster wrote: > Laszlo Ersek writes: > >> OVMF downloads the ACPI linker/loader script from QEMU when the edk2 PCI >> Bus driver globally signals the firmware that PCI enumeration and resource >> allocation have completed. At this point QEMU regenerates the ACPI payload >> in an fw_cfg read callback, and this is when the PXB's _CRS gets >> populated. >> >> Unfortunately, when this happens, the PCI_COMMAND_MEMORY bit is clear in >> the root bus's command register, *unlike* under SeaBIOS. The consequences >> unfold as follows: >> >> - When build_crs() fetches dev->io_regions[i].addr, it is all-bits-one, >> because pci_update_mappings() --> pci_bar_address() calculated it as >> PCI_BAR_UNMAPPED, due to the PCI_COMMAND_MEMORY bit being clear. >> >> - Consequently, the SHPC MMIO BAR (bar 0) of the bridge is not added to >> the _CRS, *despite* having been programmed in PCI config space. >> >> - Similarly, the SHPC MMIO BAR of the PXB is not removed from the main >> root bus's DWordMemory descriptor. >> >> - Guest OSes (Linux and Windows alike) notice the pre-programmed SHPC BAR >> within the PXB's config space, and notice that it conflicts with the >> main root bus's memory resource descriptors. Linux reports >> >> pci 0000:04:00.0: BAR 0: can't assign mem (size 0x100) >> pci 0000:04:00.0: BAR 0: trying firmware assignment [mem >> 0x88200000-0x882000ff 64bit] >> pci 0000:04:00.0: BAR 0: [mem 0x88200000-0x882000ff 64bit] conflicts >> with PCI Bus 0000:00 [mem >> 0x88200000-0xfebfffff] >> >> While Windows Server 2012 R2 reports >> >> https://technet.microsoft.com/en-us/library/cc732199%28v=ws.10%29.aspx >> >> This device cannot find enough free resources that it can use. If you >> want to use this device, you will need to disable one of the other >> devices on this system. (Code 12) >> >> This issue was apparently encountered earlier, see the "hack" in: >> >> https://lists.nongnu.org/archive/html/qemu-devel/2015-01/msg02983.html >> >> and the current hole-punching logic in build_crs() and build_ssdt() is >> probably supposed to remedy exactly that problem -- however, for OVMF they >> don't work, because at the end of the PCI enumeration and resource >> allocation, which cues the ACPI linker/loader client, the command register >> is clear. >> >> It has been suggested to implement the above "hack" more cleanly and >> permanently. Unfortunately, we can't just disable the SHPC bar of >> TYPE_PCI_BRIDGE_DEV based on a new property, because this device model has >> class-level ties to hotplug, and the SHPC bar (and the interrupt) >> originate from there. >> >> Therefore implement a more basic bridge device type, to be used by PXB, >> that has no SHPC / hotplug support at all, and consequently (see commit >> c008ac0c), no need for an interrupt / MSI either. >> >> Suggested-by: Marcel Apfelbaum >> Cc: Marcel Apfelbaum >> Cc: Michael S. Tsirkin >> Signed-off-by: Laszlo Ersek >> Reviewed-by: Marcel Apfelbaum > > Suggest to have the commit message state explicitly how > "pci-basic-bridge" is related to "pci-bridge". If I understand it > correctly, it's a stripped-down copy. A comment in the code might be in > order, too. > > We'd normally avoid copying, and instead have the same code implement > both variants of the device. I figure your reason for copying is that > the "copy" implementing "pci-basic-bridge" is so small and simple that > avoiding it wouldn't make things simpler or more maintainable. Suggest > to put that rationale into the commit message as well. That reason is a valid one, yes, but it's not the main reason. The main reason is that I cannot implement this specific flavor of the device, in the original device model, without (potentially) crashing outgoing migration. The reason is that this stripped down version (regardless of *how* it is implemented) will never initialize the "shpc" pointer in the grandparent PCIDevice object (-> "grandparent" used in the class hierarchy sense) -- as it will never call shpc_init(). Leaving "PCIDevice.shpc" at NULL is not a problem at all per se. Many (practically: all) other devices derived from PCIDevice do *not* call shpc_init() either, and they are all fine. However, the vmstate definition for the original PciBridgeDev -- that is, "pci_bridge_dev_vmstate" -- tries to migrate "PCIDevice.shpc" *unconditionally*. (Ie. it is not a subsection.) Meaning, if I add a property-controlled variant to PciBridgeDev that leaves "PCIDevice.shpc" at NULL, then outgoing migration will dereference that null pointer, in: SHPC_VMSTATE() VMSTATE_BUFFER_UNSAFE_INFO(... shpc_vmstate_info ...) shpc_save() qemu_put_buffer(f, d->shpc->config, SHPC_SIZEOF(d)); Therefore I'm forced to detach this new device model completely -- I have to eliminate that unconditional SHPC_VMSTATE() too. I could rework the original device model to use a property, *and* out-migrate the shpc field as an optional *subsection*. That would not crash. However, if I changed the original device model like this, then it could not *accept* a migration stream from earlier QEMU versions, because they would not send the SHPC as a subsection; they'd send it as a fixed field. I agree that this should be documented in the commit message; I'll write it up in the next version. Thanks Laszlo