From: Don Slutz
Subject: Re: [RFC PATCH 1/1] Add pci_hole_min_size
Date: Tue, 11 Mar 2014 13:16:48 -0400
Message-ID: <531F4500.703@terremark.com>
References: <1393618535-9587-1-git-send-email-dslutz@verizon.com>
 <1393618535-9587-2-git-send-email-dslutz@verizon.com>
 <53162216.8070607@terremark.com>
 <20140307192832.GA9568@andromeda.dapyr.net>
To: George Dunlap, Konrad Rzeszutek Wilk
Cc: Ian Campbell, Stefano Stabellini, Ian Jackson, Don Slutz,
 "xen-devel@lists.xen.org", Gordan Bobic
List-Id: xen-devel@lists.xenproject.org

On 03/11/14 08:54, George Dunlap wrote:
> On Fri, Mar 7, 2014 at 7:28 PM, Konrad Rzeszutek Wilk wrote:
>> On Tue, Mar 04, 2014 at 01:57:26PM -0500, Don Slutz wrote:
>>> On 03/04/14 08:25, George Dunlap wrote:
>>>> On Fri, Feb 28, 2014 at 8:15 PM, Don Slutz wrote:
>>>>> This allows growing the pci_hole to the size needed.
>>>> You mean, it allows the pci hole size to be specified at boot
>>> Yes.
>>>
>>>> -- the
>>>> pci hole still cannot be enlarged dynamically in hvmloader, correct?
>>> If I am correctly understanding you, this is in reference to:
>>>
>>>     /*
>>>      * At the moment qemu-xen can't deal with relocated memory regions.
>>>      * It's too close to the release to make a proper fix; for now,
>>>      * only allow the MMIO hole to grow large enough to move guest memory
>>>      * if we're running qemu-traditional.  Items that don't fit will be
>>>      * relocated into the 64-bit address space.
>>>      */
>>>
>>> so the answer is no; however, using pci_hole_min_size can mean that
>>> allow_memory_relocate is not needed for upstream QEMU.
>>>
>>>> What's your intended use case for this?
>>>>
>>>>  -George
>>> If you add enough PCI devices then all of the mmio may not fit below 4G,
>>> which may not be the layout the user wanted.  This allows you to increase
>>> the below-4G address space that PCI devices can use and therefore in more
>>> cases not have any mmio that is above 4G.
>>>
>>> There are real PCI cards that do not support mmio over 4G, so if you want
>>> to emulate them precisely, you may also need to increase the space below
>>> 4G for them.  There are drivers for these cards that also do not work if
>>> they have their mmio space mapped above 4G.
>> Would it be better if the HVM guests had something similar to what we
>> manufacture for PV guests with PCI passthrough: a filtered version of
>> the host's E820?
>>
>> That way you don't have to worry about resizing just right and instead
>> the E820 looks like the host's one.  Granted you can't migrate, but I
>> don't think that is a problem in your use-case?
> Having the guest PCI hole the same size as the host PCI hole also gets
> rid of a whole class of (unfortunately very common) bugs in PCI
> hardware, such that if guest paddrs overlap with device IO
> ranges the PCI hardware sends the DMA requests to the wrong place.
> (In other words, VT-d as implemented in a very large number of
> motherboards is utterly broken -- total fail on someone's part.)
>
> The main disadvantage of this is that it unnecessarily reduces the
> amount of lowmem available -- and for 32-bit non-PAE guests, reduces
> the total amount of memory available at all.
> I think long-term, it would be best to:
>  * Have the pci hole be small for VMs without devices passed through
>  * Have the pci hole default to the host pci hole for VMs with devices
>    passed through
>  * Have the pci hole size able to be specified, either as a size, or as "host".
>
> As long as the size specification can be extended to this
> functionality easily, I think just having a size to begin with is OK.

I see no problem with extending to add "host".  So I am starting with just
a size.

Note: the new QEMU way is simpler to decode from an e820 map.  For example:

Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] e820: BIOS-provided physical RAM map:
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x0000000000000000-0x000000000009afff] usable
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x000000000009b800-0x00000000000fffff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x0000000000100000-0x00000000bf63efff] usable
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000bf63f000-0x00000000bf6befff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000bf6bf000-0x00000000bf7befff] ACPI NVS
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000bf7bf000-0x00000000bf7fefff] ACPI data
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000bf7ff000-0x00000000bf7fffff] usable
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000bf800000-0x00000000bfffffff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000e0000000-0x00000000efffffff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000feb00000-0x00000000feb03fff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000fed10000-0x00000000fed19fff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000fee00000-0x00000000feefffff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000ffd80000-0x00000000ffffffff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x0000000100000000-0x00000005c0e16fff] usable
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000005c0e17000-0x000000083fffffff] unusable

pci_hole_min_size = 1082130432 (0x40800000) for 0xbf800000
pci_hole_min_size = 536870912 (0x20000000) for 0xe0000000

Note: it does not leap out at me from the e820 map which of these is the
host one.

> I think the qemu guys didn't like the term "pci_hole" and wanted
> something like "lowmem" instead -- that will need to be sorted out.

The next version of the QEMU patch is out with the new name.  I was not
sure what name or size makes the most sense.

lowmem == (1 << 32) - pci_hole_min_size.

   -Don Slutz

> -George
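
For reference, a minimal sketch (not part of the patch; the helper name is
just for illustration) of the arithmetic above: given a low-memory boundary
read off the e820 map, pci_hole_min_size is 4G minus that boundary, and
lowmem is the boundary itself.  The two boundaries below are the reserved
ranges starting at 0xbf800000 and 0xe0000000 in the example map.

#include <inttypes.h>
#include <stdio.h>

/* Print the pci_hole_min_size and lowmem implied by one e820 boundary. */
static void show(uint64_t boundary)
{
    uint64_t hole   = (1ULL << 32) - boundary;   /* pci_hole_min_size          */
    uint64_t lowmem = (1ULL << 32) - hole;       /* == boundary, by definition */

    printf("boundary 0x%" PRIx64 ": pci_hole_min_size = %" PRIu64
           " (0x%" PRIx64 "), lowmem = 0x%" PRIx64 "\n",
           boundary, hole, hole, lowmem);
}

int main(void)
{
    show(0xbf800000ULL);   /* first reserved range after low RAM     */
    show(0xe0000000ULL);   /* start of the 0xe0000000 reserved range */
    return 0;
}

This reproduces the two candidate values quoted above; which of them matches
the host's real PCI hole is exactly the part that is not obvious from the map.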