xen-devel.lists.xenproject.org archive mirror
From: Don Slutz <dslutz@verizon.com>
To: George Dunlap <George.Dunlap@eu.citrix.com>,
	Konrad Rzeszutek Wilk <konrad@darnok.org>
Cc: Ian Campbell <ian.campbell@citrix.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	Don Slutz <dslutz@verizon.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	Gordan Bobic <gordan@bobich.net>
Subject: Re: [RFC PATCH 1/1] Add pci_hole_min_size
Date: Tue, 11 Mar 2014 13:16:48 -0400	[thread overview]
Message-ID: <531F4500.703@terremark.com> (raw)
In-Reply-To: <CAFLBxZahEBrHP=yEOQAZjXkts-zMqigmHuUA_JRumTz8c=N8-A@mail.gmail.com>

On 03/11/14 08:54, George Dunlap wrote:
> On Fri, Mar 7, 2014 at 7:28 PM, Konrad Rzeszutek Wilk <konrad@darnok.org> wrote:
>> On Tue, Mar 04, 2014 at 01:57:26PM -0500, Don Slutz wrote:
>>> On 03/04/14 08:25, George Dunlap wrote:
>>>> On Fri, Feb 28, 2014 at 8:15 PM, Don Slutz <dslutz@verizon.com> wrote:
>>>>> This allows growing the pci_hole to the size needed.
>>>> You mean, it allows the pci hole size to be specified at boot
>>> Yes.
>>>
>>>>   -- the
>>>> pci hole still cannot be enlarged dynamically in hvmloader, correct?
>>> If I am correctly understanding you, this is in reference to:
>>>
>>> /*
>>>       * At the moment qemu-xen can't deal with relocated memory regions.
>>>       * It's too close to the release to make a proper fix; for now,
>>>       * only allow the MMIO hole to grow large enough to move guest memory
>>>       * if we're running qemu-traditional.  Items that don't fit will be
>>>       * relocated into the 64-bit address space.   */
>>>
>>>
>>> so the answer is no, however using pci_hole_min_size can mean that
>>> allow_memory_relocate is not needed for upstream QEMU.
>>>
>>>
>>>
>>>> What's your intended use case for this?
>>>>
>>>>   -George
>>> If you add enough PCI devices then all mmio may not fit below 4G which may
>>> not be the layout the user wanted. This allows you to increase the below 4G
>>> address space that PCI devices can use and therefore in more cases not have
>>> any mmio that is above 4G.
>>>
>>> There are real PCI cards that do not support mmio over 4G, so if you want
>>> to emulate them precisely, you may also need to increase the space below 4G
>>> for them.  There are drivers for these cards that also do not work if they
>>> have their mmio space mapped above 4G.
>> Would it be better if the HVM guests had something similar to what we
>> manufacture for PV guests with PCI passthrough: an filtered version of
>> the host's E820?
>>
>> That way you don't have to worry about resizing just right and instead
>> the E820 looks like the hosts one. Granted you can't migrate, but I
>> don't think that is a problem in your use-case?
> Having the guest PCI hole the same size as the host PCI hole also gets
> rid of a whole class of (unfortunately very common) bugs in PCI
> hardware, where, if guest paddrs overlap with device IO ranges, the
> PCI hardware sends the DMA requests to the wrong place.  (In other
> words, VT-d as implemented on a very large number of motherboards is
> utterly broken -- total fail on someone's part.)
>
> The main disadvantage of this is that it unnecessarily reduces the
> amount of lowmem available -- and for 32-bit non-PAE guests, reduces
> the total amount of memory available at all.
>
> I think long-term, it would be best to:
> * Have the pci hole be small for VMs without devices passed through
> * Have the pci hole default to the host pci hole for VMs with devices
> passed through
> * Have the pci hole size able to be specified, either as a size, or as "host".
>
> As long as the size specification can be extended to this
> functionality easily, I think just having a size to begin with is OK.

I see no problem with extending this to add "host", so I am starting with just a size.  Note: the new QEMU option is simpler to decode from an e820 map.

For example:


Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] e820: BIOS-provided physical RAM map:
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x0000000000000000-0x000000000009afff] usable
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x000000000009b800-0x00000000000fffff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x0000000000100000-0x00000000bf63efff] usable
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000bf63f000-0x00000000bf6befff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000bf6bf000-0x00000000bf7befff] ACPI NVS
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000bf7bf000-0x00000000bf7fefff] ACPI data
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000bf7ff000-0x00000000bf7fffff] usable
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000bf800000-0x00000000bfffffff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000e0000000-0x00000000efffffff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000feb00000-0x00000000feb03fff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000fed10000-0x00000000fed19fff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000fee00000-0x00000000feefffff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000000ffd80000-0x00000000ffffffff] reserved
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x0000000100000000-0x00000005c0e16fff] usable
Mar 10 13:08:28 dcs-xen-54 kernel: [    0.000000] Xen: [mem 0x00000005c0e17000-0x000000083fffffff] unusable


pci_hole_min_size =  1082130432 (0x40800000) for 0xbf800000
pci_hole_min_size =  536870912 (0x20000000) for 0xe0000000

Note: it does not leap out at me from the e820 map which of these is the host's actual hole size.
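The two candidate values quoted above are just 4 GiB minus a hole-start address taken from the e820 map (my own arithmetic sketch, not code from the patch):

```python
GIB4 = 1 << 32  # 0x1_0000_0000, top of the 32-bit address space

def pci_hole_min_size(hole_start: int) -> int:
    """Size of the MMIO hole below 4 GiB for a given hole start address."""
    assert 0 < hole_start < GIB4
    return GIB4 - hole_start

# The two candidates read off the e820 map above: the end of usable RAM
# below 4 GiB, and the first large reserved (MMCONFIG) region.
print(hex(pci_hole_min_size(0xbf800000)))  # 0x40800000 == 1082130432
print(hex(pci_hole_min_size(0xe0000000)))  # 0x20000000 == 536870912
```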

> I think the qemu guys didn't like the term "pci_hole" and wanted
> something like "lowmem" instead -- that will need to be sorted out.

The next version of the QEMU patch is out with a new name.  I was not sure which name or size makes the most sense.


lowmem == (1 << 32) - pci_hole_min_size.
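Spelling that relation out with the two example sizes from the e820 map earlier in this mail (pure arithmetic; note the parenthesisation matters, since in C `1 << 32 - x` would parse as `1 << (32 - x)`):

```python
# lowmem is what remains below 4 GiB once the PCI hole is carved out.
for hole in (0x40800000, 0x20000000):
    lowmem = (1 << 32) - hole
    print(f"pci_hole_min_size={hole:#x} -> lowmem ends at {lowmem:#x}")
```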

     -Don Slutz

>   -George


Thread overview: 13+ messages
2014-02-28 20:15 [RFC PATCH 0/1] Add pci_hole_min_size Don Slutz
2014-02-28 20:15 ` [RFC PATCH 1/1] " Don Slutz
2014-02-28 22:07   ` Boris Ostrovsky
2014-03-03 15:30     ` Don Slutz
2014-03-03 16:07       ` Boris Ostrovsky
2014-03-03 20:43         ` Don Slutz
2014-03-03 22:54           ` Boris Ostrovsky
2014-03-04 13:25   ` George Dunlap
2014-03-04 18:57     ` Don Slutz
2014-03-07 19:28       ` Konrad Rzeszutek Wilk
2014-03-11 12:54         ` George Dunlap
2014-03-11 17:16           ` Don Slutz [this message]
  -- strict thread matches above, loose matches on Subject: below --
2014-03-11 17:01 [RFC PATCH 0/1] " Don Slutz
2014-03-11 17:01 ` [RFC PATCH 1/1] " Don Slutz
