From: Konrad Rzeszutek Wilk <konrad@darnok.org>
To: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: xen-devel <xen-devel@lists.xensource.com>
Subject: Re: memory map issues with PV PCI passthrough
Date: Wed, 14 Dec 2011 17:53:37 -0400
Message-ID: <20111214215337.GE25896@andromeda.dapyr.net>
In-Reply-To: <4EE0CC19.8040905@tycho.nsa.gov>

On Thu, Dec 08, 2011 at 09:39:21AM -0500, Daniel De Graaf wrote:
> I have a system with several reserved ranges low in the e820 map which
> cause problems when starting PV domains with PCI devices. The machine
> memory map looks like:
> 
> (XEN)  0000000000000000 - 0000000000060000 (usable)
> (XEN)  0000000000060000 - 0000000000068000 (reserved)
> (XEN)  0000000000068000 - 000000000009ac00 (usable)
> (XEN)  000000000009ac00 - 00000000000a0000 (reserved)
> (XEN)  00000000000e0000 - 0000000000100000 (reserved)
> (XEN)  0000000000100000 - 0000000000800000 (usable)
> (XEN)  0000000000800000 - 000000000087d000 (unusable)
> (XEN)  000000000087d000 - 0000000000f00000 (usable)
> (XEN)  0000000000f00000 - 0000000001000000 (reserved)
> (XEN)  0000000001000000 - 0000000020000000 (usable)
> (XEN)  0000000020000000 - 0000000020200000 (reserved)
> (XEN)  0000000020200000 - 0000000040000000 (usable)
> (XEN)  0000000040000000 - 0000000040200000 (reserved)
> (XEN)  0000000040200000 - 00000000c95d6000 (usable)
> (XEN)  00000000c95d6000 - 00000000c961a000 (reserved)
> (XEN)  00000000c961a000 - 00000000c99b7000 (usable)
> (XEN)  00000000c99b7000 - 00000000c99e7000 (reserved)
> (XEN)  00000000c99e7000 - 00000000c9be7000 (ACPI NVS)
> (XEN)  00000000c9be7000 - 00000000c9bff000 (ACPI data)
> (XEN)  00000000c9bff000 - 00000000c9c00000 (usable)
> (XEN)  00000000c9f00000 - 00000000ca000000 (reserved)
> (XEN)  00000000cb000000 - 00000000cf200000 (reserved)
> (XEN)  00000000fed1c000 - 00000000fed30000 (reserved)
> (XEN)  00000000ffc00000 - 00000000ffc20000 (reserved)
> (XEN)  0000000100000000 - 000000042e000000 (usable)
> 
> When e820_sanitize is called on this memory map to create a PV domain, the
> resulting map has only one usable region (0-0xf00000) below 4GB, and Linux
> will not boot with this memory map.

OK, that looks like a bug. We could modify e820_sanitize in libxl to
use the host's E820 and fill in the (usable) regions with the memory
that is allocated for the guest, instead of allocating a big chunk at
the start and then working out the other regions.
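
To make the idea concrete, here is a rough sketch (in Python for brevity,
not the libxl C code) of walking the host map in order and handing the
guest's memory out across the usable regions. The function name, the tuple
layout and the 768 MiB guest size are all made up for illustration, and the
host map is a simplified prefix of the one quoted above.

```python
# Hypothetical sketch, not libxl code. Entries are (start, end, type)
# tuples with an exclusive end address.
RAM, RESERVED, UNUSABLE = "usable", "reserved", "unusable"

def fill_from_host(host_map, guest_bytes):
    """Hand out guest_bytes of RAM across the host's usable regions, in order."""
    out = []
    left = guest_bytes
    for start, end, kind in host_map:
        if kind != RAM:
            out.append((start, end, kind))      # keep non-RAM entries unchanged
            continue
        take = min(left, end - start)
        if take:
            out.append((start, start + take, RAM))
            left -= take
        if start + take < end:
            # host RAM that is not backed by guest memory: mark it unusable
            out.append((start + take, end, UNUSABLE))
    return out

host = [
    (0x00000000, 0x00f00000, RAM),       # simplified: the small low holes are omitted
    (0x00f00000, 0x01000000, RESERVED),
    (0x01000000, 0x20000000, RAM),
    (0x20000000, 0x20200000, RESERVED),
    (0x20200000, 0x40000000, RAM),
]
guest = fill_from_host(host, 0x30000000)  # hypothetical 768 MiB guest

for start, end, kind in guest:
    print(f"{start:#012x} - {end:#012x} ({kind})")
```

With this approach the reserved entries keep their host addresses and the
guest's RAM simply flows around them, so there can be more than one usable
region below 4GB.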

> 
> I have a patch that reworks e820_sanitize to include later RAM regions as
> valid RAM, which works as long as the domain being booted has permission
> to map the PFNs from 0x20000-0x20200 and 0x40000-0x40200. If the domain is
> not given this permission (the default, since these regions are not part of
> the PCI device being passed to the guest) then the hypervisor crashes the
> domain when it attempts to map these regions (during init_memory_mapping).

Hmm, so it tries to map (reserved) regions? That seems rather odd.
I am pretty sure it worked for me. What Linux kernel did you use and
can you send the guest config file as well please?
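
For reference, the PFN ranges quoted above line up with the two 2 MiB
(reserved) entries in the host map. A quick check, assuming 4 KiB pages
(the helper name is just for illustration):

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages, as on x86

def pfn_range_to_addr(first_pfn, end_pfn):
    """Convert a PFN range to the byte-address range it covers."""
    return first_pfn << PAGE_SHIFT, end_pfn << PAGE_SHIFT

for first, end in [(0x20000, 0x20200), (0x40000, 0x40200)]:
    lo, hi = pfn_range_to_addr(first, end)
    print(f"PFN {first:#x}-{end:#x} -> {lo:#010x} - {hi:#010x}")
```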
> 
> The domain will boot when these regions are not marked as reserved in the
> e820 map or when the PFNs 0x20200-0x40000 and 0x40200-0xc95d6 are marked
> as unusable. However, it is difficult to make this happen in any general
> case without knowing what reserved regions actually need to be marked as
> reserved in the guest.
> 
> If PCI hot-add is not needed, the problem becomes simpler: the PCI regions
> for assigned devices can be included in the e820 map and other regions can
> be ignored (marking as RAM so that the guest does not attempt direct map).
> 
> Any suggestions on the best way to resolve this?

I think the fix is to rework e820_allocate to just use the E820 from the
host and convert the RAM regions that are above map_limitkb to
(unusable), and then apply the other logic in e820_allocate to
convert gaps to (unusable).

But I would think that "libxl: Convert E820_UNUSABLE and E820_RAM to
E820_UNUSABLE as appropriate" already takes care of this?
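
The two conversions I have in mind could look roughly like the sketch
below (Python for brevity, not the libxl C code). It treats the limit as
an address, which is only one way to read map_limitkb, and the function
name, tuple layout and the 0x80000000 limit are made up for illustration.
The sample entries are the last three below 0xd0000000 in the host map.

```python
# Hypothetical sketch, not libxl code. Entries are sorted, non-overlapping
# (start, end, type) tuples with an exclusive end address.
RAM, RESERVED, UNUSABLE = "usable", "reserved", "unusable"

def clip_and_fill(entries, limit):
    # Step 1: RAM above the limit becomes unusable; a region that
    # straddles the limit is split at the limit.
    clipped = []
    for start, end, kind in entries:
        if kind == RAM and end > limit:
            if start < limit:
                clipped.append((start, limit, RAM))
            clipped.append((max(start, limit), end, UNUSABLE))
        else:
            clipped.append((start, end, kind))

    # Step 2: any hole between consecutive entries becomes unusable.
    out = []
    prev_end = clipped[0][0] if clipped else 0
    for start, end, kind in clipped:
        if start > prev_end:
            out.append((prev_end, start, UNUSABLE))
        out.append((start, end, kind))
        prev_end = end
    return out

tail = [
    (0xc9bff000, 0xc9c00000, RAM),
    (0xc9f00000, 0xca000000, RESERVED),
    (0xcb000000, 0xcf200000, RESERVED),
]
for start, end, kind in clip_and_fill(tail, 0x80000000):
    print(f"{start:#012x} - {end:#012x} ({kind})")
```

Adjacent (unusable) entries are not merged here; a real implementation
would probably want to coalesce them.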
> 
> -- 
> Daniel De Graaf
> National Security Agency
