From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
"keir.fraser@eu.citrix.com" <keir.fraser@eu.citrix.com>
Subject: Re: [PATCH 4 of 5] libxl: Add support for passing in the machine's E820 for PCI passthrough
Date: Fri, 8 Apr 2011 09:33:35 -0400 [thread overview]
Message-ID: <20110408133335.GE6189@dumpdata.com> (raw)
In-Reply-To: <1302251805.27835.61.camel@zakaz.uk.xensource.com>
On Fri, Apr 08, 2011 at 09:36:45AM +0100, Ian Campbell wrote:
> On Thu, 2011-04-07 at 21:25 +0100, Konrad Rzeszutek Wilk wrote:
> > # HG changeset patch
> > # User Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > # Date 1302206363 14400
> > # Node ID 2e464234c94cfd29a98a9011a46d76846b87f7f8
> > # Parent 546d8a03d5cbe0ceddadf701174f2417a0b72891
> > libxl: Add support for passing in the machine's E820 for PCI passthrough.
> >
> > The code (libxl_e820_alloc) calls the xc_get_machine_memory_map to
> > retrieve the systems E820. Then the E820 is sanitized to weed out E820
> > entries below 16MB, and as well remove any E820_RAM or E820_UNUSED
> > regions as the guest does not need to know abou them. The guest
> > only needs the E820_ACPI, E820_NVS, E820_RESERVED to get an idea of
> > where the PCI I/O space is. Mostly.. The Linux kernel assumes that any
> > gap in the E820 is considered PCI I/O space which means that if we pass
> > in the guest 2GB, and the E820_ACPI, and its friend start at 3GB, the
> > gap between 2GB and 3GB will be considered as PCI I/O space. To guard against
> > that we also create an E820_UNUSABLE between the region of 'target_kb'
> > (called ram_end in the code) up to the first E820_[ACPI,NVS,RESERVED] region.
> >
> > Memory that is slack or for balloon (so 'maxmem' in guest configuration)
> > is put behind the machine E820. Which in most cases is after the 4GB.
> >
> > The reason for doing the fetching of the E820 using the hypercall in
> > the toolstack (instead of the guest doing it) is that when a Linux
> > guest would do a hypercall to 'XENMEM_machine_memory_map' it would
> > retrieve an E820 with I/O range caps added in. Meaning that the
> > region after 4GB up to end of possible memory would be marked as unusuable
> > and the Linux kernel would not have any space to allocate a balloon
> > region.
> >
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> >
> > diff -r 546d8a03d5cb -r 2e464234c94c tools/libxl/libxl.h
> > --- a/tools/libxl/libxl.h Thu Apr 07 15:18:46 2011 -0400
> > +++ b/tools/libxl/libxl.h Thu Apr 07 15:59:23 2011 -0400
> > @@ -204,6 +204,14 @@
> > } libxl_file_reference;
> > void libxl_file_reference_destroy(libxl_file_reference *p);
> >
> > +#include <xc_e820.h>
>
> We are trying to avoid exposing libxc headers to libxl users. struct
> e820entry is well defined and unchanging so I don't think there is much
> harm in duplicating it.
OK.
>
> > +typedef struct {
> > + uint32_t nr_entries;
> > + struct e820entry *entry;
> > +} libxl_e820;
>
> BTW does the e820entry array need to be packed at some point before
> getting returned to the guest? (I'm not sure what the expected alignment
The struct e820entry has the __packed__ attribute on it.
> is, nor how a 20 byte struct containing two u64s and a u32 naturally
> aligns on x86_64).
It seems to work OK.. (I booted a 32-bit guest under a 64-bit hypervisor
with a 64-bit dom0).
>
> > diff -r 546d8a03d5cb -r 2e464234c94c tools/libxl/libxl.idl
> > --- a/tools/libxl/libxl.idl Thu Apr 07 15:18:46 2011 -0400
> > +++ b/tools/libxl/libxl.idl Thu Apr 07 15:59:23 2011 -0400
> > @@ -22,6 +22,7 @@
> >
> > libxl_hwcap = Builtin("hwcap")
> >
> > +libxl_e820 = Builtin("e820", destructor_fn="libxl_e820_destroy", passby=PASS_BY_REFERENCE)
>
> Hmmm, We are starting to collect a lot of builtins of the form {len,
> pointer}. Time to start thinking of a new type class abstraction, I
> think... (not your problem in this series though)
>
> > diff -r 546d8a03d5cb -r 2e464234c94c tools/libxl/libxl_pci.c
> > --- a/tools/libxl/libxl_pci.c Thu Apr 07 15:18:46 2011 -0400
> > +++ b/tools/libxl/libxl_pci.c Thu Apr 07 15:59:23 2011 -0400
> > @@ -37,6 +37,8 @@
> > #include "libxl_internal.h"
> > #include "flexarray.h"
> >
> > +#include <xc_e820.h>
> > +
> > #define PCI_BDF "%04x:%02x:%02x.%01x"
> > #define PCI_BDF_SHORT "%02x:%02x.%01x"
> > #define PCI_BDF_VDEVFN "%04x:%02x:%02x.%01x@%02x"
> > @@ -1047,3 +1049,154 @@
> > free(pcidevs);
> > return 0;
> > }
> > +
> > +#define E820MAX (128)
>
> Should be somewhere public for the benefit of libxl users who want to
> setup an e820? Even if it should be private it should be in a .h.
Aye aye.
>
> Are we expecting that libxl users might want to modify the e820? If not
> then why expose libxl_e820_alloc and libxl_e820_sanitize at all, just
> add a flag to the libxl interface which says whether or not to provide a
> host-derived e820.
OK. Where should I put the definitions of these two functions? (They are
called from libxl_dom.c and xl_cmdimpl.c) Is there a internal header file?
>
> If users are expected to modify it then I'm in two minds about doing the
> sanitize step at domain build time instead of in libxl_e820_alloc. If we
The issue I had was that to construct the e820, I needed the
'target_kb' and 'max_memkb' and 'slack_memkb'. If I can get those two values
when the guest config file is parsed (so xl_cmdimpl.c) I think moving
the sanitization in there (and not doing it during domain build time)
is the right thing to do. Let me see if I can do that.
> provide a default sanitized e820 to the caller and they want to modify
> it to be non-santized should we let them? I guess the question here is
> whether we are sanitizing actually invalid e820 maps or if we are also
> enforcing some higher level policy based on a specific class Linux
> guest's requirements.
Which might be quite different if the guest is a BSD one. I guess
it is time to start launching those NetBSD guests :-)
next prev parent reply other threads:[~2011-04-08 13:33 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-04-07 20:25 [PATCH 0 of 5] Patches for PCI passthrough with modified E820 Konrad Rzeszutek Wilk
2011-04-07 20:25 ` [PATCH 1 of 5] tools: Add xc_domain_set_memory_map and xc_get_machine_memory_map calls Konrad Rzeszutek Wilk
2011-04-08 8:18 ` Ian Campbell
2011-04-08 13:19 ` Konrad Rzeszutek Wilk
2011-04-07 20:25 ` [PATCH 2 of 5] x86: make the pv-only e820 array be dynamic Konrad Rzeszutek Wilk
2011-04-08 8:22 ` Ian Campbell
2011-04-08 13:21 ` Konrad Rzeszutek Wilk
2011-04-07 20:25 ` [PATCH 3 of 5] x86: adjust the size of the e820 for pv guest to " Konrad Rzeszutek Wilk
2011-04-07 20:25 ` [PATCH 4 of 5] libxl: Add support for passing in the machine's E820 for PCI passthrough Konrad Rzeszutek Wilk
2011-04-08 8:36 ` Ian Campbell
2011-04-08 10:56 ` Ian Jackson
2011-04-08 13:35 ` Konrad Rzeszutek Wilk
2011-04-08 13:55 ` Ian Campbell
2011-04-08 14:09 ` Tim Deegan
2011-04-08 14:17 ` Ian Campbell
2011-04-08 14:25 ` Tim Deegan
2011-04-08 14:33 ` Ian Campbell
2011-04-08 15:00 ` Konrad Rzeszutek Wilk
2011-04-08 14:34 ` Konrad Rzeszutek Wilk
2011-04-08 14:42 ` Ian Campbell
2011-04-08 14:54 ` Konrad Rzeszutek Wilk
2011-04-08 16:01 ` Ian Jackson
2011-04-08 13:33 ` Konrad Rzeszutek Wilk [this message]
2011-04-08 14:00 ` Ian Campbell
2011-04-07 20:25 ` [PATCH 5 of 5] libxl: Convert E820_UNUSABLE and E820_RAM to E820_UNUSABLE as appropriate Konrad Rzeszutek Wilk
2011-04-08 8:42 ` [PATCH 0 of 5] Patches for PCI passthrough with modified E820 Ian Campbell
2011-04-08 13:24 ` Konrad Rzeszutek Wilk
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110408133335.GE6189@dumpdata.com \
--to=konrad.wilk@oracle.com \
--cc=Ian.Campbell@citrix.com \
--cc=keir.fraser@eu.citrix.com \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.