xen-devel.lists.xenproject.org archive mirror
From: Gordan Bobic <gordan@bobich.net>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: xen-devel@lists.xen.org
Subject: HVM support for e820_host (Was: Bug: Limitation of <=2GB RAM in domU persists with 4.3.0)
Date: Tue, 03 Sep 2013 20:47:09 +0100
Message-ID: <52263CBD.1090402@bobich.net>
In-Reply-To: <20130903145934.GC1487@konrad-lan.dumpdata.com>

[-- Attachment #1: Type: text/plain, Size: 3693 bytes --]

On 09/03/2013 03:59 PM, Konrad Rzeszutek Wilk wrote:

>>>> 2) Further, I'm finding myself motivated to write that
>>>> auto-set (as opposed to hard coded) vBAR=pBAR patch discussed
>>>> briefly a week or so ago (have an init script read the BAR
>>>> info from dom0 and put it in xenstore, plus a patch to
>>>> have the pBAR=vBAR reservations built dynamically rather than
>>>> statically, based on this data). Now, I'm quite fluent in C,
>>>> but my familiarity with Xen source code is nearly non-existent
>>>> (limited to studying an old unsupported patch every now and then
>>>> in order to make it apply to a more recent code release).
>>>> Can anyone help me out with a high level view WRT where
>>>> this would be best plumbed in (which files and the flow of
>>>> control between the affected files)?
>>>
>>> hvmloader probably and the libxl e820 code. From a high-level
>>> view, what needs to happen is:
>>> 1). Need to relax the check in libxl for e820_host
>>>     to also do it for HVM guests. Said code just iterates over the
>>>     host E820 and sanitizes it a bit and makes an E820 hypercall to
>>>     set it for the guest.
[snip]

OK, I have attached a preliminary patch against 4.3.0 for the libxl
part. It compiles and my packages build, but I haven't run it yet to
see whether it actually works.

Please let me know if I've missed anything. On its own, I don't think
this patch will do much (apart from maybe breaking HVM guests with
e820_host=1 set).
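
For reference, the libxl step described in the quoted text (walk the
host E820, trim it to the guest's size, hand the result to the guest)
can be sketched roughly as below. This is a simplified, standalone
illustration and not libxl's e820_sanitize(): struct region, the
REGION_* constants and clip_map_to_guest() are names made up for the
example. It only truncates RAM entries to a byte budget and copies
non-RAM entries through unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified types: these are NOT the libxl definitions. */
enum region_type { REGION_RAM = 1, REGION_RESERVED = 2, REGION_ACPI = 3 };

struct region {
    uint64_t addr;          /* start address of the region */
    uint64_t size;          /* length in bytes */
    enum region_type type;
};

/*
 * Copy a host-style memory map into dst so that the guest sees at most
 * ram_bytes of RAM. RAM entries are consumed in order until the budget
 * runs out; the last one used is truncated and later RAM entries are
 * dropped. Non-RAM entries (reserved, ACPI) are copied unchanged so the
 * guest keeps the host's holes. Returns the number of entries written.
 */
size_t clip_map_to_guest(const struct region *src, size_t nr_src,
                         struct region *dst, size_t dst_cap,
                         uint64_t ram_bytes)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < nr_src && n < dst_cap; i++) {
        struct region r = src[i];

        if (r.type == REGION_RAM) {
            if (ram_bytes == 0)
                continue;               /* guest RAM budget exhausted */
            if (r.size > ram_bytes)
                r.size = ram_bytes;     /* truncate the last RAM chunk */
            ram_bytes -= r.size;
        }
        dst[n++] = r;
    }
    return n;
}
```

With a host map of [RAM, reserved, RAM, reserved] and a budget that
covers the first RAM entry plus 16MB, all four entries are kept and the
second RAM entry is shortened to 16MB.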

>>> 2). Figure out whether the E820 hypercall (which sets the E820
>>>     layout for a guest) can be run on HVM guests. I think it
>>>     could not and Mukesh in his PVH patches posted a patch
>>>     to enable that - "..Move e820 fields out of pv_domain struct"

Is this already in 4.3.0 or is this an out-of-tree patch? Do you have a 
link to it handy?

>>> 3). Hvmloader should do an E820 get machine memory hypercall
>>>     to see if there is anything there. If there is - that means
>>>     the toolstack has requested a "new" type of E820. Iterate
>>>     over the E820 and make it look like that.
>>>     You can look in the Linux arch/x86/xen/setup.c to see how
>>>     it does that.
>>>
>>>     The complication there is that hvmloader needs to fit the
>>>     ACPI code (the guest type one) and such.
>>>     Presumably you can just re-use the existing spaces that
>>>     the host has marked as E820_RESERVED or E820_ACPI..
>>
>> Yup, I get it. Not only that, but it should also ideally (not
>> strictly necessary, but it'd be handy) map the IOMEM for devices
>> it is passed so that pBAR=vBAR (as opposed to just leaving all
>> the host e820 reserved areas well alone - which would work for
>> most things).
>
> Yes. That is an extra complication that could be done in subsequent
> patches. But in theory if you have the E820 mirrored from the host the
> pBAR=vBAR should be easy enough as the values from the host BARs can
> easily fit in the E820 gaps.

Agreed. Let's leave the pBAR=vBAR part for a separate patch set. I'll 
have to figure out a sensible way to query the IOMEM regions for each of 
the devices passed to the VM and make sure they are in the same hole.
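
As a rough idea of the "same hole" check, assuming the guest map is
available as a sorted list of non-overlapping occupied ranges, something
like the following would do. It is only a sketch: struct span,
hole_index_for() and bars_share_one_hole() are hypothetical names, not
anything in Xen or hvmloader, and obtaining the BAR ranges themselves is
left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [start, end). Hypothetical helper type. */
struct span {
    uint64_t start;
    uint64_t end;
};

/*
 * used[] holds the occupied ranges of the memory map, sorted by address
 * and non-overlapping. Hole i is the gap between used[i] and used[i+1].
 * Returns the index of the hole that fully contains bar, or -1 if the
 * BAR overlaps an occupied range or lies outside every gap.
 */
int hole_index_for(const struct span *used, size_t nr_used, struct span bar)
{
    assert(used != NULL);

    for (size_t i = 0; i + 1 < nr_used; i++) {
        if (used[i].end <= bar.start && bar.end <= used[i + 1].start)
            return (int)i;
    }
    return -1;
}

/* True if there is at least one BAR and all BARs sit in the same hole. */
bool bars_share_one_hole(const struct span *used, size_t nr_used,
                         const struct span *bars, size_t nr_bars)
{
    int first = -1;

    for (size_t i = 0; i < nr_bars; i++) {
        int h = hole_index_for(used, nr_used, bars[i]);

        if (h < 0)
            return false;       /* BAR does not fit in any gap */
        if (i == 0)
            first = h;
        else if (h != first)
            return false;       /* BARs are spread over several gaps */
    }
    return nr_bars > 0;
}
```

A BAR that straddles the end of a RAM range gets -1 here, which is the
case where pBAR=vBAR cannot be honoured without changing the map.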

>>>     Then there is the SMBIOS, which would need to move, and the BIOS
>>>     might need to be relocated - but I think those are relocatable
>>>     in some form.

[bit above left for later reference]

>>> Well, I am more than happy to help you with this.
>>
>> Thanks, much appreciated. :)
>
> Yeeey! Vict^H^H^H^Hvolunteer :-)! <maniacal laughter in the background>
>
> I am also reachable on IRC (FreeNode mostly) as either darnok or konrad
> if that would be more convenient to discuss this.

Thanks. I'll keep that in mind. :)

Gordan

[-- Attachment #2: xen-hvm-libxl-e820_host.patch --]
[-- Type: text/plain, Size: 5596 bytes --]

--- xen-4.3.0/tools/libxl/libxl_create.c.orig	2013-09-03 14:26:47.478350269 +0100
+++ xen-4.3.0/tools/libxl/libxl_create.c	2013-09-03 14:45:26.710553063 +0100
@@ -208,6 +208,8 @@
 
     libxl_defbool_setdefault(&b_info->disable_migrate, false);
 
+    libxl_defbool_setdefault(&b_info->e820_host, false);
+
     switch (b_info->type) {
     case LIBXL_DOMAIN_TYPE_HVM:
         if (b_info->shadow_memkb == LIBXL_MEMKB_DEFAULT)
@@ -280,7 +282,6 @@
 
         break;
     case LIBXL_DOMAIN_TYPE_PV:
-        libxl_defbool_setdefault(&b_info->u.pv.e820_host, false);
         if (b_info->shadow_memkb == LIBXL_MEMKB_DEFAULT)
             b_info->shadow_memkb = 0;
         if (b_info->u.pv.slack_memkb == LIBXL_MEMKB_DEFAULT)
--- xen-4.3.0/tools/libxl/libxl_types.idl.orig	2013-09-03 14:16:48.462767589 +0100
+++ xen-4.3.0/tools/libxl/libxl_types.idl	2013-09-03 14:18:19.624028024 +0100
@@ -295,6 +295,8 @@
     ("irqs",             Array(uint32, "num_irqs")),
     ("iomem",            Array(libxl_iomem_range, "num_iomem")),
     ("claim_mode",	     libxl_defbool),
+    # Use host's E820 for PCI passthrough.
+    ("e820_host",        libxl_defbool),
     ("u", KeyedUnion(None, libxl_domain_type, "type",
                 [("hvm", Struct(None, [("firmware",         string),
                                        ("bios",             libxl_bios_type),
@@ -340,8 +342,6 @@
                                       ("cmdline", string),
                                       ("ramdisk", string),
                                       ("features", string, {'const': True}),
-                                      # Use host's E820 for PCI passthrough.
-                                      ("e820_host", libxl_defbool),
                                       ])),
                  ("invalid", Struct(None, [])),
                  ], keyvar_init_val = "LIBXL_DOMAIN_TYPE_INVALID")),
--- xen-4.3.0/tools/libxl/libxl_x86.c.orig	2013-09-03 14:26:36.093566315 +0100
+++ xen-4.3.0/tools/libxl/libxl_x86.c	2013-09-03 16:52:24.648701260 +0100
@@ -216,11 +216,8 @@
     struct e820entry map[E820MAX];
     libxl_domain_build_info *b_info;
 
-    if (d_config == NULL || d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM)
-        return ERROR_INVAL;
-
     b_info = &d_config->b_info;
-    if (!libxl_defbool_val(b_info->u.pv.e820_host))
+    if (!libxl_defbool_val(b_info->e820_host))
         return ERROR_INVAL;
 
     rc = xc_get_machine_memory_map(ctx->xch, map, E820MAX);
@@ -229,9 +226,15 @@
         return ERROR_FAIL;
     }
     nr = rc;
-    rc = e820_sanitize(ctx, map, &nr, b_info->target_memkb,
-                       (b_info->max_memkb - b_info->target_memkb) +
-                       b_info->u.pv.slack_memkb);
+    if (d_config == NULL || d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM) {
+        rc = e820_sanitize(ctx, map, &nr, b_info->target_memkb,
+                           (b_info->max_memkb - b_info->target_memkb));
+    } else if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_PV) {
+        rc = e820_sanitize(ctx, map, &nr, b_info->target_memkb,
+                           (b_info->max_memkb - b_info->target_memkb) +
+                           b_info->u.pv.slack_memkb);
+    }
+
     if (rc)
         return ERROR_FAIL;
 
@@ -296,8 +299,7 @@
         xc_shadow_control(ctx->xch, domid, XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION, NULL, 0, &shadow, 0, NULL);
     }
 
-    if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_PV &&
-            libxl_defbool_val(d_config->b_info.u.pv.e820_host)) {
+    if (libxl_defbool_val(d_config->b_info.e820_host)) {
         ret = libxl__e820_alloc(gc, domid, d_config);
         if (ret) {
             LIBXL__LOG_ERRNO(gc->owner, LIBXL__LOG_ERROR,
--- xen-4.3.0/tools/libxl/xl_cmdimpl.c.orig	2013-09-03 14:26:54.524214804 +0100
+++ xen-4.3.0/tools/libxl/xl_cmdimpl.c	2013-09-03 14:47:11.811612562 +0100
@@ -1274,11 +1274,7 @@
     if (!xlu_cfg_get_long (config, "pci_permissive", &l, 0))
         pci_permissive = l;
 
-    /* To be reworked (automatically enabled) once the auto ballooning
-     * after guest starts is done (with PCI devices passed in). */
-    if (c_info->type == LIBXL_DOMAIN_TYPE_PV) {
-        xlu_cfg_get_defbool(config, "e820_host", &b_info->u.pv.e820_host, 0);
-    }
+    xlu_cfg_get_defbool(config, "e820_host", &b_info->e820_host, 0);
 
     if (!xlu_cfg_get_list (config, "pci", &pcis, 0, 0)) {
         d_config->num_pcidevs = 0;
@@ -1296,8 +1292,8 @@
             if (!xlu_pci_parse_bdf(config, pcidev, buf))
                 d_config->num_pcidevs++;
         }
-        if (d_config->num_pcidevs && c_info->type == LIBXL_DOMAIN_TYPE_PV)
-            libxl_defbool_set(&b_info->u.pv.e820_host, true);
+        if (d_config->num_pcidevs)
+            libxl_defbool_set(&b_info->e820_host, true);
     }
 
     switch (xlu_cfg_get_list(config, "cpuid", &cpuids, 0, 1)) {
--- xen-4.3.0/tools/libxl/xl_sxp.c.orig	2013-09-03 14:25:37.839675572 +0100
+++ xen-4.3.0/tools/libxl/xl_sxp.c	2013-09-03 14:22:13.953561029 +0100
@@ -87,6 +87,10 @@
         }
     }
 
+    printf("\t(e820_host %s)\n",
+           libxl_defbool_to_string(b_info->e820_host));
+
+
     printf("\t(image\n");
     switch (c_info->type) {
     case LIBXL_DOMAIN_TYPE_HVM:
@@ -150,8 +154,6 @@
         printf("\t\t\t(kernel %s)\n", b_info->u.pv.kernel);
         printf("\t\t\t(cmdline %s)\n", b_info->u.pv.cmdline);
         printf("\t\t\t(ramdisk %s)\n", b_info->u.pv.ramdisk);
-        printf("\t\t\t(e820_host %s)\n",
-               libxl_defbool_to_string(b_info->u.pv.e820_host));
         printf("\t\t)\n");
         break;
     default:
