xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Ian Campbell <Ian.Campbell@eu.citrix.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>
Subject: Re: Re: [PATCH 0 of 3] Patches for PCI passthrough with modified E820 (v3) - resent.
Date: Tue, 17 May 2011 12:34:10 -0400	[thread overview]
Message-ID: <20110517163410.GA28286@dumpdata.com> (raw)
In-Reply-To: <1305648479.20907.77.camel@zakaz.uk.xensource.com>

On Tue, May 17, 2011 at 05:07:59PM +0100, Ian Campbell wrote:
> On Tue, 2011-05-17 at 17:02 +0100, Konrad Rzeszutek Wilk wrote:
> > On Fri, May 13, 2011 at 09:57:08AM -0400, Konrad Rzeszutek Wilk wrote:
> > > > > memhog 4G worked great.. but then I noticed it started slowing down and
> > > > > it was using the swap disk?
> > > > 
> > > > I guess the I/O holes shadowed the RAM and hence it is basically wasted.
> > > 
> > > <nods>
> > > > > Anyhow, seems that if you are using RHEL5, SLES11, you need to be carefull to
> > > > > use 'memory' and 'maxmem'.
> > > > 
> > > > Hrm, changing behaviour for existing guests isn't so nice, at least not
> > > > without a way to turn the behaviour off, perhaps we do need an explicit
> > > > cfg file variable to control this after all?
> > > 
> > > We could do that, and then once your idea below has been completly working
> > > we can rip out the parameter?
> > 
> > How does this patch look to your eyes:
> 
> Looks ok to me.
> 
> We've been using the _override suffix for the cfg visible symbol, not
> just the internal variables, so if we think this is something the user
> typically should not touch then we should call it e820_host_override in
> the cfg file too. Although see my earlier comment about this option also

Ok.
> enabling hotplug -- perhaps this is an option user will want to care
> about in the long run?

In which case we should decide on a good name since it will stay with us
forever. Perhaps just e820_host and drop the override? And do something like this:

# HG changeset patch
# Parent c6fa04014d6e99ca4e62d04132180338403c0478
libxl: Add 'e820_host' option to config file.

.. which will allow PV guests to see the host's E820. Previously
this was latched of the config having an entry in the 'pci' option.
But during testing of the patches which provide a host E820 in a PV guest,
certain inconsistencies were found with guests. When launching a RHEL5 or
SLES11 PV guest with 4GB and a PCI device, the kernel would report 4GB,
but have 1.5G "used". What happend was that the P2M that fall within the
E820 I/O holes would never be used and was just wasted. The mechanism to
go around this is to shrink the size of the guest
before launch (say memory=2048, maxmem=4096) and then balloon back to 4096M
after start. For PVOPS type kernels it would detect the E820 I/O holes and
deflate by the correct amount but would not inflate back to 4GB.
Manually inflating makes it work.

The fix in the future for guests where the memory amount flows over the
PCI hole, is to launch the guest with decreased amount right up to the cusp
of where the E820 PCI hole starts. Also increase the 'maxmem' by the delta
and then when the guest has launched, balloon up to the delta number.

This will require some careful surgery so for right now this parameter
will guard against unsuspecting users seeing their PV guests memory "vanish."

In the future, this option will remain (so that PCI hotplugging
can be done), and turn itself on when there is a 'pci' option.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

diff -r c6fa04014d6e tools/libxl/xl_cmdimpl.c
--- a/tools/libxl/xl_cmdimpl.c	Tue May 17 10:33:27 2011 -0400
+++ b/tools/libxl/xl_cmdimpl.c	Tue May 17 12:30:38 2011 -0400
@@ -979,6 +979,16 @@ skip_vfb:
     if (!xlu_cfg_get_long (config, "pci_power_mgmt", &l))
         pci_power_mgmt = l;
 
+    /* To be reworked (automatically enabled) once the auto ballooning
+     * after guest starts is done (with PCI devices passed in). */
+    if (!xlu_cfg_get_long (config, "e820_host", &l)) {
+        if (c_info->hvm)
+          fprintf(stderr, "Can't do e820_host in HVM mode!");
+        else {
+          if (l)
+            b_info->u.pv.machine_e820 = true;
+        }
+    }
     if (!xlu_cfg_get_list (config, "pci", &pcis, 0, 0)) {
         int i;
         d_config->num_pcidevs = 0;
@@ -995,8 +1005,6 @@ skip_vfb:
             if (!libxl_device_pci_parse_bdf(ctx, pcidev, buf))
                 d_config->num_pcidevs++;
         }
-        if (d_config->num_pcidevs && !c_info->hvm)
-          b_info->u.pv.machine_e820 = true;
     }
 
     switch (xlu_cfg_get_list(config, "cpuid", &cpuids, 0, 1)) {

      reply	other threads:[~2011-05-17 16:34 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-04 14:17 [PATCH 0 of 3] Patches for PCI passthrough with modified E820 (v3) - resent Konrad Rzeszutek Wilk
2011-05-04 14:17 ` [PATCH 1 of 3] tools: Add xc_domain_set_memory_map and xc_get_machine_memory_map calls (x86, amd64 only) Konrad Rzeszutek Wilk
2011-05-04 14:17 ` [PATCH 2 of 3] libxl: Add support for passing in the machine's E820 for PCI passthrough Konrad Rzeszutek Wilk
2011-05-10  8:29   ` Ian Campbell
2011-05-10 14:56     ` Konrad Rzeszutek Wilk
2011-05-17 15:10   ` Ian Jackson
2011-05-17 16:00     ` Konrad Rzeszutek Wilk
2011-05-17 16:05       ` Ian Campbell
2011-05-04 14:17 ` [PATCH 3 of 3] libxl: Convert E820_UNUSABLE and E820_RAM to E820_UNUSABLE as appropriate Konrad Rzeszutek Wilk
2011-05-09  9:00 ` [PATCH 0 of 3] Patches for PCI passthrough with modified E820 (v3) - resent Ian Campbell
2011-05-09 21:01   ` Konrad Rzeszutek Wilk
2011-05-10  8:30     ` Ian Campbell
2011-05-10 14:53       ` Konrad Rzeszutek Wilk
2011-05-10 15:09         ` Ian Campbell
2011-05-10 15:27           ` Konrad Rzeszutek Wilk
2011-05-10 15:32             ` Ian Campbell
2011-05-10 15:51               ` Konrad Rzeszutek Wilk
2011-05-11  7:49                 ` Ian Campbell
2011-05-12 17:41                   ` Konrad Rzeszutek Wilk
2011-05-13  8:47                     ` Ian Campbell
2011-05-13 13:57                       ` Konrad Rzeszutek Wilk
2011-05-17 16:02                         ` Konrad Rzeszutek Wilk
2011-05-17 16:07                           ` Ian Campbell
2011-05-17 16:34                             ` Konrad Rzeszutek Wilk [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110517163410.GA28286@dumpdata.com \
    --to=konrad.wilk@oracle.com \
    --cc=Ian.Campbell@eu.citrix.com \
    --cc=Ian.Jackson@eu.citrix.com \
    --cc=Stefano.Stabellini@eu.citrix.com \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).