From mboxrd@z Thu Jan  1 00:00:00 1970
From: George Dunlap <george.dunlap@eu.citrix.com>
Subject: Re: [PATCH] PoD: Handle operations properly when domain
	is dying
Date: Wed, 11 Nov 2009 17:15:28 +0000
Message-ID: <4AFAF130.7080506@eu.citrix.com>
References: <067674d9-b387-479d-961d-99fa8459485a@default>
Mime-Version: 1.0
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xensource.com>
In-Reply-To: <067674d9-b387-479d-961d-99fa8459485a@default>
List-Unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xensource.com>
List-Help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-Subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
List-Id: xen-devel@lists.xenproject.org

Dan Magenheimer wrote:
> BUT, PoD is essentially doing dynamic ballooning without
> notifying the tools, correct?  Unless I misunderstand, the
> whole point of PoD is to not use zeroed-out-by-Windows
> memory until it gets written (with non-zeroes), and the
> underlying objective is that that not-yet-used memory can
> be used for other purposes -- such as other domains.
>
You misunderstand. :-)

The *only* point of PoD is to allow a VM to boot "pre-ballooned".  It is 
not related to memory overcommit.  PoD memory is allocated to a domain 
by the domain builder, and is only ever changed afterwards by explicit 
calls made by the toolstack.  Memory is moved from the PoD "cache"* to 
the p2m table and back again automatically, but the total amount of 
memory owned by the domain is constant.

To review, the problem that PoD solves is the following:

HVM guests (both Linux and Windows) read the e820 map early in boot, and 
consider their memory size essentially fixed based on what they read.  
IOW, if Windows reads 1GiB in the e820 map, it will never use more than 
1GiB of RAM.

In a virtualized environment, we'd like to have the flexibility of 
booting a VM with 1GiB of RAM, but then increasing its RAM (say, up to 
4GiB) after boot if it is determined that the VM in question needs more 
memory.

Without PoD, your only option is to build the domain with 4GiB of RAM 
and then wait for the balloon driver to balloon the VM down to 1GiB.  
The problem with this, of course, is that you have to scrape together 
the other 3GiB for the duration of the boot.

With PoD, you pass the domain builder two values: 4GiB and 1GiB.  The 
domain builder will fill the p2m table with 4GiB of PoD entries, and 
then allocate 1GiB of RAM for the per-domain PoD "cache".  Xen will move 
memory into and out of this "cache" as needed to allow the VM to boot 
until the balloon driver loads. But the total amount of memory used by 
the VM during this time is fixed at 1GiB.  If this pre-allocated amount 
of RAM is used up, no more memory is allocated; the domain crashes.
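
As a rough illustration of the accounting described above, here is a toy 
model in Python.  It is only a sketch, not Xen's actual code: the names 
(PodDomain, touch, DomainCrashed) are made up for the example, and sizes 
are counted in pages rather than GiB.

```python
class DomainCrashed(Exception):
    """Raised when a PoD entry is touched and the PoD cache is empty."""


class PodDomain:
    """Toy model of a domain built with populate-on-demand (hypothetical)."""

    def __init__(self, max_pages, cache_pages):
        # The p2m table is filled with max_pages PoD entries; none of them
        # is backed by real memory yet.
        self.max_pages = max_pages
        self.populated = set()    # guest frame numbers backed by real pages
        # The per-domain PoD "cache" holds all the memory the domain owns.
        self.cache = cache_pages

    def total_owned(self):
        # Pages in the cache plus pages in the p2m table.
        return self.cache + len(self.populated)

    def touch(self, gfn):
        """Guest accesses frame gfn; populate it from the cache if needed."""
        if not 0 <= gfn < self.max_pages:
            raise IndexError("gfn outside the guest's physical address space")
        if gfn in self.populated:
            return                # already backed, nothing to do
        if self.cache == 0:
            # No more memory is allocated from Xen: the domain crashes.
            raise DomainCrashed(f"PoD cache exhausted at gfn {gfn}")
        self.cache -= 1           # move one page: cache -> p2m table
        self.populated.add(gfn)
```

With PodDomain(max_pages=4, cache_pages=1), the first touch() of a new 
frame succeeds and total_owned() stays at 1; touching a second new frame 
raises DomainCrashed, because nothing has been put back in the cache.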

The zero-page scans are used to recover memory from the p2m table and 
put it back in the per-domain PoD "cache".  This memory is not returned 
to Xen, and cannot be used for other VMs.
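
A minimal sketch of that idea, again as a toy Python model rather than 
Xen's real implementation (the function name and the dict-of-bytearrays 
representation of the p2m table are assumptions made for the example):

```python
def zero_page_sweep(p2m, cache_pages):
    """Reclaim all-zero pages from the p2m table into the PoD cache.

    p2m maps guest frame number -> bytearray holding that page's contents;
    a frame missing from the dict stands for a PoD entry with no backing
    page.  Returns the new number of pages in the per-domain cache.
    """
    # A page is reclaimable if every byte in it is zero.
    zeroed = [gfn for gfn, page in p2m.items() if not any(page)]
    for gfn in zeroed:
        del p2m[gfn]              # the frame becomes a PoD entry again
    # Reclaimed pages go back to the domain's own cache, not to Xen.
    return cache_pages + len(zeroed)
```

Note that cache_pages + len(p2m) is the same before and after the sweep: 
pages only move between the p2m table and the cache.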

So if you're using PoD, you can trust that the memory used by the VM 
will not change "under your feet" so to speak.

 -George

* I put the term "cache" in quotes because I mean it in the ordinary 
English sense of the word ("a hidden storage space"), rather than the 
computer science sense (e.g., extra copies of data kept to speed up 
access in a storage hierarchy).