xen-devel.lists.xenproject.org archive mirror
From: George Dunlap <george.dunlap@eu.citrix.com>
To: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Olaf Hering <olaf@aepfle.de>, "Keir (Xen.org)" <keir@xen.org>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	Konrad Wilk <konrad.wilk@oracle.com>,
	Andres Lagar-Cavilla <andreslc@gridcentric.ca>,
	"Tim (Xen.org)" <tim@xen.org>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	George Shuklin <george.shuklin@gmail.com>,
	Dario Faggioli <raistlin@linux.it>,
	Kurt Hackel <kurt.hackel@oracle.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>
Subject: Re: domain creation vs querying free memory (xend and xl)
Date: Fri, 5 Oct 2012 12:40:00 +0100
Message-ID: <506EC710.3020606@eu.citrix.com>
In-Reply-To: <3e5cf0ba-4241-4f00-bef6-aaff2cc23419@default>

On 04/10/12 17:54, Dan Magenheimer wrote:
>>
> Scanning through the archived message I am under the impression
> that the focus is on a single server... i.e. "punt if actor is
> not xl", i.e. it addressed "balloon-to-fit" and only tries to avoid
> stepping on other memory overcommit technologies.  That makes it
> almost orthogonal, I think, to the problem I originally raised.
No, the idea was to allow the flexibility of different actors in 
different situations.  The plan was to start with a simple actor, but to 
add new ones as necessary.  But on reflection, it seems like the whole 
"actor" thing was actually something completely separate to what we're 
talking about here.  The idea behind the actor (IIRC) was that you could 
tell the toolstack, "Make VM A use X amount of host memory"; and the 
actor would determine the best way to do that -- either by only 
ballooning, or ballooning first and then swapping.  But it doesn't 
decide how to get the value X.

This thread has been very hard to follow for some reason, so let me see 
if I can understand everything:
* You are concerned about being able to predictably start VMs in the 
face of:
  - concurrent requests, and
  - dynamic memory technologies (including PoD, ballooning, paging, page 
sharing, and tmem),
any of which may change the amount of free memory between the time a 
decision is made and the time memory is actually allocated.
* You have proposed a hypervisor-based solution that allows the 
toolstack to "reserve" a specific amount of memory for a VM, so that it 
will not be used for something else; this reservation is transactional -- 
it will either completely succeed or completely fail, and will do so 
quickly.

Is that correct?
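
As a rough illustration of the all-or-nothing "reserve" semantics 
summarized above: the sketch below is a toy model in Python, not Xen or 
libxl code, and the names HostMemory and reserve() are hypothetical.  A 
request either gets the whole amount set aside or changes nothing.

```python
import threading


class HostMemory:
    """Toy model of a host's free memory (hypothetical, not a Xen API)."""

    def __init__(self, free_mib):
        self.free_mib = free_mib
        self.reserved = {}              # domain name -> MiB set aside
        self._lock = threading.Lock()   # serializes concurrent requests

    def reserve(self, domain, mib):
        """Set aside `mib` for `domain`, or change nothing and return False."""
        with self._lock:
            if mib > self.free_mib:
                return False            # complete failure: no partial claim
            self.free_mib -= mib
            self.reserved[domain] = self.reserved.get(domain, 0) + mib
            return True                 # complete success


host = HostMemory(free_mib=4096)
ok_a = host.reserve("A", 3072)   # fits, so the full amount is reserved
ok_b = host.reserve("B", 2048)   # only 1024 MiB left, so nothing is taken
```

Because the check and the subtraction happen under one lock, two 
concurrent callers cannot both be told that the same memory is theirs.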

The problem with that solution, it seems to me, is that the hypervisor 
does not (and I think probably should not) have any insight into the 
policy for allocating or freeing memory as a result of other activities, 
such as ballooning or page sharing.  Suppose someone were ballooning 
down domain M to get 8GiB in order to start domain A; and at some point, 
another process looks and says, "Oh look, there's 4GiB free, that's 
enough to start domain B" and asks Xen to reserve that memory.  Xen has 
no way of knowing that the memory freed by domain M was "earmarked" for 
domain A, and so will happily give it to domain B, potentially causing 
domain A's creation to fail.
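
The scenario can be replayed with a toy model.  This is only a sketch: 
the Hypervisor class and its methods are hypothetical stand-ins, and the 
point is just that a reservation call which sees only a free-memory 
counter cannot tell whom that memory was freed for.

```python
class Hypervisor:
    """Toy hypervisor that tracks free memory only, with no earmarks."""

    def __init__(self, free_gib):
        self.free_gib = free_gib

    def balloon_down(self, gib):
        # A domain gives memory back; the hypervisor does not record why.
        self.free_gib += gib

    def reserve(self, domain, gib):
        # First come, first served: whoever asks while memory is free gets it.
        if gib > self.free_gib:
            return False
        self.free_gib -= gib
        return True


xen = Hypervisor(free_gib=0)
xen.balloon_down(4)           # domain M is halfway to freeing 8GiB for A
b_ok = xen.reserve("B", 4)    # another process sees 4GiB free and grabs it
xen.balloon_down(4)           # domain M finishes ballooning down
a_ok = xen.reserve("A", 8)    # only 4GiB is free now, so A cannot start
```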

So it seems like we need to have the idea of a memory controller -- one 
central process (per host, as you say) that would know about all of the 
knobs -- ballooning, paging, page sharing, tmem, whatever -- that could 
be in charge of knowing where all the memory was coming from and where 
it was going.  So if xl wanted to start a new VM, it could ask the memory 
controller for 3GiB, and the controller could decide, "I'll take 1GiB 
from domain M and 2GiB from domain N, and give it to the new domain", and 
respond when it has the memory that it needs.  Similarly, it could know 
that it should try to keep X megabytes free for un-sharing of pages, and 
it could be responsible for freeing up more memory if that memory becomes 
exhausted.
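
A minimal sketch of what such a controller's bookkeeping might look 
like, assuming a single process that knows how much each domain could 
give up.  The MemoryController class, its request() method, and the 
greedy choice of donors are all assumptions made for illustration; 
nothing here is an existing Xen or xl interface.

```python
class MemoryController:
    """Toy per-host controller that tracks earmarked memory (hypothetical)."""

    def __init__(self, free_mib, reclaimable):
        self.free_mib = free_mib
        self.reclaimable = dict(reclaimable)  # domain -> MiB it could give up
        self.earmarked = {}                   # new domain -> MiB promised to it

    def request(self, new_domain, mib):
        """Return a plan {donor: MiB to take}, or None if it cannot be met."""
        unclaimed = self.free_mib - sum(self.earmarked.values())
        shortfall = max(0, mib - unclaimed)
        if shortfall > sum(self.reclaimable.values()):
            return None                       # refuse; nothing is changed
        plan = {}
        for donor, available in self.reclaimable.items():
            if shortfall == 0:
                break
            take = min(available, shortfall)  # greedy: drain donors in order
            plan[donor] = take
            shortfall -= take
        for donor, take in plan.items():
            self.reclaimable[donor] -= take   # e.g. balloon the donor down
            self.free_mib += take
        self.earmarked[new_domain] = mib      # later requests cannot take this
        return plan


ctl = MemoryController(free_mib=0, reclaimable={"M": 1024, "N": 2048})
plan_a = ctl.request("A", 3072)   # 1GiB from M and 2GiB from N
plan_b = ctl.request("B", 1024)   # everything free is earmarked for A
```

The difference from a bare free-memory counter is the earmarked table: 
memory freed on behalf of domain A is not offered to a later request, 
because the controller subtracts it before deciding what is available.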

At the moment, the administrator himself (or the cloud orchestration 
layer) needs to be his own memory controller; that is, he needs to 
manually decide if there's enough free memory to start a VM; if there's 
not, he needs to figure out how to get that memory (either by ballooning 
or swapping).  Ballooning and swapping are both totally under his 
control; the only thing he doesn't control is the unsharing of pages.  
But as long as there is a way to tell the page sharing daemon not to 
allocate a given amount of the free memory, this 
"administrator-as-memory-controller" approach should work just fine.

Does that make sense?  Or am I still confused? :-)

  -George
