From: George Dunlap <george.dunlap@eu.citrix.com>
To: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Olaf Hering <olaf@aepfle.de>, "Keir (Xen.org)" <keir@xen.org>,
Ian Campbell <Ian.Campbell@citrix.com>,
Konrad Wilk <konrad.wilk@oracle.com>,
Andres Lagar-Cavilla <andreslc@gridcentric.ca>,
"Tim (Xen.org)" <tim@xen.org>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
George Shuklin <george.shuklin@gmail.com>,
Dario Faggioli <raistlin@linux.it>,
Kurt Hackel <kurt.hackel@oracle.com>,
Ian Jackson <Ian.Jackson@eu.citrix.com>
Subject: Re: domain creation vs querying free memory (xend and xl)
Date: Fri, 5 Oct 2012 12:40:00 +0100
Message-ID: <506EC710.3020606@eu.citrix.com>
In-Reply-To: <3e5cf0ba-4241-4f00-bef6-aaff2cc23419@default>
On 04/10/12 17:54, Dan Magenheimer wrote:
>>
> Scanning through the archived message I am under the impression
> that the focus is on a single server... i.e. "punt if actor is
> not xl", i.e. it addressed "balloon-to-fit" and only tries to avoid
> stepping on other memory overcommit technologies. That makes it
> almost orthogonal, I think, to the problem I originally raised.
No, the idea was to allow the flexibility of different actors in
different situations. The plan was to start with a simple actor, but to
add new ones as necessary. But on reflection, it seems like the whole
"actor" thing was actually something completely separate to what we're
talking about here. The idea behind the actor (IIRC) was that you could
tell the toolstack, "Make VM A use X amount of host memory"; and the
actor would determine the best way to do that -- either by only
ballooning, or ballooning first and then swapping. But it doesn't
decide how to get the value X.
This thread has been very hard to follow for some reason, so let me see
if I can understand everything:
* You are concerned about being able to predictably start VMs in the
face of:
- concurrent requests, and
- dynamic memory technologies (including PoD, ballooning, paging, page
sharing, and tmem)
Any of which may change the amount of free memory between the time a
decision is made and the time memory is actually allocated.
* You have proposed a hypervisor-based solution that allows the
toolstack to "reserve" a specific amount of memory for a VM, so that it
will not be used for something else; this reservation is transactional --
it will either completely succeed or completely fail, and do so quickly.
Is that correct?
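To make sure we mean the same thing by "transactional", here is a minimal sketch of the all-or-nothing reservation described above. It is a toy model in Python, not a Xen interface: the class `HostMemory`, the method `reserve`, and the MiB bookkeeping are all hypothetical names made up for illustration.

```python
import threading


class HostMemory:
    """Toy model of a host's free-memory pool (hypothetical, not a Xen API)."""

    def __init__(self, free_mib):
        self._free_mib = free_mib
        self._reserved = {}            # domain name -> MiB reserved for it
        self._lock = threading.Lock()  # serialises concurrent requests

    @property
    def free_mib(self):
        with self._lock:
            return self._free_mib

    def reserve(self, domain, mib):
        """Reserve `mib` for `domain`: either all of it or none of it."""
        with self._lock:
            if mib > self._free_mib:
                return False           # fail fast, nothing is changed
            self._free_mib -= mib
            self._reserved[domain] = self._reserved.get(domain, 0) + mib
            return True


host = HostMemory(free_mib=4096)
first = host.reserve("A", 3072)    # fits, so the whole amount is taken
second = host.reserve("B", 2048)   # does not fit, so nothing is taken
```

The check and the subtraction happen under one lock, so two concurrent callers cannot both be told that the same memory is theirs.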
The problem with that solution, it seems to me, is that the hypervisor
does not (and I think probably should not) have any insight into the
policy for allocating or freeing memory as a result of other activities,
such as ballooning or page sharing. Suppose someone were ballooning
down domain M to get 8GiB in order to start domain A; and at some point,
another process looks and says, "Oh look, there's 4GiB free, that's
enough to start domain B" and asks Xen to reserve that memory. Xen has
no way of knowing that the memory freed by domain M was "earmarked" for
domain A, and so will happily give it to domain B, causing domain A's
creation to fail (potentially).
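The scenario above can be replayed in a few lines. This is only a sketch under stated assumptions: `Hypervisor`, `release` and `reserve` are hypothetical stand-ins for "memory is returned to Xen" and "Xen is asked to reserve memory", and the point where domain B jumps in (after M has freed 4GiB of the 8GiB) is chosen for illustration.

```python
GIB = 1024  # sizes are tracked in MiB


class Hypervisor:
    """Hypothetical stand-in: knows how much is free, not who it is for."""

    def __init__(self):
        self.free_mib = 0

    def release(self, mib):
        # Memory ballooned out of a guest simply becomes "free".
        self.free_mib += mib

    def reserve(self, mib):
        # All-or-nothing reservation against whatever is free right now.
        if mib > self.free_mib:
            return False
        self.free_mib -= mib
        return True


def run_scenario():
    xen = Hypervisor()
    xen.release(4 * GIB)         # domain M is halfway through ballooning down
    b_ok = xen.reserve(4 * GIB)  # another process sees 4GiB free, starts B
    xen.release(4 * GIB)         # domain M finishes ballooning down
    a_ok = xen.reserve(8 * GIB)  # the request for domain A now comes up short
    return b_ok, a_ok, xen.free_mib


b_ok, a_ok, left = run_scenario()
```

Each individual reservation is atomic here, and domain A still loses, because nothing in the hypervisor model records that the freed memory was meant for A.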
So it seems like we need to have the idea of a memory controller -- one
central process (per host, as you say) that would know about all of the
knobs -- ballooning, paging, page sharing, tmem, whatever -- that could
be in charge of knowing where all the memory was coming from and where
it was going. So if xl wanted to start a new VM, it could ask the memory
controller for 3GiB, and the controller could decide, "I'll take 1GiB
from domain M and 2GiB from domain N, and give it to the new domain", and
respond when it has the memory that it needs. Similarly, it could know
that it should try to keep X megabytes free for un-sharing of pages, and
it could be responsible for freeing up more memory if that reserve
becomes exhausted.
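As a rough sketch of the memory controller described above, assuming it only knows one knob (ballooning) and that each domain has a floor it cannot be ballooned below: the class `MemoryController`, its `request` method, and the per-domain "minimum" are hypothetical names for illustration. The reserve for un-sharing and the other knobs (paging, page sharing, tmem) are left out.

```python
class MemoryController:
    """Hypothetical per-host controller that decides where memory comes from."""

    def __init__(self, free_mib, domains):
        # domains maps name -> (current_mib, minimum_mib)
        self.free_mib = free_mib
        self.domains = {
            name: {"current": cur, "minimum": low}
            for name, (cur, low) in domains.items()
        }

    def request(self, new_domain, mib):
        """Return a {donor: MiB} ballooning plan, or None if it cannot be met."""
        from_free = min(mib, self.free_mib)   # use free memory first
        needed = mib - from_free
        plan = {}
        for name, dom in self.domains.items():
            if needed == 0:
                break
            take = min(dom["current"] - dom["minimum"], needed)
            if take > 0:
                plan[name] = take
                needed -= take
        if needed > 0:
            return None                       # refuse; no state was changed
        # Commit: the memory is now earmarked for new_domain and nobody else.
        self.free_mib -= from_free
        for name, take in plan.items():
            self.domains[name]["current"] -= take
        self.domains[new_domain] = {"current": mib, "minimum": mib}
        return plan


ctl = MemoryController(free_mib=0, domains={"M": (4096, 3072), "N": (8192, 4096)})
plan = ctl.request("new", 3072)   # 1GiB from M, 2GiB from N
```

Because every request goes through the one controller, the memory freed from M and N is accounted to the new domain at the moment the plan is made, which is exactly the information the hypervisor lacks in the earlier scenario.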
At the moment, the administrator himself (or the cloud orchestration
layer) needs to be his own memory controller; that is, he needs to
manually decide if there's enough free memory to start a VM; if there's
not, he needs to figure out how to get that memory (either by ballooning
or swapping). Ballooning and swapping are both totally under his
control; the only thing he doesn't control is the unsharing of pages.
But as long as there is a way to tell the page sharing daemon not to
allocate a given amount of free memory, this
"administrator-as-memory-controller" approach should work just fine.
Does that make sense? Or am I still confused? :-)
-George