From: Tim Deegan <tim@xen.org>
To: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Olaf Hering <olaf@aepfle.de>, Keir Fraser <keir@xen.org>,
	Konrad Wilk <konrad.wilk@oracle.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>,
	Kurt Hackel <kurt.hackel@oracle.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	xen-devel@lists.xen.org,
	George Shuklin <george.shuklin@gmail.com>,
	Dario Faggioli <raistlin@linux.it>,
	Andres Lagar-Cavilla <andreslc@gridcentric.ca>
Subject: Re: domain creation vs querying free memory (xend and xl)
Date: Tue, 2 Oct 2012 21:16:24 +0100
Message-ID: <20121002201624.GA98445@ocelot.phlegethon.org>
In-Reply-To: <66cc0085-1216-40f7-8059-eaf615202c12@default>

At 12:33 -0700 on 02 Oct (1349181195), Dan Magenheimer wrote:
> > From: Tim Deegan [mailto:tim@xen.org]
> > Subject: Re: [Xen-devel] domain creation vs querying free memory (xend and xl)
> > 
> > At 13:03 -0700 on 01 Oct (1349096617), Dan Magenheimer wrote:
> > > Bearing in mind that I know almost nothing about xl or
> > > the tools layer, and that, as a result, I tend to look
> > > for hypervisor solutions, I'm thinking it's not possible to
> > > solve this without direct participation of the hypervisor anyway,
> > > at least while ensuring the solution will successfully
> > > work with any memory technology that involves ballooning
> > > with the possibility of overcommit (i.e. tmem, page sharing
> > > and host-swapping, manual ballooning, PoD)...  EVEN if the
> > > toolset is single threaded (i.e. only one domain may
> > > be created at a time, such as xapi). [1]
> > 
> > TTBOMK, Xapi actually _has_ solved this problem, even with ballooning
> > and PoD.  I don't know if they have any plans to support sharing,
> > swapping or tmem, though.
> 
> Is this because PoD never independently increases the size of a domain's
> allocation? 

AIUI xapi uses the domains' maximum allocations, centrally controlled,
to place an upper bound on the amount of guest memory that can be in
use.  Within those limits there can be ballooning activity.  But TBH I
don't know the details.
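
To illustrate the kind of upper bound meant here, the following is a minimal sketch of an admission check based on maximum allocations. It is not xapi's actual code; the function name `can_admit` and its parameters are hypothetical, and it assumes the only rule is that the sum of all domains' maximum allocations must fit in host memory.

```python
def can_admit(host_total_mib, existing_max_mib, new_max_mib):
    """Hypothetical check: admit a new domain only if the sum of every
    domain's maximum allocation still fits in host memory.

    Ballooning below each domain's maximum is then always safe, because
    the worst case (every domain at its maximum) has been accounted for.
    """
    return sum(existing_max_mib) + new_max_mib <= host_total_mib


# Example: a 16 GiB host already running domains capped at 4 GiB and 8 GiB.
fits = can_admit(16384, [4096, 8192], 4096)
too_big = can_admit(16384, [4096, 8192], 4097)
```

Under this assumption the toolstack never needs to ask the hypervisor how much memory is free, since the bound is maintained centrally.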

> > Adding a 'reservation' of free pages that may only be allocated by a
> > given domain should be straightforward enough, but I'm not sure it helps
> 
> It absolutely does help.  With tmem (and I think with paging), the
> total allocation of a domain may be increased without knowledge by
> the toolset.

But not past the domains' maximum allowance, right?  That's not the case
with paging, anyway.

> > much.  In the 'balloon-to-fit' model where all memory is already
> > allocated to some domain (or tmem), some part of the toolstack needs to
> > sort out freeing up the memory before allocating it to another VM.
> 
> By balloon-to-fit, do you mean that all RAM is occupied?  Tmem
> handles the "sort out freeing up the memory" entirely in the
> hypervisor, so the toolstack never knows.

Does tmem replace ballooning/sharing/swapping entirely?  I thought they
could coexist.  Or, if you just mean that tmem owns all otherwise-free
memory and will relinquish it on demand, then the same problems occur
while the toolstack is moving memory from owned-by-guests to
owned-by-tmem.

> > Surely that component needs to handle the exclusion too - otherwise a
> > series of small VM creations could stall a large one indefinitely.
> 
> Not sure I understand this, but it seems feasible.

If you ask for a large VM and a small VM to be started at about the same
time, the small VM will always win (since you'll free enough memory for
the small VM before you free enough for the big one).  If you then ask
for another small VM it will win again, and so forth, indefinitely
postponing the large VM in the waiting-for-memory state, unless some
agent explicitly enforces that VMs be started in order.  If you have
such an agent you probably don't need a hypervisor interlock as well.
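
The starvation argument above can be sketched as a toy simulation. Everything in it is an assumption made for illustration: memory is freed at a fixed rate per tick, sizes are in arbitrary units, and `simulate` and its parameters are hypothetical names, not part of xl, xapi or the hypervisor.

```python
from collections import deque


def simulate(requests, freed_per_tick, ticks, fifo):
    """Toy model of VM start-up while memory is being freed.

    requests: list of (arrival_tick, name, size), in submission order.
    fifo=False: any pending VM that fits in free memory may start.
    fifo=True:  only the oldest pending VM may start (an ordering agent).
    Returns a dict mapping VM name to the tick at which it started.
    """
    todo = sorted(requests, key=lambda r: r[0])  # stable: keeps submission order
    pending = deque()
    started = {}
    free = 0
    i = 0
    for tick in range(ticks):
        # Queue the requests that have arrived by this tick.
        while i < len(todo) and todo[i][0] <= tick:
            pending.append(todo[i][1:])
            i += 1
        free += freed_per_tick
        # Start VMs until nothing eligible fits any more.
        progress = True
        while progress and pending:
            progress = False
            candidates = [pending[0]] if fifo else list(pending)
            for name, size in candidates:
                if size <= free:
                    free -= size
                    started[name] = tick
                    pending.remove((name, size))
                    progress = True
                    break
    return started


# One large VM (8 units) plus a new small VM (1 unit) arriving every tick.
reqs = [(0, "big", 8)] + [(t, "small%d" % t, 1) for t in range(20)]
unordered = simulate(reqs, freed_per_tick=1, ticks=20, fifo=False)
ordered = simulate(reqs, freed_per_tick=1, ticks=20, fifo=True)
```

In this model the unordered run never starts "big" within the 20 ticks, because each freed unit is taken by a small VM first, while the FIFO run starts it as soon as 8 units have accumulated.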

I think it would be better to back up a bit.  Maybe you could sketch out
how you think [lib]xl ought to be handling ballooning/swapping/sharing/tmem
when it's starting VMs.  I don't have a strong objection to accounting
free memory to particular domains if it turns out to be useful, but as
always I prefer not to have things happen in the hypervisor if they
could happen in less privileged code.

Tim.

