xen-devel.lists.xenproject.org archive mirror
From: Tim Deegan <tim@xen.org>
To: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Olaf Hering <olaf@aepfle.de>, Keir Fraser <keir@xen.org>,
	Konrad Wilk <konrad.wilk@oracle.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>,
	Kurt Hackel <kurt.hackel@oracle.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	xen-devel@lists.xen.org,
	George Shuklin <george.shuklin@gmail.com>,
	Dario Faggioli <raistlin@linux.it>,
	Andres Lagar-Cavilla <andreslc@gridcentric.ca>
Subject: Re: domain creation vs querying free memory (xend and xl)
Date: Thu, 4 Oct 2012 11:06:45 +0100
Message-ID: <20121004100645.GC38243@ocelot.phlegethon.org>
In-Reply-To: <dad711f1-c63c-4958-888d-baba3f89a261@default>

At 14:56 -0700 on 02 Oct (1349189817), Dan Magenheimer wrote:
> > AIUI xapi uses the domains' maximum allocations, centrally controlled,
> > to place an upper bound on the amount of guest memory that can be in
> > use.  Within those limits there can be ballooning activity.  But TBH I
> > don't know the details.
> 
> Yes, that's the same as saying there is no memory-overcommit.

I'd say there is - but it's all done by ballooning, and it's centrally
enforced by lowering each domain's maxmem to its balloon target, so a
badly behaved guest can't balloon up and confuse things. 
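
A minimal sketch of that enforcement, as a toy model only: the Domain
class and its method names are hypothetical and are not a xapi or Xen
interface.  It shows why clamping maxmem to the balloon target stops a
guest from growing back past the target.

```python
class Domain:
    """Toy model of one guest's memory accounting (hypothetical names)."""

    def __init__(self, name, pages, maxmem):
        self.name = name
        self.pages = pages      # pages currently allocated to the guest
        self.maxmem = maxmem    # hard cap checked on every allocation

    def set_balloon_target(self, target):
        # Central controller: lower the cap to the target, so the guest
        # can only shrink towards it and can never grow beyond it.
        self.maxmem = target

    def balloon_down(self, n):
        # Guest gives pages back; returns how many were actually released.
        released = min(n, self.pages)
        self.pages -= released
        return released

    def balloon_up(self, n):
        # Guest asks for pages; refused if it would exceed the cap.
        if self.pages + n > self.maxmem:
            return False
        self.pages += n
        return True
```

In this model a guest that ignores its target simply stays where it
is; it cannot take memory that the controller has promised elsewhere.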

> The original problem occurs only if there are multiple threads
> of execution that can be simultaneously asking the hypervisor
> to allocate memory without the knowledge of a single centralized
> "controller".

Absolutely.

> Tmem argues that doing "memory capacity transfers" at a page granularity
> can only be done efficiently in the hypervisor.  This is true for
> page-sharing when it breaks a "share" also... it can't go ask the
> toolstack to approve allocation of a new page every time a write to a shared
> page occurs.
> 
> Does that make sense?

Yes.  The page-sharing version can be handled by having a pool of
dedicated memory for breaking shares, and having the toolstack
replenish it asynchronously, rather than allowing CoW to use up all
memory in the system.
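
A rough sketch of that pool idea, under stated assumptions: SharePool,
break_share and replenish are hypothetical names, and what happens when
the pool runs dry (here the unshare is simply refused) is a choice made
for the example rather than anything described in this thread.

```python
class SharePool:
    """Toy pool of pages set aside for breaking shared (CoW) pages."""

    def __init__(self, target_pages):
        self.target = target_pages  # level the toolstack tops up to
        self.pages = 0              # pages currently held in the pool

    def break_share(self):
        # Hypervisor fast path: take one page from the pool, with no
        # round trip to the toolstack.  Returns False if the pool is
        # empty, so CoW can never eat into general system memory.
        if self.pages == 0:
            return False
        self.pages -= 1
        return True

    def replenish(self, system_free_pages):
        # Toolstack slow path, run asynchronously: move pages from
        # free memory into the pool, up to the target level.
        # Returns the free memory left afterwards.
        granted = min(self.target - self.pages, system_free_pages)
        self.pages += granted
        return system_free_pages - granted
```

The point of the split is that the write fault never waits on the
toolstack, while the toolstack still decides how much memory sharing
is allowed to consume.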

> (rough proposed design re-attached below)

Thanks for that.  It describes a sensible-looking hypervisor interface,
but my question was really: what should xl do, in the presence of
ballooning, sharing, paging and tmem, to
 - decide whether a VM can be started at all;
 - control those four systems to shuffle memory around; and
 - resolve races sensibly to avoid small VMs deferring large ones.
(AIUI, xl already has some logic to handle the case of balloon-to-fit.)

The second of those three is the interesting one.  It seems to me that
if the tools can't force all other actors to give up memory (and not
immediately take it back) then they can't guarantee to be able to start
a new VM, even with the new reservation hypercalls.

Cheers,

Tim.

> > From: Dan Magenheimer
> > Sent: Monday, October 01, 2012 2:04 PM
> >    :
> >    :
> > Back to design brainstorming:
> > 
> > The way I am thinking about it, the tools need to be involved
> > to the extent that they would need to communicate to the
> > hypervisor the following facts (probably via new hypercall):
> > 
> > X1) I am launching a domain X and it is eventually going to
> >    consume up to a maximum of N MB.  Please tell me if
> >    there is sufficient RAM available AND, if so, reserve
> >    it until I tell you I am done. ("AND" implies transactional
> >    semantics)
> > X2) The launch of X is complete and I will not be requesting
> >    the allocation of any more RAM for it.  Please release
> >    the reservation, whether or not I've requested a total
> >    of N MB.
> > 
> > The calls may be nested or partially ordered, i.e.
> >    X1...Y1...Y2...X2
> >    X1...Y1...X2...Y2
> > and the hypervisor must be able to deal with this.
> > 
> > Then there would need to be two "versions" of "xm/xl free".
> > We can quibble about which should be the default, but
> > they would be:
> > 
> > - "xl --reserved free" asks the hypervisor how much RAM
> >    is available taking into account reservations
> > - "xm --raw free" asks the hypervisor for the instantaneous
> >    amount of RAM unallocated, not counting reservations
> > 
> > When the tools are not launching a domain (that is, there
> > has been a matching X2 for every X1), the results of the
> > above "free" queries are always identical.
> > 
> > So, IanJ, does this match up with the design you were thinking
> > about?
> > 
> > Thanks,
> > Dan
> > 
> > [1] I think the core culprits are (a) the hypervisor accounts for
> > memory allocation of pages strictly on a first-come-first-served
> > basis and (b) the tools don't have any form of need-this-much-memory
> > "transaction" model
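
The X1/X2 proposal quoted above can be modelled in a few lines.  This
is a sketch under stated assumptions, not a hypervisor interface: the
ReservationLedger class and all of its method names are hypothetical,
and a single lock stands in for the transactional semantics asked for
in X1.

```python
import threading


class ReservationLedger:
    """Toy model of reserve (X1) / release (X2) and the two "free" views."""

    def __init__(self, total_mb):
        self._lock = threading.Lock()
        self._raw_free = total_mb  # unallocated RAM, ignoring reservations
        self._reserved = {}        # domain -> MB still held for it

    def _unreserved(self):
        return self._raw_free - sum(self._reserved.values())

    def reserve(self, domain, n_mb):
        # X1: check and reserve in one step, so two launches cannot
        # both be told that the same memory is available.
        with self._lock:
            if domain in self._reserved or n_mb > self._unreserved():
                return False
            self._reserved[domain] = n_mb
            return True

    def allocate(self, domain, mb):
        # An allocation draws on the domain's own reservation first,
        # then on memory that nobody has reserved.
        with self._lock:
            held = self._reserved.get(domain, 0)
            if mb > self._unreserved() + held:
                return False
            self._raw_free -= mb
            if domain in self._reserved:
                self._reserved[domain] = held - min(held, mb)
            return True

    def release(self, domain):
        # X2: drop whatever is left of the reservation, whether or
        # not the full amount was ever allocated.
        with self._lock:
            self._reserved.pop(domain, None)

    def free_raw(self):
        with self._lock:
            return self._raw_free

    def free_reserved(self):
        with self._lock:
            return self._unreserved()
```

Because each reservation is tracked per domain, the orderings
X1...Y1...Y2...X2 and X1...Y1...X2...Y2 both work, and once every X1
has a matching X2 the two "free" views return the same number.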
