Re: domain creation vs querying free memory (xend and xl)

From: Dan Magenheimer <dan.magenheimer@oracle.com>
To: Olaf Hering <olaf@aepfle.de>
Cc: Keir Fraser <keir@xen.org>, Konrad Wilk <konrad.wilk@oracle.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>,
	Kurt Hackel <kurt.hackel@oracle.com>,
	tim@xen.org, xen-devel@lists.xen.org,
	George Shuklin <george.shuklin@gmail.com>,
	Dario Faggioli <raistlin@linux.it>,
	Andres Lagar-Cavilla <andreslc@gridcentric.ca>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>
Subject: Re: domain creation vs querying free memory (xend and xl)
Date: Thu, 4 Oct 2012 12:38:26 -0700 (PDT)
Message-ID: <18147469-adb0-4a86-b36f-231cb412d112@default>
In-Reply-To: <20121004182613.GA9244@aepfle.de>

> From: Olaf Hering [mailto:olaf@aepfle.de]
> Subject: Re: [Xen-devel] domain creation vs querying free memory (xend and xl)
> 
> On Mon, Oct 01, Dan Magenheimer wrote:
> 

Hi Olaf --

Thanks for the reply.

> domain. All of this needs math, not locking.
>  :
> As IanJ said, the memory handling code in libxl needs such a feature to
> do the math right. The proposed handling of
> sharing/paging/ballooning/PoD/tmem/... in libxl is just a small part of
> it.

Unfortunately, as you observe in some of the cases earlier in your reply,
it is more than a math problem for libxl... it is a crystal ball problem.
If xl launches a domain D at time T and it takes N seconds before it has
completed asking the hypervisor for all of the memory M that D will require
to successfully launch, then xl must determine at time T the maximum memory
allocated across all running domains for the future time period between
T and T+N.  In other words, xl must predict the future.
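
As a rough illustration of that prediction problem, here is a toy Python
model (not xl or Xen code).  ToyHypervisor, create_domain, and the
background_per_step parameter are made-up names; the last one stands in for
pages taken by allocators the toolstack cannot see.  The free-memory check
at time T passes in both runs, but only the quiet run completes:

```python
class ToyHypervisor:
    """Toy model: a pool of free pages with no reservation mechanism."""

    def __init__(self, total_pages):
        self.free_pages = total_pages

    def allocate(self, pages):
        """Grant the request only if enough pages are free right now."""
        if pages > self.free_pages:
            return False
        self.free_pages -= pages
        return True


def create_domain(hv, needed, chunk, background_per_step):
    """Check free memory once at time T, then allocate chunk by chunk.

    background_per_step models pages taken between our requests by
    allocators the toolstack cannot observe (hypothetical parameter).
    Returns (success, pages_obtained).
    """
    if hv.free_pages < needed:          # the "math" done at time T
        return False, 0
    obtained = 0
    while obtained < needed:
        # Someone else allocates before our next request arrives.
        hv.allocate(min(background_per_step, hv.free_pages))
        step = min(chunk, needed - obtained)
        if not hv.allocate(step):       # fails somewhere between T and T+N
            return False, obtained
        obtained += step
    return True, obtained


quiet = create_domain(ToyHypervisor(1000), needed=800, chunk=200,
                      background_per_step=0)
busy = create_domain(ToyHypervisor(1000), needed=800, chunk=200,
                     background_per_step=100)
print(quiet)  # (True, 800)
print(busy)   # (False, 600)
```

In the busy run the domain has already consumed 600 pages when the launch
fails, which is the partial-failure case that a check at time T cannot rule
out.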

Clearly this is impossible, especially when none of the dynamic allocators
report to libxl: page-sharing does not communicate its dynamic allocations
(e.g. due to page-splitting), tmem does not communicate allocations that
result from multiple domains simultaneously making tmem hypercalls, PoD does
not communicate its allocations, and in-guest-kernel selfballooning does not
communicate its allocations.  Only the hypervisor is aware of every dynamic
allocation request.

So all libxl can do is guess about the future because races are
going to occur.  Multiple threads are simultaneously trying to
access a limited resource (pages of memory) and only the hypervisor
knows whether there is enough to deliver memory for all requests.

To me, the solution to racing for a shared resource is locking.
Naturally, you want the critical path to be as short as possible.
And you don't want to lock all instances of the resource (i.e.
every page in memory) if you can avoid it.  And you need to
ensure that the lock is honored for all requests to allocate
the shared resource, meaning in this case that it has to
be done in the hypervisor.

I think that's what the proposed design does:  It provides a
mechanism to ask the hypervisor to reserve a fixed amount of
memory M, some or all of which will eventually turn into
an allocation request; and a mechanism to ask the hypervisor
to no longer honor that reservation ("unreserve") whether or
not all of M has been allocated.  It essentially locks that
M amount of memory between reserve and unreserve so that other
dynamic allocations (page-sharing, tmem, PoD, OR another libxl
thread trying to create another domain) cannot sneak in and
claim memory capacity that has been reserved.
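
A minimal sketch of that reserve/unreserve idea, again as a toy Python model
rather than the proposed hypervisor interface.  The class, the method names,
and the per-domain bookkeeping are assumptions made for illustration, and the
threading.Lock merely stands in for whatever lock protects the real
allocator.  The point it shows is that every allocation path goes through one
check that honors outstanding reservations:

```python
import threading


class ToyHypervisor:
    """Toy model of a page pool with reserve/unreserve (hypothetical API)."""

    def __init__(self, total_pages):
        self._lock = threading.Lock()
        self.free_pages = total_pages
        self.reserved = {}  # owner -> pages reserved but not yet allocated

    def reserve(self, owner, pages):
        """Set aside capacity for owner; fails if it is already promised."""
        with self._lock:
            unpromised = self.free_pages - sum(self.reserved.values())
            if pages > unpromised:
                return False
            self.reserved[owner] = self.reserved.get(owner, 0) + pages
            return True

    def unreserve(self, owner):
        """Drop owner's reservation, whether or not it was fully used."""
        with self._lock:
            self.reserved.pop(owner, None)

    def allocate(self, owner, pages):
        """Allocate pages; capacity reserved by others is off limits."""
        with self._lock:
            mine = self.reserved.get(owner, 0)
            others = sum(self.reserved.values()) - mine
            if pages > self.free_pages - others:
                return False
            self.free_pages -= pages
            if owner in self.reserved:
                # Allocations draw down the owner's own reservation first.
                self.reserved[owner] = max(0, mine - pages)
            return True


hv = ToyHypervisor(1000)
got_reservation = hv.reserve("D", 800)      # xl creating domain D
tmem_big = hv.allocate("tmem", 300)         # would eat into D's reservation
tmem_small = hv.allocate("tmem", 200)       # fits in the unreserved 200
second_domain = hv.reserve("E", 100)        # nothing unpromised is left
d_allocs = [hv.allocate("D", 200) for _ in range(4)]
hv.unreserve("D")
print(got_reservation, tmem_big, tmem_small, second_domain, all(d_allocs))
```

Only the short bookkeeping step is serialized here; no individual page is
locked, and the reservation is just a number checked on each allocation.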

Does that make sense?

Thanks,
Dan
