From: Tim Deegan <tim@xen.org>
To: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Olaf Hering <olaf@aepfle.de>, "Keir (Xen.org)" <keir@xen.org>,
Ian Campbell <Ian.Campbell@citrix.com>,
Konrad Wilk <konrad.wilk@oracle.com>,
George Dunlap <George.Dunlap@eu.citrix.com>,
Kurt Hackel <kurt.hackel@oracle.com>,
George Shuklin <george.shuklin@gmail.com>,
xen-devel@lists.xen.org, Dario Faggioli <raistlin@linux.it>,
Zhigang Wang <zhigang.x.wang@oracle.com>,
Ian Jackson <Ian.Jackson@eu.citrix.com>
Subject: Re: Proposed new "memory capacity claim" hypercall/feature
Date: Mon, 29 Oct 2012 22:35:55 +0000
Message-ID: <20121029223555.GA24388@ocelot.phlegethon.org>
In-Reply-To: <60d00f38-98a3-4ec2-acbd-b49dafaada56@default>
At 10:06 -0700 on 29 Oct (1351505175), Dan Magenheimer wrote:
> Hypervisor design/implementation overview:
>
> A domain currently does RAM accounting with two primary counters
> "tot_pages" and "max_pages". (For now, let's ignore shr_pages,
> paged_pages, and xenheap_pages, and I hope Olaf/Andre/others can
> provide further expertise and input.)
>
> Tot_pages is a struct_domain element in the hypervisor that tracks
> the number of physical RAM pageframes "owned" by the domain. The
> hypervisor enforces that tot_pages is never allowed to exceed another
> struct_domain element called max_pages.
>
> I would like to introduce a new counter, which records how
> much capacity is claimed for a domain which may or may not yet be
> mapped to physical RAM pageframes. To do so, I'd like to split
> the concept of tot_pages into two variables, tot_phys_pages and
> tot_claimed_pages and require the hypervisor to also enforce:
>
> d.tot_phys_pages <= d.tot_claimed_pages[3] <= d.max_pages
>
> I'd also split the hypervisor global "total_avail_pages" into
> "total_free_pages" and "total_unclaimed_pages". (I'm definitely
> going to need to study more the two-dimensional array "avail"...)
> The hypervisor must now do additional accounting to keep track
> of the sum of claims across all domains and also enforce the
> global:
>
> total_unclaimed_pages <= total_free_pages
>
> I think the memory_op hypercall can be extended to add two
> additional subops, XENMEM_claim and XENMEM_release. (Note: To
> support tmem, there will need to be two variations of XEN_claim,
> "hard claim" and "soft claim" [3].) The XEN_claim subop atomically
> evaluates total_unclaimed_pages against the new claim, claims
> the pages for the domain if possible and returns success or failure.
> The XEN_release "unsets" the domain's tot_claimed_pages (to an
> "illegal" value such as zero or MINUS_ONE).
>
> The hypervisor must also enforce some semantics: If an allocation
> occurs such that a domain's tot_phys_pages would equal or exceed
> d.tot_claimed_pages, then d.tot_claimed_pages becomes "unset".
> This enforces the temporary nature of a claim: Once a domain
> fully "occupies" its claim, the claim silently expires.
Why does that happen? If I understand you correctly, releasing the
claim is something the toolstack should do once it knows it's no longer
needed.
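
For reference, the accounting described above can be sketched as a toy
Python model (not hypervisor code). Only the counter names come from the
quoted proposal. The Host and Domain classes, the method names, the use of
None for the "unset" claim value, and the reading of total_unclaimed_pages
as "free pages not reserved by any outstanding claim" are assumptions made
for illustration.

```python
class Domain:
    """Per-domain counters, named as in the proposal."""

    def __init__(self, max_pages):
        self.max_pages = max_pages
        self.tot_phys_pages = 0
        self.tot_claimed_pages = None  # None stands in for the "unset" value

    def outstanding(self):
        """Pages claimed but not yet backed by physical RAM."""
        if self.tot_claimed_pages is None:
            return 0
        return self.tot_claimed_pages - self.tot_phys_pages


class Host:
    """Global counters (assumed: unclaimed = free minus outstanding claims)."""

    def __init__(self, total_free_pages):
        self.total_free_pages = total_free_pages
        self.total_unclaimed_pages = total_free_pages

    def claim(self, d, nr_pages):
        """Model of XENMEM_claim: succeed entirely or change nothing."""
        if d.tot_claimed_pages is not None:
            return False
        if not d.tot_phys_pages <= nr_pages <= d.max_pages:
            return False
        needed = nr_pages - d.tot_phys_pages
        if needed > self.total_unclaimed_pages:
            return False
        d.tot_claimed_pages = nr_pages
        self.total_unclaimed_pages -= needed
        return True

    def release(self, d):
        """Model of XENMEM_release: hand back the unused part of the claim."""
        self.total_unclaimed_pages += d.outstanding()
        d.tot_claimed_pages = None

    def alloc(self, d, nr_pages):
        """Allocate pages to d, drawing on its own claim first."""
        if d.tot_phys_pages + nr_pages > d.max_pages:
            return False
        from_claim = min(nr_pages, d.outstanding())
        extra = nr_pages - from_claim  # must come from unclaimed memory
        if extra > self.total_unclaimed_pages:
            return False
        self.total_free_pages -= nr_pages
        self.total_unclaimed_pages -= extra
        d.tot_phys_pages += nr_pages
        # Proposed rule: a fully occupied claim silently expires.
        if (d.tot_claimed_pages is not None
                and d.tot_phys_pages >= d.tot_claimed_pages):
            d.tot_claimed_pages = None
        return True
```

In this model the silent expiry is only the final check in alloc().
Without it, a claim would persist until release() is called, whether by
the toolstack or implicitly when the domain dies.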
> In the case of a dying domain, a XENMEM_release operation
> is implied and must be executed by the hypervisor.
>
> Ideally, the quantity of unclaimed memory for each domain and
> for the system should be query-able. This may require additional
> memory_op hypercalls.
>
> I'd very much appreciate feedback on this proposed design!
As I said, I'm not opposed to this, though even after reading through
the other thread I'm not convinced that it's necessary (except in cases
where guest-controlled operations are allowed to consume unbounded
memory, which frankly gives me the heebie-jeebies).

I think it needs a plan for handling restricted memory allocations.
For example, some PV guests need their memory to come below a
certain machine address, or entirely in superpages, and certain
build-time allocations come from xenheap. How would you handle that
sort of thing?
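
To illustrate the concern with a toy Python example: if a claim is checked
only against a single global count, it can succeed even though a later
restricted allocation cannot be satisfied. The zone names "low" and "high",
the page counts, and both helper functions are hypothetical and do not
correspond to Xen's actual heap layout.

```python
def can_claim(free_by_zone, nr_pages):
    """Claim check against one global total, ignoring placement."""
    return nr_pages <= sum(free_by_zone.values())


def can_allocate(free_by_zone, nr_pages, allowed_zones):
    """Allocation check restricted to the zones the guest can use."""
    return nr_pages <= sum(free_by_zone[z] for z in allowed_zones)


# Hypothetical host: little free memory below the address limit.
free = {"low": 10, "high": 90}

claim_ok = can_claim(free, 50)              # passes on the global total
alloc_ok = can_allocate(free, 50, ["low"])  # fails for a low-memory-only guest
```

Here the claim is granted but gives the guest no guarantee, because none
of the reserved capacity is tied to the memory it is actually able to use.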
Cheers,
Tim.