From: Andres Lagar-Cavilla <andres@lagarcavilla.org>
To: xen-devel@lists.xen.org
Cc: ian.jackson@citrix.com, andres@gridcentric.ca, tim@xen.org,
ian.campbell@citrix.com, adin@gridcentric.ca
Subject: [PATCH 2 of 2] Document the sharing interface
Date: Mon, 02 Apr 2012 15:11:04 -0400 [thread overview]
Message-ID: <3f6b49097dced570f2de.1333393864@xdev.gridcentric.ca> (raw)
In-Reply-To: <patchbomb.1333393862@xdev.gridcentric.ca>
tools/libxc/xenctrl.h | 201 ++++++++++++++++++++++++++++++++++++++++++++-----
1 files changed, 179 insertions(+), 22 deletions(-)
(also make note about AMD support's experimental status)
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla>
diff -r f110cf1372a8 -r 3f6b49097dce tools/libxc/xenctrl.h
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -1251,23 +1251,6 @@ int xc_mmuext_op(xc_interface *xch, stru
/* System wide memory properties */
long xc_maximum_ram_page(xc_interface *xch);
-/**
- * This function returns the total number of pages freed by using sharing
- * on the system. For example, if two domains contain a single entry in
- * their p2m map that points to the same shared page (and no other pages
- * in the system are shared), then this function should return 1.
- */
-long xc_sharing_freed_pages(xc_interface *xch);
-
-/**
- * This function returns the total number of frames occupied by shared
- * pages on the system. This is independent of the number of domains
- * pointing at these frames. For example, in the above scenario this
- * should return 1. The following should hold:
- * memory usage without sharing = freed_pages + used_frames
- */
-long xc_sharing_used_frames(xc_interface *xch);
-
/* Get current total pages allocated to a domain. */
long xc_get_tot_pages(xc_interface *xch, uint32_t domid);
@@ -1894,7 +1877,7 @@ int xc_tmem_restore(xc_interface *xch, i
int xc_tmem_restore_extra(xc_interface *xch, int dom, int fd);
/**
- * mem_event operations
+ * mem_event operations. Internal use only.
*/
int xc_mem_event_control(xc_interface *xch, domid_t domain_id, unsigned int op,
unsigned int mode, uint32_t *port);
@@ -1902,6 +1885,12 @@ int xc_mem_event_memop(xc_interface *xch
unsigned int op, unsigned int mode,
uint64_t gfn, void *buffer);
+/**
+ * Mem paging operations.
+ * Paging is supported only on the x86 architecture in 64 bit mode, with
+ * Hardware-Assisted Paging (i.e. Intel EPT, AMD NPT). Moreover, AMD NPT
+ * support is considered experimental.
+ */
int xc_mem_paging_enable(xc_interface *xch, domid_t domain_id, uint32_t *port);
int xc_mem_paging_disable(xc_interface *xch, domid_t domain_id);
int xc_mem_paging_nominate(xc_interface *xch, domid_t domain_id,
@@ -1911,30 +1900,133 @@ int xc_mem_paging_prep(xc_interface *xch
int xc_mem_paging_load(xc_interface *xch, domid_t domain_id,
unsigned long gfn, void *buffer);
+/**
+ * Access tracking operations.
+ * Supported only on Intel EPT 64 bit processors.
+ */
int xc_mem_access_enable(xc_interface *xch, domid_t domain_id, uint32_t *port);
int xc_mem_access_disable(xc_interface *xch, domid_t domain_id);
int xc_mem_access_resume(xc_interface *xch, domid_t domain_id,
unsigned long gfn);
-/**
- * memshr operations
+/***
+ * Memory sharing operations.
+ *
+ * Unles otherwise noted, these calls return 0 on succes, -1 and errno on
+ * failure.
+ *
+ * Sharing is supported only on the x86 architecture in 64 bit mode, with
+ * Hardware-Assisted Paging (i.e. Intel EPT, AMD NPT). Moreover, AMD NPT
+ * support is considered experimental.
+
+ * Calls below return ENOSYS if not in the x86_64 architecture.
+ * Calls below return ENODEV if the domain does not support HAP.
+ * Calls below return ESRCH if the specified domain does not exist.
+ * Calls below return EPERM if the caller is unprivileged for this domain.
*/
+
+/* Turn on/off sharing for the domid, depending on the enable flag.
+ *
+ * Returns EXDEV if trying to enable and the domain has had a PCI device
+ * assigned for passthrough (these two features are mutually exclusive).
+ *
+ * When sharing for a domain is turned off, the domain may still reference
+ * shared pages. Unsharing happens lazily. */
int xc_memshr_control(xc_interface *xch,
domid_t domid,
int enable);
+
+/* Create a communication ring in which the hypervisor will place ENOMEM
+ * notifications.
+ *
+ * ENOMEM happens when unsharing pages: a Copy-on-Write duplicate needs to be
+ * allocated, and thus the out-of-memory error occurr.
+ *
+ * For complete examples on how to plumb a notification ring, look into
+ * xenpaging or xen-access.
+ *
+ * On receipt of a notification, the helper should ensure there is memory
+ * available to the domain before retrying.
+ *
+ * If a domain encounters an ENOMEM condition when sharing and this ring
+ * has not been set up, the hypervisor will crash the domain.
+ *
+ * Fails with:
+ * EINVAL if port is NULL
+ * EINVAL if the sharing ring has already been enabled
+ * ENOSYS if no guest gfn has been specified to host the ring via an hvm param
+ * EINVAL if the gfn for the ring has not been populated
+ * ENOENT if the gfn for the ring is paged out, or cannot be unshared
+ * EINVAL if the gfn for the ring cannot be written to
+ * EINVAL if the domain is dying
+ * ENOSPC if an event channel cannot be allocated for the ring
+ * ENOMEM if memory cannot be allocated for internal data structures
+ * EINVAL or EACCESS if the request is denied by the security policy
+ */
+
int xc_memshr_ring_enable(xc_interface *xch,
domid_t domid,
uint32_t *port);
+/* Disable the ring for ENOMEM communication.
+ * May fail with EINVAL if the ring was not enabled in the first place.
+ */
int xc_memshr_ring_disable(xc_interface *xch,
domid_t domid);
+
+/*
+ * Calls below return EINVAL if sharing has not been enabled for the domain
+ * Calls below return EINVAL if the domain is dying
+ */
+/* Once a reponse to an ENOMEM notification is prepared, the tool can
+ * notify the hypervisor to re-schedule the faulting vcpu of the domain with an
+ * event channel kick and/or this call. */
+int xc_memshr_domain_resume(xc_interface *xch,
+ domid_t domid);
+
+/* Select a page for sharing.
+ *
+ * A 64 bit opaque handle will be stored in handle. The hypervisor ensures
+ * that if the page is modified, the handle will be invalidated, and future
+ * users of it will fail. If the page has already been selected and is still
+ * associated to a valid handle, the existing handle will be returned.
+ *
+ * May fail with:
+ * EINVAL if the gfn is not populated or not sharable (mmio, etc)
+ * ENOMEM if internal data structures cannot be allocated
+ * E2BIG if the page is being referenced by other subsytems (e.g. qemu)
+ * ENOENT or EEXIST if there are internal hypervisor errors.
+ */
int xc_memshr_nominate_gfn(xc_interface *xch,
domid_t domid,
unsigned long gfn,
uint64_t *handle);
+/* Same as above, but instead of a guest frame number, the input is a grant
+ * reference provided by the guest.
+ *
+ * May fail with EINVAL if the grant reference is invalid.
+ */
int xc_memshr_nominate_gref(xc_interface *xch,
domid_t domid,
grant_ref_t gref,
uint64_t *handle);
+
+/* The three calls below may fail with
+ * 10 (or -XENMEM_SHARING_OP_S_HANDLE_INVALID) if the handle passed as source
+ * is invalid.
+ * 9 (or -XENMEM_SHARING_OP_C_HANDLE_INVALID) if the handle passed as client is
+ * invalid.
+ */
+/* Share two nominated guest pages.
+ *
+ * If the call succeeds, both pages will point to the same backing frame (or
+ * mfn). The hypervisor will verify the handles are still valid, but it will
+ * not perform any sanity checking on the contens of the pages (the selection
+ * mechanism for sharing candidates is entirely up to the user-space tool).
+ *
+ * After successful sharing, the client handle becomes invalid. Both <domain,
+ * gfn> tuples point to the same mfn with the same handle, the one specified as
+ * source. Either 3-tuple can be specified later for further re-sharing.
+ */
int xc_memshr_share_gfns(xc_interface *xch,
domid_t source_domain,
unsigned long source_gfn,
@@ -1942,6 +2034,11 @@ int xc_memshr_share_gfns(xc_interface *x
domid_t client_domain,
unsigned long client_gfn,
uint64_t client_handle);
+
+/* Same as above, but share two grant references instead.
+ *
+ * May fail with EINVAL if either grant reference is invalid.
+ */
int xc_memshr_share_grefs(xc_interface *xch,
domid_t source_domain,
grant_ref_t source_gref,
@@ -1949,22 +2046,82 @@ int xc_memshr_share_grefs(xc_interface *
domid_t client_domain,
grant_ref_t client_gref,
uint64_t client_handle);
+
+/* Allows to add to the guest physmap of the client domain a shared frame
+ * directly.
+ *
+ * May additionally fail with
+ * 9 (-XENMEM_SHARING_OP_C_HANDLE_INVALID) if the physmap entry for the gfn is
+ * not suitable.
+ * ENOMEM if internal data structures cannot be allocated.
+ * ENOENT if there is an internal hypervisor error.
+ */
int xc_memshr_add_to_physmap(xc_interface *xch,
domid_t source_domain,
unsigned long source_gfn,
uint64_t source_handle,
domid_t client_domain,
unsigned long client_gfn);
-int xc_memshr_domain_resume(xc_interface *xch,
- domid_t domid);
+
+/* Debug calls: return the number of pages referencing the shared frame backing
+ * the input argument. Should be one or greater.
+ *
+ * May fail with EINVAL if there is no backing shared frame for the input
+ * argument.
+ */
int xc_memshr_debug_gfn(xc_interface *xch,
domid_t domid,
unsigned long gfn);
+/* May additionally fail with EINVAL if the grant reference is invalid. */
int xc_memshr_debug_gref(xc_interface *xch,
domid_t domid,
grant_ref_t gref);
+
+/* Audits the share subsystem.
+ *
+ * Returns ENOSYS if not supported (may not be compiled into the hypervisor).
+ *
+ * Returns the number of errors found during auditing otherwise. May be (should
+ * be!) zero.
+ *
+ * If debugtrace support has been compiled into the hypervisor and is enabled,
+ * verbose descriptions for the errors are available in the hypervisor console.
+ */
int xc_memshr_audit(xc_interface *xch);
+/* Stats reporting.
+ *
+ * At any point in time, the following equality should hold for a host:
+ *
+ * Let dominfo(d) be the xc_dominfo_t struct filled by a call to
+ * xc_domain_getinfo(d)
+ *
+ * The summation of dominfo(d)->shr_pages for all domains in the system
+ * should be equal to
+ * xc_sharing_freed_pages + xc_sharing_used_frames
+ */
+/*
+ * This function returns the total number of pages freed by using sharing
+ * on the system. For example, if two domains contain a single entry in
+ * their p2m table that points to the same shared page (and no other pages
+ * in the system are shared), then this function should return 1.
+ */
+long xc_sharing_freed_pages(xc_interface *xch);
+
+/*
+ * This function returns the total number of frames occupied by shared
+ * pages on the system. This is independent of the number of domains
+ * pointing at these frames. For example, in the above scenario this
+ * should return 1. (And dominfo(d) for each of the two domains should return 1
+ * as well).
+ *
+ * Note that some of these sharing_used_frames may be referenced by
+ * a single domain page, and thus not realize any savings. The same
+ * applies to some of the pages counted in dominfo(d)->shr_pages.
+ */
+long xc_sharing_used_frames(xc_interface *xch);
+/*** End sharing interface ***/
+
int xc_flask_load(xc_interface *xc_handle, char *buf, uint32_t size);
int xc_flask_context_to_sid(xc_interface *xc_handle, char *buf, uint32_t size, uint32_t *sid);
int xc_flask_sid_to_context(xc_interface *xc_handle, int sid, char *buf, uint32_t size);
next prev parent reply other threads:[~2012-04-02 19:11 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-04-02 19:11 [PATCH 0 of 2] Sharing Documentation Andres Lagar-Cavilla
2012-04-02 19:11 ` [PATCH 1 of 2] x86/mem_sharing: Clean up debugging calls Andres Lagar-Cavilla
2012-04-02 19:11 ` Andres Lagar-Cavilla [this message]
2012-04-05 10:15 ` [PATCH 0 of 2] Sharing Documentation Tim Deegan
2012-04-12 14:18 ` Andres Lagar-Cavilla
2012-04-23 17:06 ` Andres Lagar-Cavilla
2012-04-24 18:08 ` Ian Jackson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3f6b49097dced570f2de.1333393864@xdev.gridcentric.ca \
--to=andres@lagarcavilla.org \
--cc=adin@gridcentric.ca \
--cc=andres@gridcentric.ca \
--cc=ian.campbell@citrix.com \
--cc=ian.jackson@citrix.com \
--cc=tim@xen.org \
--cc=xen-devel@lists.xen.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.