xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Andres Lagar-Cavilla <andres@lagarcavilla.org>
To: xen-devel@lists.xensource.com
Cc: ian.campbell@citrix.com, andres@gridcentric.ca, tim@xen.org,
	keir.xen@gmail.com, JBeulich@suse.com, ian.jackson@citrix.com,
	adin@gridcentric.ca
Subject: [PATCH 09 of 12] Update memshr API and tools
Date: Sun, 15 Jan 2012 21:56:29 -0500	[thread overview]
Message-ID: <2d6a14ce5b0eda6cc26c.1326682589@xdev.gridcentric.ca> (raw)
In-Reply-To: <patchbomb.1326682580@xdev.gridcentric.ca>

 tools/blktap2/drivers/tapdisk-vbd.c |   6 +-
 tools/blktap2/drivers/tapdisk.h     |   2 +-
 tools/libxc/xc_memshr.c             |  63 ++++++++++++++++++++++++++++++++++--
 tools/libxc/xenctrl.h               |  19 ++++++++++-
 tools/memshr/bidir-daemon.c         |  34 ++++++++++++++-----
 tools/memshr/bidir-hash.h           |  12 ++++--
 tools/memshr/interface.c            |  30 ++++++++++-------
 tools/memshr/memshr.h               |  11 +++++-
 8 files changed, 139 insertions(+), 38 deletions(-)


This patch is the folded version of API updates, along with the associated tool
changes to ensure that the build is always consistent.

API updates:
- The source domain in the sharing calls is no longer assumed to be dom0.
- Previously, the mem sharing code would return an opaque handle to index
  shared pages (and nominees) in its global hash table.  By removing the hash
  table, the handle becomes a version, to avoid sharing a stale version of a
  page. Thus, libxc wrappers and tools need to be updated to recall the share
  functions with the information needed to fetch the page (which they readily
  have).

Tool updates:
The only (in-tree, that we know of) consumer of the mem sharing API is the
memshr tool. This is updated to use the new API.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Signed-off-by: Adin Scannell <adin@scannell.ca>

diff -r 288ac8cfea7f -r 2d6a14ce5b0e tools/blktap2/drivers/tapdisk-vbd.c
--- a/tools/blktap2/drivers/tapdisk-vbd.c
+++ b/tools/blktap2/drivers/tapdisk-vbd.c
@@ -1218,14 +1218,14 @@ __tapdisk_vbd_complete_td_request(td_vbd
 #ifdef MEMSHR
 		if (treq.op == TD_OP_READ
 		   && td_flag_test(image->flags, TD_OPEN_RDONLY)) {
-			uint64_t hnd  = treq.memshr_hnd;
+			share_tuple_t hnd = treq.memshr_hnd;
 			uint16_t uid  = image->memshr_id;
 			blkif_request_t *breq = &vreq->req;
 			uint64_t sec  = tapdisk_vbd_breq_get_sector(breq, treq);
 			int secs = breq->seg[treq.sidx].last_sect -
 			    breq->seg[treq.sidx].first_sect + 1;
 
-			if (hnd != 0)
+			if (hnd.handle != 0)
 				memshr_vbd_complete_ro_request(hnd, uid,
 								sec, secs);
 		}
@@ -1297,7 +1297,7 @@ __tapdisk_vbd_reissue_td_request(td_vbd_
 				/* Reset memshr handle. This'll prevent
 				 * memshr_vbd_complete_ro_request being called
 				 */
-				treq.memshr_hnd = 0;
+				treq.memshr_hnd.handle = 0;
 				td_complete_request(treq, 0);
 			} else
 				td_queue_read(parent, treq);
diff -r 288ac8cfea7f -r 2d6a14ce5b0e tools/blktap2/drivers/tapdisk.h
--- a/tools/blktap2/drivers/tapdisk.h
+++ b/tools/blktap2/drivers/tapdisk.h
@@ -140,7 +140,7 @@ struct td_request {
 	void                        *private;
     
 #ifdef MEMSHR
-	uint64_t                     memshr_hnd;
+	share_tuple_t                memshr_hnd;
 #endif
 };
 
diff -r 288ac8cfea7f -r 2d6a14ce5b0e tools/libxc/xc_memshr.c
--- a/tools/libxc/xc_memshr.c
+++ b/tools/libxc/xc_memshr.c
@@ -86,24 +86,79 @@ int xc_memshr_nominate_gref(xc_interface
     return ret;
 }
 
-int xc_memshr_share(xc_interface *xch,
-                    uint64_t source_handle,
-                    uint64_t client_handle)
+int xc_memshr_share_gfns(xc_interface *xch,
+                         domid_t source_domain,
+                         unsigned long source_gfn,
+                         uint64_t source_handle,
+                         domid_t client_domain,
+                         unsigned long client_gfn,
+                         uint64_t client_handle)
 {
     DECLARE_DOMCTL;
     struct xen_domctl_mem_sharing_op *op;
 
     domctl.cmd = XEN_DOMCTL_mem_sharing_op;
     domctl.interface_version = XEN_DOMCTL_INTERFACE_VERSION;
-    domctl.domain = 0;
+    domctl.domain = source_domain;
     op = &(domctl.u.mem_sharing_op);
     op->op = XEN_DOMCTL_MEM_EVENT_OP_SHARING_SHARE;
     op->u.share.source_handle = source_handle;
+    op->u.share.source_gfn    = source_gfn;
+    op->u.share.client_domain = client_domain;
+    op->u.share.client_gfn    = client_gfn;
     op->u.share.client_handle = client_handle;
 
     return do_domctl(xch, &domctl);
 }
 
+int xc_memshr_share_grefs(xc_interface *xch,
+                          domid_t source_domain,
+                          grant_ref_t source_gref,
+                          uint64_t source_handle,
+                          domid_t client_domain,
+                          grant_ref_t client_gref,
+                          uint64_t client_handle)
+{
+    DECLARE_DOMCTL;
+    struct xen_domctl_mem_sharing_op *op;
+
+    domctl.cmd = XEN_DOMCTL_mem_sharing_op;
+    domctl.interface_version = XEN_DOMCTL_INTERFACE_VERSION;
+    domctl.domain = source_domain;
+    op = &(domctl.u.mem_sharing_op);
+    op->op = XEN_DOMCTL_MEM_EVENT_OP_SHARING_SHARE;
+    op->u.share.source_handle = source_handle;
+    XEN_DOMCTL_MEM_SHARING_FIELD_MAKE_GREF(op->u.share.source_gfn, source_gref);
+    op->u.share.client_domain = client_domain;
+    XEN_DOMCTL_MEM_SHARING_FIELD_MAKE_GREF(op->u.share.client_gfn, client_gref);
+    op->u.share.client_handle = client_handle;
+
+    return do_domctl(xch, &domctl);
+}
+
+int xc_memshr_add_to_physmap(xc_interface *xch,
+                    domid_t source_domain,
+                    unsigned long source_gfn,
+                    uint64_t source_handle,
+                    domid_t client_domain,
+                    unsigned long client_gfn)
+{
+    DECLARE_DOMCTL;
+    struct xen_domctl_mem_sharing_op *op;
+
+    domctl.cmd                  = XEN_DOMCTL_mem_sharing_op;
+    domctl.interface_version    = XEN_DOMCTL_INTERFACE_VERSION;
+    domctl.domain               = source_domain;
+    op = &(domctl.u.mem_sharing_op);
+    op->op = XEN_DOMCTL_MEM_EVENT_OP_SHARING_ADD_PHYSMAP;
+    op->u.share.source_gfn      = source_gfn;
+    op->u.share.source_handle   = source_handle;
+    op->u.share.client_gfn      = client_gfn;
+    op->u.share.client_domain   = client_domain;
+
+    return do_domctl(xch, &domctl);
+}
+
 int xc_memshr_domain_resume(xc_interface *xch,
                             domid_t domid)
 {
diff -r 288ac8cfea7f -r 2d6a14ce5b0e tools/libxc/xenctrl.h
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -1894,9 +1894,26 @@ int xc_memshr_nominate_gref(xc_interface
                             domid_t domid,
                             grant_ref_t gref,
                             uint64_t *handle);
-int xc_memshr_share(xc_interface *xch,
+int xc_memshr_share_gfns(xc_interface *xch,
+                    domid_t source_domain,
+                    unsigned long source_gfn,
                     uint64_t source_handle,
+                    domid_t client_domain,
+                    unsigned long client_gfn,
                     uint64_t client_handle);
+int xc_memshr_share_grefs(xc_interface *xch,
+                    domid_t source_domain,
+                    grant_ref_t source_gref,
+                    uint64_t source_handle,
+                    domid_t client_domain,
+                    grant_ref_t client_gref,
+                    uint64_t client_handle);
+int xc_memshr_add_to_physmap(xc_interface *xch,
+                    domid_t source_domain,
+                    unsigned long source_gfn,
+                    uint64_t source_handle,
+                    domid_t client_domain,
+                    unsigned long client_gfn);
 int xc_memshr_domain_resume(xc_interface *xch,
                             domid_t domid);
 int xc_memshr_debug_gfn(xc_interface *xch,
diff -r 288ac8cfea7f -r 2d6a14ce5b0e tools/memshr/bidir-daemon.c
--- a/tools/memshr/bidir-daemon.c
+++ b/tools/memshr/bidir-daemon.c
@@ -19,16 +19,25 @@
 #include <pthread.h>
 #include <inttypes.h>
 #include <unistd.h>
+#include <errno.h>
 
 #include "bidir-hash.h"
 #include "memshr-priv.h"
 
 static struct blockshr_hash *blks_hash;
 
+/* Callback in the iterator, remember this value, and leave */
+int find_one(vbdblk_t k, share_tuple_t v, void *priv)
+{
+    share_tuple_t *rv = (share_tuple_t *) priv;
+    *rv = v;
+    /* Break out of iterator loop */
+    return 1;
+}
+
 void* bidir_daemon(void *unused)
 {
     uint32_t nr_ent, max_nr_ent, tab_size, max_load, min_load;
-    static uint64_t shrhnd = 1;
 
     while(1)
     {
@@ -41,20 +50,30 @@ void* bidir_daemon(void *unused)
         /* Remove some hints as soon as we get to 90% capacity */ 
         if(10 * nr_ent > 9 * max_nr_ent)
         {
-            uint64_t next_remove = shrhnd;
+            share_tuple_t next_remove;
             int to_remove;
             int ret;
 
             to_remove = 0.1 * max_nr_ent; 
             while(to_remove > 0) 
             {
-                ret = blockshr_shrhnd_remove(blks_hash, next_remove, NULL);
-                if(ret < 0)
+                /* We use the iterator to get one entry */
+                next_remove.handle = 0;
+                ret = blockshr_hash_iterator(blks_hash, find_one, &next_remove);
+
+                if ( !ret )
+                    if ( next_remove.handle == 0 )
+                        ret = -ESRCH;
+
+                if ( !ret )
+                    ret = blockshr_shrhnd_remove(blks_hash, next_remove, NULL);
+
+                if(ret <= 0)
                 {
                     /* We failed to remove an entry, because of a serious hash
                      * table error */
                     DPRINTF("Could not remove handle %"PRId64", error: %d\n",
-                            next_remove, ret);
+                            next_remove.handle, ret);
                     /* Force to exit the loop early */
                     to_remove = 0;
                 } else 
@@ -62,12 +81,7 @@ void* bidir_daemon(void *unused)
                 {
                     /* Managed to remove the entry. Note next_remove not
                      * incremented, in case there are duplicates */
-                    shrhnd = next_remove;
                     to_remove--;
-                } else
-                {
-                    /* Failed to remove, because there is no such handle */
-                    next_remove++;
                 }
             }
         }
diff -r 288ac8cfea7f -r 2d6a14ce5b0e tools/memshr/bidir-hash.h
--- a/tools/memshr/bidir-hash.h
+++ b/tools/memshr/bidir-hash.h
@@ -20,6 +20,7 @@
 #define __BIDIR_HASH_H__
 
 #include <stdint.h>
+#include <string.h>
 #include "memshr-priv.h"
 
 typedef struct vbdblk {
@@ -81,15 +82,16 @@ static int fgprtshr_mfn_cmp(uint32_t m1,
 #undef BIDIR_VALUE
 #undef BIDIR_KEY_T
 #undef BIDIR_VALUE_T
+
 /* TODO better hashes! */
 static inline uint32_t blockshr_block_hash(vbdblk_t block)
 {
     return (uint32_t)(block.sec) ^ (uint32_t)(block.disk_id);
 }
 
-static inline uint32_t blockshr_shrhnd_hash(uint64_t shrhnd)
+static inline uint32_t blockshr_shrhnd_hash(share_tuple_t shrhnd)
 {
-    return (uint32_t)shrhnd;
+    return ((uint32_t) shrhnd.handle);
 }
 
 static inline int blockshr_block_cmp(vbdblk_t b1, vbdblk_t b2)
@@ -97,15 +99,15 @@ static inline int blockshr_block_cmp(vbd
     return (b1.sec == b2.sec) && (b1.disk_id == b2.disk_id);
 }
 
-static inline int blockshr_shrhnd_cmp(uint64_t h1, uint64_t h2)
+static inline int blockshr_shrhnd_cmp(share_tuple_t h1, share_tuple_t h2)
 {
-    return (h1 == h2);
+    return ( !memcmp(&h1, &h2, sizeof(share_tuple_t)) );
 }
 #define BIDIR_NAME_PREFIX       blockshr
 #define BIDIR_KEY               block
 #define BIDIR_VALUE             shrhnd
 #define BIDIR_KEY_T             vbdblk_t
-#define BIDIR_VALUE_T           uint64_t
+#define BIDIR_VALUE_T           share_tuple_t
 #include "bidir-namedefs.h"
 
 #endif /* BLOCK_MAP */
diff -r 288ac8cfea7f -r 2d6a14ce5b0e tools/memshr/interface.c
--- a/tools/memshr/interface.c
+++ b/tools/memshr/interface.c
@@ -145,16 +145,17 @@ void memshr_vbd_image_put(uint16_t memsh
     
 int memshr_vbd_issue_ro_request(char *buf,
                                 grant_ref_t gref,
-                                uint16_t file_id, 
+                                uint16_t file_id,
                                 uint64_t sec, 
                                 int secs,
-                                uint64_t *hnd)
+                                share_tuple_t *hnd)
 {
     vbdblk_t blk;
-    uint64_t s_hnd, c_hnd;
+    share_tuple_t source_st, client_st;
+    uint64_t c_hnd;
     int ret;
 
-    *hnd = 0;
+    *hnd = (share_tuple_t){ 0, 0, 0 };
     if(!vbd_info.enabled) 
         return -1;
 
@@ -169,26 +170,31 @@ int memshr_vbd_issue_ro_request(char *bu
     /* If page couldn't be made sharable, we cannot do anything about it */
     if(ret != 0)
         return -3;
-    *hnd = c_hnd;
+
+    client_st = (share_tuple_t){ vbd_info.domid, gref, c_hnd };
+    *hnd = client_st;
 
     /* Check if we've read matching disk block previously */
     blk.sec     = sec;
     blk.disk_id = file_id;
-    if(blockshr_block_lookup(memshr.blks, blk, &s_hnd) > 0)
+    if(blockshr_block_lookup(memshr.blks, blk, &source_st) > 0)
     {
-        ret = xc_memshr_share(vbd_info.xc_handle, s_hnd, c_hnd);
+        ret = xc_memshr_share_grefs(vbd_info.xc_handle, source_st.domain, source_st.frame, 
+                                    source_st.handle, vbd_info.domid, gref, c_hnd);
         if(!ret) return 0;
         /* Handles failed to be shared => at least one of them must be invalid,
            remove the relevant ones from the map */
         switch(ret)
         {
             case XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID:
-                ret = blockshr_shrhnd_remove(memshr.blks, s_hnd, NULL);
-                if(ret) DPRINTF("Could not rm invl s_hnd: %"PRId64"\n", s_hnd);
+                ret = blockshr_shrhnd_remove(memshr.blks, source_st, NULL);
+                if(ret) DPRINTF("Could not rm invl s_hnd: %u %"PRId64" %"PRId64"\n", 
+                                    source_st.domain, source_st.frame, source_st.handle);
                 break;
             case XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID:
-                ret = blockshr_shrhnd_remove(memshr.blks, c_hnd, NULL);
-                if(ret) DPRINTF("Could not rm invl c_hnd: %"PRId64"\n", c_hnd);
+                ret = blockshr_shrhnd_remove(memshr.blks, client_st, NULL);
+                if(ret) DPRINTF("Could not rm invl c_hnd: %u %"PRId64" %"PRId64"\n", 
+                                    client_st.domain, client_st.frame, client_st.handle);
                 break;
             default:
                 break;
@@ -199,7 +205,7 @@ int memshr_vbd_issue_ro_request(char *bu
     return -4;
 }
 
-void memshr_vbd_complete_ro_request(uint64_t hnd,
+void memshr_vbd_complete_ro_request(share_tuple_t hnd,
                                     uint16_t file_id, 
                                     uint64_t sec, 
                                     int secs)
diff -r 288ac8cfea7f -r 2d6a14ce5b0e tools/memshr/memshr.h
--- a/tools/memshr/memshr.h
+++ b/tools/memshr/memshr.h
@@ -25,6 +25,13 @@
 
 typedef uint64_t xen_mfn_t;
 
+typedef struct share_tuple 
+{
+    uint32_t domain;
+    uint64_t frame;
+    uint64_t handle;
+} share_tuple_t;
+
 extern void memshr_set_domid(int domid);
 extern void memshr_daemon_initialize(void);
 extern void memshr_vbd_initialize(void);
@@ -35,9 +42,9 @@ extern int memshr_vbd_issue_ro_request(c
                                        uint16_t file_id, 
                                        uint64_t sec, 
                                        int secs,
-                                       uint64_t *hnd);
+                                       share_tuple_t *hnd);
 extern void memshr_vbd_complete_ro_request(
-                                       uint64_t hnd,
+                                       share_tuple_t hnd,
                                        uint16_t file_id, 
                                        uint64_t sec, 
                                        int secs);

  parent reply	other threads:[~2012-01-16  2:56 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-16  2:56 [PATCH 00 of 12] Sharing overhaul Andres Lagar-Cavilla
2012-01-16  2:56 ` [PATCH 01 of 12] x86/mm: Eliminate hash table in sharing code as index of shared mfns Andres Lagar-Cavilla
2012-01-19 11:39   ` Tim Deegan
2012-01-19 13:12     ` Andres Lagar-Cavilla
2012-01-16  2:56 ` [PATCH 02 of 12] x86/mm: Update mem sharing interface to (re)allow sharing of grants Andres Lagar-Cavilla
2012-01-19 11:56   ` Tim Deegan
2012-01-16  2:56 ` [PATCH 03 of 12] x86/mm: Add per-page locking for memory sharing, when audits are disabled Andres Lagar-Cavilla
2012-01-19 12:13   ` Tim Deegan
2012-01-19 13:02   ` Tim Deegan
2012-01-19 13:06     ` Andres Lagar-Cavilla
2012-01-19 13:13       ` Tim Deegan
2012-01-16  2:56 ` [PATCH 04 of 12] x86/mm: Enforce lock ordering for sharing page locks Andres Lagar-Cavilla
2012-01-19 12:18   ` Tim Deegan
2012-01-16  2:56 ` [PATCH 05 of 12] x86/mm: Check how many mfns are shared, in addition to how many are saved Andres Lagar-Cavilla
2012-01-16  2:56 ` [PATCH 06 of 12] x86/mm: New domctl: add a shared page to the physmap Andres Lagar-Cavilla
2012-01-19 13:04   ` Tim Deegan
2012-01-19 13:09     ` Andres Lagar-Cavilla
2012-01-16  2:56 ` [PATCH 07 of 12] Add the ability to poll stats about shared memory via the console Andres Lagar-Cavilla
2012-01-19 12:53   ` Tim Deegan
2012-01-16  2:56 ` [PATCH 08 of 12] x86/mm: use RCU in mem sharing audit list, eliminate global lock completely Andres Lagar-Cavilla
2012-01-19 12:59   ` Tim Deegan
2012-01-19 13:03     ` Andres Lagar-Cavilla
2012-01-19 13:14       ` Tim Deegan
2012-01-16  2:56 ` Andres Lagar-Cavilla [this message]
2012-01-23 15:14   ` [PATCH 09 of 12] Update memshr API and tools Ian Campbell
2012-01-16  2:56 ` [PATCH 10 of 12] Tools: Expose to libxc the total number of shared frames and space saved Andres Lagar-Cavilla
2012-01-16  2:56 ` [PATCH 11 of 12] Tools: Add a sharing command to xl for information about shared pages Andres Lagar-Cavilla
2012-01-19 12:14   ` Ian Campbell
2012-01-16  2:56 ` [PATCH 12 of 12] Memshrtool: tool to test and exercise the sharing subsystem Andres Lagar-Cavilla

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2d6a14ce5b0eda6cc26c.1326682589@xdev.gridcentric.ca \
    --to=andres@lagarcavilla.org \
    --cc=JBeulich@suse.com \
    --cc=adin@gridcentric.ca \
    --cc=andres@gridcentric.ca \
    --cc=ian.campbell@citrix.com \
    --cc=ian.jackson@citrix.com \
    --cc=keir.xen@gmail.com \
    --cc=tim@xen.org \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).