xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] Tmem bug-fixes and cleanups.
@ 2015-08-27 11:01 Konrad Rzeszutek Wilk
  2015-08-27 11:01 ` [PATCH v2 1/8] tmem: Don't crash/hang/leak hypervisor when using shared pools within an guest Konrad Rzeszutek Wilk
                   ` (9 more replies)
  0 siblings, 10 replies; 32+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-08-27 11:01 UTC (permalink / raw)
  To: xen-devel, jbeulich, andrew.cooper3, wei.liu2

Hey!

At the Xenhackathon we spoke that the tmem code needs a bit of cleanups
and simplification. One of the things that Andrew mentioned was that the
TMEM_CONTROL should really be part of the sysctl hypercall. As I ventured
this path I realized there were some other issues that need to be taken
care of (like shared pools blowing up).

This patchset has been tested with 32/64 guests, with a 64 hypervisor
and 32 bit toolstack (and also with 64bit toolstack) with success.

For fun I've also created an Linux module:
http://xenbits.xen.org/gitweb/?p=xentesttools/bootstrap.git;a=blob;f=root_image/drivers/tmem_test/tmem_test.c
that I will expand to cover in the future more interesting hypercall
uses.

Going forward the next step will be to move the 'tmem_control' function to
its own file to simplify the code.

The patches are also in my git tree:

git://xenbits.xen.org/people/konradwilk/xen.git for-4.6/tmem.cleanups.v2

 tools/libxc/include/xenctrl.h          |   2 +-
 tools/libxc/xc_tmem.c                  | 111 ++++++++++----------
 tools/libxl/libxl.c                    |  22 ++--
 tools/python/xen/lowlevel/xc/xc.c      |  27 +++--
 tools/xenstat/libxenstat/src/xenstat.c |   6 +-
 xen/common/sysctl.c                    |   7 +-
 xen/common/tmem.c                      | 183 +++++++++++++++++----------------
 xen/include/public/sysctl.h            |  44 ++++++++
 xen/include/public/tmem.h              |  39 +------
 xen/include/xen/tmem.h                 |   3 +
 xen/include/xen/tmem_xen.h             |   1 -
 xen/include/xsm/dummy.h                |   6 --
 xen/include/xsm/xsm.h                  |   6 --
 xen/xsm/dummy.c                        |   1 -
 xen/xsm/flask/hooks.c                  |   9 +-
 xen/xsm/flask/policy/access_vectors    |   2 +-
 16 files changed, 237 insertions(+), 232 deletions(-)
Konrad Rzeszutek Wilk (8):
      tmem: Don't crash/hang/leak hypervisor when using shared pools within an guest.
      tmem: Add ASSERT in obj_rb_insert for pool->rwlock lock.
      tmem: remove in xc_tmem_control_oid duplicate set_xen_guest_handle call
      tmem: Remove xc_tmem_control mystical arg3
      tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl.
      tmem: Remove the old tmem control XSM checks as it is part of sysctl hypercall.
      tmem: Remove extra spaces at end and some hard tabbing.
      tmem: Spelling mistakes.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v2 1/8] tmem: Don't crash/hang/leak hypervisor when using shared pools within an guest.
  2015-08-27 11:01 [PATCH v2] Tmem bug-fixes and cleanups Konrad Rzeszutek Wilk
@ 2015-08-27 11:01 ` Konrad Rzeszutek Wilk
  2015-08-27 11:21   ` Wei Liu
  2015-08-27 11:01 ` [PATCH v2 2/8] tmem: Add ASSERT in obj_rb_insert for pool->rwlock lock Konrad Rzeszutek Wilk
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 32+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-08-27 11:01 UTC (permalink / raw)
  To: xen-devel, jbeulich, andrew.cooper3, wei.liu2; +Cc: Konrad Rzeszutek Wilk

When we are using shared pools we have an global array
(on which we put the pool), and an array of pools per domain.
We also have an shared list of clients (guests) _except_
for the very first domain that created the shared pool.

To deal with multiple guests using an shared pool we have an
ref count and a linked list. Whenever an new user of
a the shared pool joins we increase the ref count and add to
the linked list. Whenever an user quits the shared pool
we decrement and remove from the linked list.

Unfortunately this ref counting and linked list never
worked properly. There are multiple issues:

 1) If we have one shared pool (and only one guest creating it)
    - we do not add it to the shared list of clients. Which
    means the logic in 'shared_pool_quit' never removed
    the pool from global_shared_pools. That meant when the pool
    was de-allocated - we still had an pointer to the pool
    which would be accessed by tmemc_list_client (xl tmem-list -a)
    and hit a NULL page!

 2). If we have two shared pools in a domain - it (shared_pool_quit)
     would remove the domain from the share_list linked list, decrements
     the refcount to zero - and remove the pool from the global shared pool.
     When done it would also clear the client->pools[] to NULL for itself.
     Which is good. However since there are two shared pools in the domain
     the next entry in the client->pools[] would have a stale pointer to
     the just de-allocated pool. Accessing it and trying to de-allocate it
     would lead to crashes or hypervisor hang - depending on the build.

Fun times!

To trigger this use
http://xenbits.xen.org/gitweb/?p=xentesttools/bootstrap.git;a=blob;f=root_image/drivers/tmem_test/tmem_test.c

This patch fixes it by making the very first domain that created
an shared pool to follow the same logic as every domain that is
joining a shared pool. That is increment the refcount and also
add itself to the shared list of domains using it.

We also remove an ASSERT that incorrectly assumed
that only one shared pool would exist for a domain.

And to mirror the reporting logic in shared_pool_join
we also add a printk to advertise inter-domain shared pool
joining.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 xen/common/tmem.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/xen/common/tmem.c b/xen/common/tmem.c
index f2dc26e..572944e 100644
--- a/xen/common/tmem.c
+++ b/xen/common/tmem.c
@@ -1037,6 +1037,9 @@ static int shared_pool_join(struct tmem_pool *pool, struct client *new_client)
         tmem_client_info("adding new %s %d to shared pool owned by %s %d\n",
                     tmem_client_str, new_client->cli_id, tmem_client_str,
                     pool->client->cli_id);
+    else if (pool->shared_count)
+        tmem_client_info("inter-guest sharing of shared pool %s by client %d\n",
+                         tmem_client_str, pool->client->cli_id);
     ++pool->shared_count;
     return 0;
 }
@@ -1056,7 +1059,10 @@ static void shared_pool_reassign(struct tmem_pool *pool)
     }
     old_client->pools[pool->pool_id] = NULL;
     sl = list_entry(pool->share_list.next, struct share_list, share_list);
-    ASSERT(sl->client != old_client);
+    /*
+     * The sl->client can be old_client if there are multiple shared pools
+     * within an guest.
+     */
     pool->client = new_client = sl->client;
     for (poolid = 0; poolid < MAX_POOLS_PER_DOMAIN; poolid++)
         if (new_client->pools[poolid] == pool)
@@ -1982,6 +1988,8 @@ static int do_tmem_new_pool(domid_t this_cli_id,
         {
             INIT_LIST_HEAD(&pool->share_list);
             pool->shared_count = 0;
+            if (shared_pool_join(pool, client))
+                goto fail;
             global_shared_pools[first_unused_s_poolid] = pool;
         }
     }
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v2 2/8] tmem: Add ASSERT in obj_rb_insert for pool->rwlock lock.
  2015-08-27 11:01 [PATCH v2] Tmem bug-fixes and cleanups Konrad Rzeszutek Wilk
  2015-08-27 11:01 ` [PATCH v2 1/8] tmem: Don't crash/hang/leak hypervisor when using shared pools within an guest Konrad Rzeszutek Wilk
@ 2015-08-27 11:01 ` Konrad Rzeszutek Wilk
  2015-08-27 14:55   ` Jan Beulich
  2015-08-27 11:01 ` [PATCH v2 3/8] tmem: remove in xc_tmem_control_oid duplicate set_xen_guest_handle call Konrad Rzeszutek Wilk
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 32+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-08-27 11:01 UTC (permalink / raw)
  To: xen-devel, jbeulich, andrew.cooper3, wei.liu2; +Cc: Konrad Rzeszutek Wilk

Manipulating the obj-> structures requires us to hold the
pool->rwlock lock. Lets make that obvious in this function to
catch any errant users (none found, but we may in future).

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 xen/common/tmem.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/xen/common/tmem.c b/xen/common/tmem.c
index 572944e..567ccc5 100644
--- a/xen/common/tmem.c
+++ b/xen/common/tmem.c
@@ -915,6 +915,11 @@ static int obj_rb_insert(struct rb_root *root, struct tmem_object_root *obj)
 {
     struct rb_node **new, *parent = NULL;
     struct tmem_object_root *this;
+    struct tmem_pool *pool;
+
+    pool = obj->pool;
+    ASSERT(pool != NULL);
+    ASSERT_WRITELOCK(&pool->pool_rwlock);
 
     new = &(root->rb_node);
     while ( *new )
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v2 3/8] tmem: remove in xc_tmem_control_oid duplicate set_xen_guest_handle call
  2015-08-27 11:01 [PATCH v2] Tmem bug-fixes and cleanups Konrad Rzeszutek Wilk
  2015-08-27 11:01 ` [PATCH v2 1/8] tmem: Don't crash/hang/leak hypervisor when using shared pools within an guest Konrad Rzeszutek Wilk
  2015-08-27 11:01 ` [PATCH v2 2/8] tmem: Add ASSERT in obj_rb_insert for pool->rwlock lock Konrad Rzeszutek Wilk
@ 2015-08-27 11:01 ` Konrad Rzeszutek Wilk
  2015-08-27 11:01 ` [PATCH v2 4/8] tmem: Remove xc_tmem_control mystical arg3 Konrad Rzeszutek Wilk
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 32+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-08-27 11:01 UTC (permalink / raw)
  To: xen-devel, jbeulich, andrew.cooper3, wei.liu2; +Cc: Konrad Rzeszutek Wilk

We are doing another call to set_xen_guest_handle right
after the xc_hypercall_bounce_pre (the correct place to do it).

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 tools/libxc/xc_tmem.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tools/libxc/xc_tmem.c b/tools/libxc/xc_tmem.c
index 8352233..0d1bfb4 100644
--- a/tools/libxc/xc_tmem.c
+++ b/tools/libxc/xc_tmem.c
@@ -110,7 +110,6 @@ int xc_tmem_control_oid(xc_interface *xch,
     op.pool_id = pool_id;
     op.u.ctrl.subop = subop;
     op.u.ctrl.cli_id = cli_id;
-    set_xen_guest_handle(op.u.ctrl.buf,buf);
     op.u.ctrl.arg1 = arg1;
     op.u.ctrl.arg2 = arg2;
     op.u.ctrl.oid[0] = oid.oid[0];
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v2 4/8] tmem: Remove xc_tmem_control mystical arg3
  2015-08-27 11:01 [PATCH v2] Tmem bug-fixes and cleanups Konrad Rzeszutek Wilk
                   ` (2 preceding siblings ...)
  2015-08-27 11:01 ` [PATCH v2 3/8] tmem: remove in xc_tmem_control_oid duplicate set_xen_guest_handle call Konrad Rzeszutek Wilk
@ 2015-08-27 11:01 ` Konrad Rzeszutek Wilk
  2015-08-27 11:15   ` Wei Liu
  2015-08-27 12:53   ` Andrew Cooper
  2015-08-27 11:02 ` [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl Konrad Rzeszutek Wilk
                   ` (5 subsequent siblings)
  9 siblings, 2 replies; 32+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-08-27 11:01 UTC (permalink / raw)
  To: xen-devel, jbeulich, andrew.cooper3, wei.liu2; +Cc: Konrad Rzeszutek Wilk

It mentions it but it is never used. The hypercall interface
knows nothing of this sort of thing either. Lets just remove it.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 tools/libxc/include/xenctrl.h          |  2 +-
 tools/libxc/xc_tmem.c                  | 38 ++++++++++++++++------------------
 tools/libxl/libxl.c                    | 10 ++++-----
 tools/python/xen/lowlevel/xc/xc.c      |  7 +++----
 tools/xenstat/libxenstat/src/xenstat.c |  4 ++--
 5 files changed, 29 insertions(+), 32 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index de3c0ad..9e34769 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2304,7 +2304,7 @@ int xc_tmem_control_oid(xc_interface *xch, int32_t pool_id, uint32_t subop,
                         struct tmem_oid oid, void *buf);
 int xc_tmem_control(xc_interface *xch,
                     int32_t pool_id, uint32_t subop, uint32_t cli_id,
-                    uint32_t arg1, uint32_t arg2, uint64_t arg3, void *buf);
+                    uint32_t arg1, uint32_t arg2, void *buf);
 int xc_tmem_auth(xc_interface *xch, int cli_id, char *uuid_str, int arg1);
 int xc_tmem_save(xc_interface *xch, int dom, int live, int fd, int field_marker);
 int xc_tmem_save_extra(xc_interface *xch, int dom, int fd, int field_marker);
diff --git a/tools/libxc/xc_tmem.c b/tools/libxc/xc_tmem.c
index 0d1bfb4..467001b 100644
--- a/tools/libxc/xc_tmem.c
+++ b/tools/libxc/xc_tmem.c
@@ -51,7 +51,6 @@ int xc_tmem_control(xc_interface *xch,
                     uint32_t cli_id,
                     uint32_t arg1,
                     uint32_t arg2,
-                    uint64_t arg3,
                     void *buf)
 {
     tmem_op_t op;
@@ -64,7 +63,6 @@ int xc_tmem_control(xc_interface *xch,
     op.u.ctrl.cli_id = cli_id;
     op.u.ctrl.arg1 = arg1;
     op.u.ctrl.arg2 = arg2;
-    /* use xc_tmem_control_oid if arg3 is required */
     op.u.ctrl.oid[0] = 0;
     op.u.ctrl.oid[1] = 0;
     op.u.ctrl.oid[2] = 0;
@@ -220,28 +218,28 @@ int xc_tmem_save(xc_interface *xch,
     uint32_t minusone = -1;
     struct tmem_handle *h;
 
-    if ( xc_tmem_control(xch,0,TMEMC_SAVE_BEGIN,dom,live,0,0,NULL) <= 0 )
+    if ( xc_tmem_control(xch,0,TMEMC_SAVE_BEGIN,dom,live,0,NULL) <= 0 )
         return 0;
 
     if ( write_exact(io_fd, &marker, sizeof(marker)) )
         return -1;
-    version = xc_tmem_control(xch,0,TMEMC_SAVE_GET_VERSION,0,0,0,0,NULL);
+    version = xc_tmem_control(xch,0,TMEMC_SAVE_GET_VERSION,0,0,0,NULL);
     if ( write_exact(io_fd, &version, sizeof(version)) )
         return -1;
-    max_pools = xc_tmem_control(xch,0,TMEMC_SAVE_GET_MAXPOOLS,0,0,0,0,NULL);
+    max_pools = xc_tmem_control(xch,0,TMEMC_SAVE_GET_MAXPOOLS,0,0,0,NULL);
     if ( write_exact(io_fd, &max_pools, sizeof(max_pools)) )
         return -1;
     if ( version == -1 || max_pools == -1 )
         return -1;
     if ( write_exact(io_fd, &minusone, sizeof(minusone)) )
         return -1;
-    flags = xc_tmem_control(xch,0,TMEMC_SAVE_GET_CLIENT_FLAGS,dom,0,0,0,NULL);
+    flags = xc_tmem_control(xch,0,TMEMC_SAVE_GET_CLIENT_FLAGS,dom,0,0,NULL);
     if ( write_exact(io_fd, &flags, sizeof(flags)) )
         return -1;
-    weight = xc_tmem_control(xch,0,TMEMC_SAVE_GET_CLIENT_WEIGHT,dom,0,0,0,NULL);
+    weight = xc_tmem_control(xch,0,TMEMC_SAVE_GET_CLIENT_WEIGHT,dom,0,0,NULL);
     if ( write_exact(io_fd, &weight, sizeof(weight)) )
         return -1;
-    cap = xc_tmem_control(xch,0,TMEMC_SAVE_GET_CLIENT_CAP,dom,0,0,0,NULL);
+    cap = xc_tmem_control(xch,0,TMEMC_SAVE_GET_CLIENT_CAP,dom,0,0,NULL);
     if ( write_exact(io_fd, &cap, sizeof(cap)) )
         return -1;
     if ( flags == -1 || weight == -1 || cap == -1 )
@@ -258,14 +256,14 @@ int xc_tmem_save(xc_interface *xch,
         int checksum = 0;
 
         /* get pool id, flags, pagesize, n_pages, uuid */
-        flags = xc_tmem_control(xch,i,TMEMC_SAVE_GET_POOL_FLAGS,dom,0,0,0,NULL);
+        flags = xc_tmem_control(xch,i,TMEMC_SAVE_GET_POOL_FLAGS,dom,0,0,NULL);
         if ( flags != -1 )
         {
             pool_id = i;
-            n_pages = xc_tmem_control(xch,i,TMEMC_SAVE_GET_POOL_NPAGES,dom,0,0,0,NULL);
+            n_pages = xc_tmem_control(xch,i,TMEMC_SAVE_GET_POOL_NPAGES,dom,0,0,NULL);
             if ( !(flags & TMEM_POOL_PERSIST) )
                 n_pages = 0;
-            (void)xc_tmem_control(xch,i,TMEMC_SAVE_GET_POOL_UUID,dom,sizeof(uuid),0,0,&uuid);
+            (void)xc_tmem_control(xch,i,TMEMC_SAVE_GET_POOL_UUID,dom,sizeof(uuid),0,&uuid);
             if ( write_exact(io_fd, &pool_id, sizeof(pool_id)) )
                 return -1;
             if ( write_exact(io_fd, &flags, sizeof(flags)) )
@@ -290,7 +288,7 @@ int xc_tmem_save(xc_interface *xch,
                 int ret;
                 if ( (ret = xc_tmem_control(xch, pool_id,
                                             TMEMC_SAVE_GET_NEXT_PAGE, dom,
-                                            bufsize, 0, 0, buf)) > 0 )
+                                            bufsize, 0, buf)) > 0 )
                 {
                     h = (struct tmem_handle *)buf;
                     if ( write_exact(io_fd, &h->oid, sizeof(h->oid)) )
@@ -335,7 +333,7 @@ int xc_tmem_save_extra(xc_interface *xch, int dom, int io_fd, int field_marker)
     if ( write_exact(io_fd, &marker, sizeof(marker)) )
         return -1;
     while ( xc_tmem_control(xch, 0, TMEMC_SAVE_GET_NEXT_INV, dom,
-                            sizeof(handle),0,0,&handle) > 0 ) {
+                            sizeof(handle),0,&handle) > 0 ) {
         if ( write_exact(io_fd, &handle.pool_id, sizeof(handle.pool_id)) )
             return -1;
         if ( write_exact(io_fd, &handle.oid, sizeof(handle.oid)) )
@@ -357,7 +355,7 @@ int xc_tmem_save_extra(xc_interface *xch, int dom, int io_fd, int field_marker)
 /* only called for live migration */
 void xc_tmem_save_done(xc_interface *xch, int dom)
 {
-    xc_tmem_control(xch,0,TMEMC_SAVE_END,dom,0,0,0,NULL);
+    xc_tmem_control(xch,0,TMEMC_SAVE_END,dom,0,0,NULL);
 }
 
 /* restore routines */
@@ -391,7 +389,7 @@ int xc_tmem_restore(xc_interface *xch, int dom, int io_fd)
     uint32_t weight, cap, flags;
     int checksum = 0;
 
-    save_version = xc_tmem_control(xch,0,TMEMC_SAVE_GET_VERSION,dom,0,0,0,NULL);
+    save_version = xc_tmem_control(xch,0,TMEMC_SAVE_GET_VERSION,dom,0,0,NULL);
     if ( save_version == -1 )
         return -1; /* domain doesn't exist */
     if ( read_exact(io_fd, &this_version, sizeof(this_version)) )
@@ -403,23 +401,23 @@ int xc_tmem_restore(xc_interface *xch, int dom, int io_fd)
         return -1;
     if ( minusone != -1 )
         return -1;
-    if ( xc_tmem_control(xch,0,TMEMC_RESTORE_BEGIN,dom,0,0,0,NULL) < 0 )
+    if ( xc_tmem_control(xch,0,TMEMC_RESTORE_BEGIN,dom,0,0,NULL) < 0 )
         return -1;
     if ( read_exact(io_fd, &flags, sizeof(flags)) )
         return -1;
     if ( flags & TMEM_CLIENT_COMPRESS )
-        if ( xc_tmem_control(xch,0,TMEMC_SET_COMPRESS,dom,1,0,0,NULL) < 0 )
+        if ( xc_tmem_control(xch,0,TMEMC_SET_COMPRESS,dom,1,0,NULL) < 0 )
             return -1;
     if ( flags & TMEM_CLIENT_FROZEN )
-        if ( xc_tmem_control(xch,0,TMEMC_FREEZE,dom,0,0,0,NULL) < 0 )
+        if ( xc_tmem_control(xch,0,TMEMC_FREEZE,dom,0,0,NULL) < 0 )
             return -1;
     if ( read_exact(io_fd, &weight, sizeof(weight)) )
         return -1;
-    if ( xc_tmem_control(xch,0,TMEMC_SET_WEIGHT,dom,0,0,0,NULL) < 0 )
+    if ( xc_tmem_control(xch,0,TMEMC_SET_WEIGHT,dom,0,0,NULL) < 0 )
         return -1;
     if ( read_exact(io_fd, &cap, sizeof(cap)) )
         return -1;
-    if ( xc_tmem_control(xch,0,TMEMC_SET_CAP,dom,0,0,0,NULL) < 0 )
+    if ( xc_tmem_control(xch,0,TMEMC_SET_CAP,dom,0,0,NULL) < 0 )
         return -1;
     if ( read_exact(io_fd, &minusone, sizeof(minusone)) )
         return -1;
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 4f2eb24..e564c22 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -6046,7 +6046,7 @@ char *libxl_tmem_list(libxl_ctx *ctx, uint32_t domid, int use_long)
     char _buf[32768];
 
     rc = xc_tmem_control(ctx->xch, -1, TMEMC_LIST, domid, 32768, use_long,
-                         0, _buf);
+                         _buf);
     if (rc < 0) {
         LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc,
             "Can not get tmem list");
@@ -6061,7 +6061,7 @@ int libxl_tmem_freeze(libxl_ctx *ctx, uint32_t domid)
     int rc;
 
     rc = xc_tmem_control(ctx->xch, -1, TMEMC_FREEZE, domid, 0, 0,
-                         0, NULL);
+                         NULL);
     if (rc < 0) {
         LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc,
             "Can not freeze tmem pools");
@@ -6076,7 +6076,7 @@ int libxl_tmem_thaw(libxl_ctx *ctx, uint32_t domid)
     int rc;
 
     rc = xc_tmem_control(ctx->xch, -1, TMEMC_THAW, domid, 0, 0,
-                         0, NULL);
+                         NULL);
     if (rc < 0) {
         LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc,
             "Can not thaw tmem pools");
@@ -6108,7 +6108,7 @@ int libxl_tmem_set(libxl_ctx *ctx, uint32_t domid, char* name, uint32_t set)
             "Invalid set, valid sets are <weight|cap|compress>");
         return ERROR_INVAL;
     }
-    rc = xc_tmem_control(ctx->xch, -1, subop, domid, set, 0, 0, NULL);
+    rc = xc_tmem_control(ctx->xch, -1, subop, domid, set, 0, NULL);
     if (rc < 0) {
         LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc,
             "Can not set tmem %s", name);
@@ -6137,7 +6137,7 @@ int libxl_tmem_freeable(libxl_ctx *ctx)
 {
     int rc;
 
-    rc = xc_tmem_control(ctx->xch, -1, TMEMC_QUERY_FREEABLE_MB, -1, 0, 0, 0, 0);
+    rc = xc_tmem_control(ctx->xch, -1, TMEMC_QUERY_FREEABLE_MB, -1, 0, 0, 0);
     if (rc < 0) {
         LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc,
             "Can not get tmem freeable memory");
diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
index 2c36eb2..a0395dc 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -1798,21 +1798,20 @@ static PyObject *pyxc_tmem_control(XcObject *self,
     uint32_t cli_id;
     uint32_t arg1;
     uint32_t arg2;
-    uint64_t arg3;
     char *buf;
     char _buffer[32768], *buffer = _buffer;
     int rc;
 
-    static char *kwd_list[] = { "pool_id", "subop", "cli_id", "arg1", "arg2", "arg3", "buf", NULL };
+    static char *kwd_list[] = { "pool_id", "subop", "cli_id", "arg1", "arg2", "buf", NULL };
 
     if ( !PyArg_ParseTupleAndKeywords(args, kwds, "iiiiiis", kwd_list,
-                        &pool_id, &subop, &cli_id, &arg1, &arg2, &arg3, &buf) )
+                        &pool_id, &subop, &cli_id, &arg1, &arg2, &buf) )
         return NULL;
 
     if ( (subop == TMEMC_LIST) && (arg1 > 32768) )
         arg1 = 32768;
 
-    if ( (rc = xc_tmem_control(self->xc_handle, pool_id, subop, cli_id, arg1, arg2, arg3, buffer)) < 0 )
+    if ( (rc = xc_tmem_control(self->xc_handle, pool_id, subop, cli_id, arg1, arg2, buffer)) < 0 )
         return Py_BuildValue("i", rc);
 
     switch (subop) {
diff --git a/tools/xenstat/libxenstat/src/xenstat.c b/tools/xenstat/libxenstat/src/xenstat.c
index dfffbbc..24b57d0 100644
--- a/tools/xenstat/libxenstat/src/xenstat.c
+++ b/tools/xenstat/libxenstat/src/xenstat.c
@@ -150,7 +150,7 @@ void domain_get_tmem_stats(xenstat_handle * handle, xenstat_domain * domain)
 	char buffer[4096];
 
 	if (xc_tmem_control(handle->xc_handle,-1,TMEMC_LIST,domain->id,
-                        sizeof(buffer),-1,-1,buffer) < 0)
+                        sizeof(buffer),-1,buffer) < 0)
 		return;
 	domain->tmem_stats.curr_eph_pages = parse(buffer,"Ec");
 	domain->tmem_stats.succ_eph_gets = parse(buffer,"Ge");
@@ -191,7 +191,7 @@ xenstat_node *xenstat_get_node(xenstat_handle * handle, unsigned int flags)
 	    * handle->page_size;
 
 	rc = xc_tmem_control(handle->xc_handle, -1,
-                         TMEMC_QUERY_FREEABLE_MB, -1, 0, 0, 0, NULL);
+                         TMEMC_QUERY_FREEABLE_MB, -1, 0, 0, NULL);
 	node->freeable_mb = (rc < 0) ? 0 : rc;
 	/* malloc(0) is not portable, so allocate a single domain.  This will
 	 * be resized below. */
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl.
  2015-08-27 11:01 [PATCH v2] Tmem bug-fixes and cleanups Konrad Rzeszutek Wilk
                   ` (3 preceding siblings ...)
  2015-08-27 11:01 ` [PATCH v2 4/8] tmem: Remove xc_tmem_control mystical arg3 Konrad Rzeszutek Wilk
@ 2015-08-27 11:02 ` Konrad Rzeszutek Wilk
  2015-08-27 11:22   ` Wei Liu
                     ` (2 more replies)
  2015-08-27 11:02 ` [PATCH v2 6/8] tmem: Remove the old tmem control XSM checks as it is part of sysctl hypercall Konrad Rzeszutek Wilk
                   ` (4 subsequent siblings)
  9 siblings, 3 replies; 32+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-08-27 11:02 UTC (permalink / raw)
  To: xen-devel, jbeulich, andrew.cooper3, wei.liu2; +Cc: Konrad Rzeszutek Wilk

The operations are to be used by an control domain to set parameters,
list pools, clients, and to be used during migration.

There is no need to have them in the tmem hypercall path.

This patch moves code without adding fixes - and in fact in
some cases makes the parameters soo long that they hurt eyes - but
that is for another patch.

Note that in regards to existing users:

 - Only the control domain could call it - which meant that if
   a guest called it would get -EPERM, so we are OK there.
   In practice no guests called this TMEM_CONTROL command.

 - The spec: https://oss.oracle.com/projects/tmem/dist/documentation/api/tmemspec-v001.pdf
   mentions: "TBD [Not sure if this is really needed.]"
   which is a carte blanche as any to do this!

Note: The XSM check is the same - we just move it from do_tmem_op
to do_sysctl.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 tools/libxc/xc_tmem.c                  | 104 ++++++++++++-------------
 tools/libxl/libxl.c                    |  14 ++--
 tools/python/xen/lowlevel/xc/xc.c      |  20 ++---
 tools/xenstat/libxenstat/src/xenstat.c |   4 +-
 xen/common/sysctl.c                    |   7 +-
 xen/common/tmem.c                      | 136 ++++++++++++++++-----------------
 xen/include/public/sysctl.h            |  44 +++++++++++
 xen/include/public/tmem.h              |  39 +---------
 xen/include/xen/tmem.h                 |   3 +
 xen/include/xen/tmem_xen.h             |   1 -
 xen/xsm/flask/hooks.c                  |   3 +
 11 files changed, 195 insertions(+), 180 deletions(-)

diff --git a/tools/libxc/xc_tmem.c b/tools/libxc/xc_tmem.c
index 467001b..2f6ea17 100644
--- a/tools/libxc/xc_tmem.c
+++ b/tools/libxc/xc_tmem.c
@@ -47,27 +47,27 @@ static int do_tmem_op(xc_interface *xch, tmem_op_t *op)
 
 int xc_tmem_control(xc_interface *xch,
                     int32_t pool_id,
-                    uint32_t subop,
+                    uint32_t cmd,
                     uint32_t cli_id,
                     uint32_t arg1,
                     uint32_t arg2,
                     void *buf)
 {
-    tmem_op_t op;
+    DECLARE_SYSCTL;
     DECLARE_HYPERCALL_BOUNCE(buf, arg1, XC_HYPERCALL_BUFFER_BOUNCE_OUT);
     int rc;
 
-    op.cmd = TMEM_CONTROL;
-    op.pool_id = pool_id;
-    op.u.ctrl.subop = subop;
-    op.u.ctrl.cli_id = cli_id;
-    op.u.ctrl.arg1 = arg1;
-    op.u.ctrl.arg2 = arg2;
-    op.u.ctrl.oid[0] = 0;
-    op.u.ctrl.oid[1] = 0;
-    op.u.ctrl.oid[2] = 0;
-
-    if ( subop == TMEMC_LIST && arg1 != 0 )
+    sysctl.cmd = XEN_SYSCTL_tmem_op;
+    sysctl.u.tmem_op.pool_id = pool_id;
+    sysctl.u.tmem_op.cmd = cmd;
+    sysctl.u.tmem_op.cli_id = cli_id;
+    sysctl.u.tmem_op.arg1 = arg1;
+    sysctl.u.tmem_op.arg2 = arg2;
+    sysctl.u.tmem_op.oid[0] = 0;
+    sysctl.u.tmem_op.oid[1] = 0;
+    sysctl.u.tmem_op.oid[2] = 0;
+
+    if ( cmd == XEN_SYSCTL_TMEM_OP_LIST && arg1 != 0 )
     {
         if ( buf == NULL )
         {
@@ -81,11 +81,11 @@ int xc_tmem_control(xc_interface *xch,
         }
     }
 
-    set_xen_guest_handle(op.u.ctrl.buf, buf);
+    set_xen_guest_handle(sysctl.u.tmem_op.buf, buf);
 
-    rc = do_tmem_op(xch, &op);
+    rc = do_sysctl(xch, &sysctl);
 
-    if (subop == TMEMC_LIST && arg1 != 0)
+    if ( cmd == XEN_SYSCTL_TMEM_OP_LIST && arg1 != 0 )
             xc_hypercall_bounce_post(xch, buf);
 
     return rc;
@@ -93,28 +93,28 @@ int xc_tmem_control(xc_interface *xch,
 
 int xc_tmem_control_oid(xc_interface *xch,
                         int32_t pool_id,
-                        uint32_t subop,
+                        uint32_t cmd,
                         uint32_t cli_id,
                         uint32_t arg1,
                         uint32_t arg2,
                         struct tmem_oid oid,
                         void *buf)
 {
-    tmem_op_t op;
+    DECLARE_SYSCTL;
     DECLARE_HYPERCALL_BOUNCE(buf, arg1, XC_HYPERCALL_BUFFER_BOUNCE_OUT);
     int rc;
 
-    op.cmd = TMEM_CONTROL;
-    op.pool_id = pool_id;
-    op.u.ctrl.subop = subop;
-    op.u.ctrl.cli_id = cli_id;
-    op.u.ctrl.arg1 = arg1;
-    op.u.ctrl.arg2 = arg2;
-    op.u.ctrl.oid[0] = oid.oid[0];
-    op.u.ctrl.oid[1] = oid.oid[1];
-    op.u.ctrl.oid[2] = oid.oid[2];
-
-    if ( subop == TMEMC_LIST && arg1 != 0 )
+    sysctl.cmd = XEN_SYSCTL_tmem_op;
+    sysctl.u.tmem_op.pool_id = pool_id;
+    sysctl.u.tmem_op.cmd = cmd;
+    sysctl.u.tmem_op.cli_id = cli_id;
+    sysctl.u.tmem_op.arg1 = arg1;
+    sysctl.u.tmem_op.arg2 = arg2;
+    sysctl.u.tmem_op.oid[0] = oid.oid[0];
+    sysctl.u.tmem_op.oid[1] = oid.oid[1];
+    sysctl.u.tmem_op.oid[2] = oid.oid[2];
+
+    if ( cmd == XEN_SYSCTL_TMEM_OP_LIST && arg1 != 0 )
     {
         if ( buf == NULL )
         {
@@ -128,11 +128,11 @@ int xc_tmem_control_oid(xc_interface *xch,
         }
     }
 
-    set_xen_guest_handle(op.u.ctrl.buf, buf);
+    set_xen_guest_handle(sysctl.u.tmem_op.buf, buf);
 
-    rc = do_tmem_op(xch, &op);
+    rc = do_sysctl(xch, &sysctl);
 
-    if (subop == TMEMC_LIST && arg1 != 0)
+    if ( cmd == XEN_SYSCTL_TMEM_OP_LIST && arg1 != 0 )
             xc_hypercall_bounce_post(xch, buf);
 
     return rc;
@@ -218,28 +218,28 @@ int xc_tmem_save(xc_interface *xch,
     uint32_t minusone = -1;
     struct tmem_handle *h;
 
-    if ( xc_tmem_control(xch,0,TMEMC_SAVE_BEGIN,dom,live,0,NULL) <= 0 )
+    if ( xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SAVE_BEGIN,dom,live,0,NULL) <= 0 )
         return 0;
 
     if ( write_exact(io_fd, &marker, sizeof(marker)) )
         return -1;
-    version = xc_tmem_control(xch,0,TMEMC_SAVE_GET_VERSION,0,0,0,NULL);
+    version = xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION,0,0,0,NULL);
     if ( write_exact(io_fd, &version, sizeof(version)) )
         return -1;
-    max_pools = xc_tmem_control(xch,0,TMEMC_SAVE_GET_MAXPOOLS,0,0,0,NULL);
+    max_pools = xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SAVE_GET_MAXPOOLS,0,0,0,NULL);
     if ( write_exact(io_fd, &max_pools, sizeof(max_pools)) )
         return -1;
     if ( version == -1 || max_pools == -1 )
         return -1;
     if ( write_exact(io_fd, &minusone, sizeof(minusone)) )
         return -1;
-    flags = xc_tmem_control(xch,0,TMEMC_SAVE_GET_CLIENT_FLAGS,dom,0,0,NULL);
+    flags = xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_FLAGS,dom,0,0,NULL);
     if ( write_exact(io_fd, &flags, sizeof(flags)) )
         return -1;
-    weight = xc_tmem_control(xch,0,TMEMC_SAVE_GET_CLIENT_WEIGHT,dom,0,0,NULL);
+    weight = xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_WEIGHT,dom,0,0,NULL);
     if ( write_exact(io_fd, &weight, sizeof(weight)) )
         return -1;
-    cap = xc_tmem_control(xch,0,TMEMC_SAVE_GET_CLIENT_CAP,dom,0,0,NULL);
+    cap = xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_CAP,dom,0,0,NULL);
     if ( write_exact(io_fd, &cap, sizeof(cap)) )
         return -1;
     if ( flags == -1 || weight == -1 || cap == -1 )
@@ -256,14 +256,14 @@ int xc_tmem_save(xc_interface *xch,
         int checksum = 0;
 
         /* get pool id, flags, pagesize, n_pages, uuid */
-        flags = xc_tmem_control(xch,i,TMEMC_SAVE_GET_POOL_FLAGS,dom,0,0,NULL);
+        flags = xc_tmem_control(xch,i,XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_FLAGS,dom,0,0,NULL);
         if ( flags != -1 )
         {
             pool_id = i;
-            n_pages = xc_tmem_control(xch,i,TMEMC_SAVE_GET_POOL_NPAGES,dom,0,0,NULL);
+            n_pages = xc_tmem_control(xch,i,XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_NPAGES,dom,0,0,NULL);
             if ( !(flags & TMEM_POOL_PERSIST) )
                 n_pages = 0;
-            (void)xc_tmem_control(xch,i,TMEMC_SAVE_GET_POOL_UUID,dom,sizeof(uuid),0,&uuid);
+            (void)xc_tmem_control(xch,i,XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_UUID,dom,sizeof(uuid),0,&uuid);
             if ( write_exact(io_fd, &pool_id, sizeof(pool_id)) )
                 return -1;
             if ( write_exact(io_fd, &flags, sizeof(flags)) )
@@ -287,7 +287,7 @@ int xc_tmem_save(xc_interface *xch,
             {
                 int ret;
                 if ( (ret = xc_tmem_control(xch, pool_id,
-                                            TMEMC_SAVE_GET_NEXT_PAGE, dom,
+                                            XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_PAGE, dom,
                                             bufsize, 0, buf)) > 0 )
                 {
                     h = (struct tmem_handle *)buf;
@@ -332,7 +332,7 @@ int xc_tmem_save_extra(xc_interface *xch, int dom, int io_fd, int field_marker)
 
     if ( write_exact(io_fd, &marker, sizeof(marker)) )
         return -1;
-    while ( xc_tmem_control(xch, 0, TMEMC_SAVE_GET_NEXT_INV, dom,
+    while ( xc_tmem_control(xch, 0, XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_INV, dom,
                             sizeof(handle),0,&handle) > 0 ) {
         if ( write_exact(io_fd, &handle.pool_id, sizeof(handle.pool_id)) )
             return -1;
@@ -355,7 +355,7 @@ int xc_tmem_save_extra(xc_interface *xch, int dom, int io_fd, int field_marker)
 /* only called for live migration */
 void xc_tmem_save_done(xc_interface *xch, int dom)
 {
-    xc_tmem_control(xch,0,TMEMC_SAVE_END,dom,0,0,NULL);
+    xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SAVE_END,dom,0,0,NULL);
 }
 
 /* restore routines */
@@ -389,7 +389,7 @@ int xc_tmem_restore(xc_interface *xch, int dom, int io_fd)
     uint32_t weight, cap, flags;
     int checksum = 0;
 
-    save_version = xc_tmem_control(xch,0,TMEMC_SAVE_GET_VERSION,dom,0,0,NULL);
+    save_version = xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION,dom,0,0,NULL);
     if ( save_version == -1 )
         return -1; /* domain doesn't exist */
     if ( read_exact(io_fd, &this_version, sizeof(this_version)) )
@@ -401,23 +401,23 @@ int xc_tmem_restore(xc_interface *xch, int dom, int io_fd)
         return -1;
     if ( minusone != -1 )
         return -1;
-    if ( xc_tmem_control(xch,0,TMEMC_RESTORE_BEGIN,dom,0,0,NULL) < 0 )
+    if ( xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_RESTORE_BEGIN,dom,0,0,NULL) < 0 )
         return -1;
     if ( read_exact(io_fd, &flags, sizeof(flags)) )
         return -1;
     if ( flags & TMEM_CLIENT_COMPRESS )
-        if ( xc_tmem_control(xch,0,TMEMC_SET_COMPRESS,dom,1,0,NULL) < 0 )
+        if ( xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SET_COMPRESS,dom,1,0,NULL) < 0 )
             return -1;
     if ( flags & TMEM_CLIENT_FROZEN )
-        if ( xc_tmem_control(xch,0,TMEMC_FREEZE,dom,0,0,NULL) < 0 )
+        if ( xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_FREEZE,dom,0,0,NULL) < 0 )
             return -1;
     if ( read_exact(io_fd, &weight, sizeof(weight)) )
         return -1;
-    if ( xc_tmem_control(xch,0,TMEMC_SET_WEIGHT,dom,0,0,NULL) < 0 )
+    if ( xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SET_WEIGHT,dom,0,0,NULL) < 0 )
         return -1;
     if ( read_exact(io_fd, &cap, sizeof(cap)) )
         return -1;
-    if ( xc_tmem_control(xch,0,TMEMC_SET_CAP,dom,0,0,NULL) < 0 )
+    if ( xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SET_CAP,dom,0,0,NULL) < 0 )
         return -1;
     if ( read_exact(io_fd, &minusone, sizeof(minusone)) )
         return -1;
@@ -464,7 +464,7 @@ int xc_tmem_restore(xc_interface *xch, int dom, int io_fd)
                 return -1;
             checksum += *buf;
             if ( (rc = xc_tmem_control_oid(xch, pool_id,
-                                           TMEMC_RESTORE_PUT_PAGE, dom,
+                                           XEN_SYSCTL_TMEM_OP_RESTORE_PUT_PAGE, dom,
                                            bufsize, index, oid, buf)) <= 0 )
             {
                 DPRINTF("xc_tmem_restore: putting page failed, rc=%d\n",rc);
@@ -496,7 +496,7 @@ int xc_tmem_restore_extra(xc_interface *xch, int dom, int io_fd)
             return -1;
         if ( read_exact(io_fd, &index, sizeof(index)) )
             return -1;
-        if ( xc_tmem_control_oid(xch, pool_id, TMEMC_RESTORE_FLUSH_PAGE, dom,
+        if ( xc_tmem_control_oid(xch, pool_id, XEN_SYSCTL_TMEM_OP_RESTORE_FLUSH_PAGE, dom,
                              0,index,oid,NULL) <= 0 )
             return -1;
         count++;
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index e564c22..10d1909 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -6045,7 +6045,7 @@ char *libxl_tmem_list(libxl_ctx *ctx, uint32_t domid, int use_long)
     int rc;
     char _buf[32768];
 
-    rc = xc_tmem_control(ctx->xch, -1, TMEMC_LIST, domid, 32768, use_long,
+    rc = xc_tmem_control(ctx->xch, -1, XEN_SYSCTL_TMEM_OP_LIST, domid, 32768, use_long,
                          _buf);
     if (rc < 0) {
         LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc,
@@ -6060,7 +6060,7 @@ int libxl_tmem_freeze(libxl_ctx *ctx, uint32_t domid)
 {
     int rc;
 
-    rc = xc_tmem_control(ctx->xch, -1, TMEMC_FREEZE, domid, 0, 0,
+    rc = xc_tmem_control(ctx->xch, -1, XEN_SYSCTL_TMEM_OP_FREEZE, domid, 0, 0,
                          NULL);
     if (rc < 0) {
         LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc,
@@ -6075,7 +6075,7 @@ int libxl_tmem_thaw(libxl_ctx *ctx, uint32_t domid)
 {
     int rc;
 
-    rc = xc_tmem_control(ctx->xch, -1, TMEMC_THAW, domid, 0, 0,
+    rc = xc_tmem_control(ctx->xch, -1, XEN_SYSCTL_TMEM_OP_THAW, domid, 0, 0,
                          NULL);
     if (rc < 0) {
         LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc,
@@ -6089,11 +6089,11 @@ int libxl_tmem_thaw(libxl_ctx *ctx, uint32_t domid)
 static int32_t tmem_setop_from_string(char *set_name)
 {
     if (!strcmp(set_name, "weight"))
-        return TMEMC_SET_WEIGHT;
+        return XEN_SYSCTL_TMEM_OP_SET_WEIGHT;
     else if (!strcmp(set_name, "cap"))
-        return TMEMC_SET_CAP;
+        return XEN_SYSCTL_TMEM_OP_SET_CAP;
     else if (!strcmp(set_name, "compress"))
-        return TMEMC_SET_COMPRESS;
+        return XEN_SYSCTL_TMEM_OP_SET_COMPRESS;
     else
         return -1;
 }
@@ -6137,7 +6137,7 @@ int libxl_tmem_freeable(libxl_ctx *ctx)
 {
     int rc;
 
-    rc = xc_tmem_control(ctx->xch, -1, TMEMC_QUERY_FREEABLE_MB, -1, 0, 0, 0);
+    rc = xc_tmem_control(ctx->xch, -1, XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB, -1, 0, 0, 0);
     if (rc < 0) {
         LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc,
             "Can not get tmem freeable memory");
diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
index a0395dc..5264854 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -1808,25 +1808,25 @@ static PyObject *pyxc_tmem_control(XcObject *self,
                         &pool_id, &subop, &cli_id, &arg1, &arg2, &buf) )
         return NULL;
 
-    if ( (subop == TMEMC_LIST) && (arg1 > 32768) )
+    if ( (subop == XEN_SYSCTL_TMEM_OP_LIST) && (arg1 > 32768) )
         arg1 = 32768;
 
     if ( (rc = xc_tmem_control(self->xc_handle, pool_id, subop, cli_id, arg1, arg2, buffer)) < 0 )
         return Py_BuildValue("i", rc);
 
     switch (subop) {
-        case TMEMC_LIST:
+        case XEN_SYSCTL_TMEM_OP_LIST:
             return Py_BuildValue("s", buffer);
-        case TMEMC_FLUSH:
+        case XEN_SYSCTL_TMEM_OP_FLUSH:
             return Py_BuildValue("i", rc);
-        case TMEMC_QUERY_FREEABLE_MB:
+        case XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB:
             return Py_BuildValue("i", rc);
-        case TMEMC_THAW:
-        case TMEMC_FREEZE:
-        case TMEMC_DESTROY:
-        case TMEMC_SET_WEIGHT:
-        case TMEMC_SET_CAP:
-        case TMEMC_SET_COMPRESS:
+        case XEN_SYSCTL_TMEM_OP_THAW:
+        case XEN_SYSCTL_TMEM_OP_FREEZE:
+        case XEN_SYSCTL_TMEM_OP_DESTROY:
+        case XEN_SYSCTL_TMEM_OP_SET_WEIGHT:
+        case XEN_SYSCTL_TMEM_OP_SET_CAP:
+        case XEN_SYSCTL_TMEM_OP_SET_COMPRESS:
         default:
             break;
     }
diff --git a/tools/xenstat/libxenstat/src/xenstat.c b/tools/xenstat/libxenstat/src/xenstat.c
index 24b57d0..3495f3f 100644
--- a/tools/xenstat/libxenstat/src/xenstat.c
+++ b/tools/xenstat/libxenstat/src/xenstat.c
@@ -149,7 +149,7 @@ void domain_get_tmem_stats(xenstat_handle * handle, xenstat_domain * domain)
 {
 	char buffer[4096];
 
-	if (xc_tmem_control(handle->xc_handle,-1,TMEMC_LIST,domain->id,
+	if (xc_tmem_control(handle->xc_handle,-1,XEN_SYSCTL_TMEM_OP_LIST,domain->id,
                         sizeof(buffer),-1,buffer) < 0)
 		return;
 	domain->tmem_stats.curr_eph_pages = parse(buffer,"Ec");
@@ -191,7 +191,7 @@ xenstat_node *xenstat_get_node(xenstat_handle * handle, unsigned int flags)
 	    * handle->page_size;
 
 	rc = xc_tmem_control(handle->xc_handle, -1,
-                         TMEMC_QUERY_FREEABLE_MB, -1, 0, 0, NULL);
+                         XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB, -1, 0, 0, NULL);
 	node->freeable_mb = (rc < 0) ? 0 : rc;
 	/* malloc(0) is not portable, so allocate a single domain.  This will
 	 * be resized below. */
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index f1c0c76..a4287c3 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -14,6 +14,7 @@
 #include <xen/domain.h>
 #include <xen/event.h>
 #include <xen/domain_page.h>
+#include <xen/tmem.h>
 #include <xen/trace.h>
 #include <xen/console.h>
 #include <xen/iocap.h>
@@ -68,7 +69,11 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
     case XEN_SYSCTL_tbuf_op:
         ret = tb_control(&op->u.tbuf_op);
         break;
-    
+
+    case XEN_SYSCTL_tmem_op:
+        ret = tmem_control(&op->u.tmem_op);
+        break;
+
     case XEN_SYSCTL_sched_id:
         op->u.sched_id.sched_id = sched_id();
         break;
diff --git a/xen/common/tmem.c b/xen/common/tmem.c
index 567ccc5..1c331ac 100644
--- a/xen/common/tmem.c
+++ b/xen/common/tmem.c
@@ -18,6 +18,7 @@
 #include <xen/tmem_xen.h> /* host-specific (eg Xen) code goes here */
 #endif
 
+#include <public/sysctl.h>
 #include <xen/tmem.h>
 #include <xen/rbtree.h>
 #include <xen/radix-tree.h>
@@ -2014,8 +2015,8 @@ fail:
 static int tmemc_freeze_pools(domid_t cli_id, int arg)
 {
     struct client *client;
-    bool_t freeze = (arg == TMEMC_FREEZE) ? 1 : 0;
-    bool_t destroy = (arg == TMEMC_DESTROY) ? 1 : 0;
+    bool_t freeze = (arg == XEN_SYSCTL_TMEM_OP_FREEZE) ? 1 : 0;
+    bool_t destroy = (arg == XEN_SYSCTL_TMEM_OP_DESTROY) ? 1 : 0;
     char *s;
 
     s = destroy ? "destroyed" : ( freeze ? "frozen" : "thawed" );
@@ -2232,7 +2233,7 @@ static int __tmemc_set_var(struct client *client, uint32_t subop, uint32_t arg1)
 
     switch (subop)
     {
-    case TMEMC_SET_WEIGHT:
+    case XEN_SYSCTL_TMEM_OP_SET_WEIGHT:
         old_weight = client->weight;
         client->weight = arg1;
         tmem_client_info("tmem: weight set to %d for %s=%d\n",
@@ -2240,12 +2241,12 @@ static int __tmemc_set_var(struct client *client, uint32_t subop, uint32_t arg1)
         atomic_sub(old_weight,&client_weight_total);
         atomic_add(client->weight,&client_weight_total);
         break;
-    case TMEMC_SET_CAP:
+    case XEN_SYSCTL_TMEM_OP_SET_CAP:
         client->cap = arg1;
         tmem_client_info("tmem: cap set to %d for %s=%d\n",
                         arg1, tmem_cli_id_str, cli_id);
         break;
-    case TMEMC_SET_COMPRESS:
+    case XEN_SYSCTL_TMEM_OP_SET_COMPRESS:
         if ( tmem_dedup_enabled() )
         {
             tmem_client_warn("tmem: compression %s for all %ss, cannot be changed when tmem_dedup is enabled\n",
@@ -2348,7 +2349,7 @@ static int tmemc_save_subop(int cli_id, uint32_t pool_id,
 
     switch(subop)
     {
-    case TMEMC_SAVE_BEGIN:
+    case XEN_SYSCTL_TMEM_OP_SAVE_BEGIN:
         if ( client == NULL )
             return 0;
         for (p = 0; p < MAX_POOLS_PER_DOMAIN; p++)
@@ -2365,33 +2366,33 @@ static int tmemc_save_subop(int cli_id, uint32_t pool_id,
             client->live_migrating = 1;
         rc = 1;
         break;
-    case TMEMC_RESTORE_BEGIN:
+    case XEN_SYSCTL_TMEM_OP_RESTORE_BEGIN:
         if ( client == NULL && (client = client_create(cli_id)) != NULL )
             return 1;
         break;
-    case TMEMC_SAVE_GET_VERSION:
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION:
         rc = TMEM_SPEC_VERSION;
         break;
-    case TMEMC_SAVE_GET_MAXPOOLS:
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_MAXPOOLS:
         rc = MAX_POOLS_PER_DOMAIN;
         break;
-    case TMEMC_SAVE_GET_CLIENT_WEIGHT:
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_WEIGHT:
         if ( client == NULL )
             break;
         rc = client->weight == -1 ? -2 : client->weight;
         break;
-    case TMEMC_SAVE_GET_CLIENT_CAP:
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_CAP:
         if ( client == NULL )
             break;
         rc = client->cap == -1 ? -2 : client->cap;
         break;
-    case TMEMC_SAVE_GET_CLIENT_FLAGS:
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_FLAGS:
         if ( client == NULL )
             break;
         rc = (client->compress ? TMEM_CLIENT_COMPRESS : 0 ) |
              (client->was_frozen ? TMEM_CLIENT_FROZEN : 0 );
         break;
-    case TMEMC_SAVE_GET_POOL_FLAGS:
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_FLAGS:
          if ( pool == NULL )
              break;
          rc = (pool->persistent ? TMEM_POOL_PERSIST : 0) |
@@ -2399,19 +2400,19 @@ static int tmemc_save_subop(int cli_id, uint32_t pool_id,
               (POOL_PAGESHIFT << TMEM_POOL_PAGESIZE_SHIFT) |
               (TMEM_SPEC_VERSION << TMEM_POOL_VERSION_SHIFT);
         break;
-    case TMEMC_SAVE_GET_POOL_NPAGES:
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_NPAGES:
          if ( pool == NULL )
              break;
         rc = _atomic_read(pool->pgp_count);
         break;
-    case TMEMC_SAVE_GET_POOL_UUID:
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_UUID:
          if ( pool == NULL )
              break;
         rc = 0;
         if ( copy_to_guest(guest_handle_cast(buf, void), pool->uuid, 2) )
             rc = -EFAULT;
         break;
-    case TMEMC_SAVE_END:
+    case XEN_SYSCTL_TMEM_OP_SAVE_END:
         if ( client == NULL )
             break;
         client->live_migrating = 0;
@@ -2555,77 +2556,72 @@ static int tmemc_restore_flush_page(int cli_id, uint32_t pool_id, struct oid *oi
     return do_tmem_flush_page(pool,oidp,index);
 }
 
-static int do_tmem_control(struct tmem_op *op)
+int tmem_control(struct xen_sysctl_tmem_op *op)
 {
     int ret;
     uint32_t pool_id = op->pool_id;
-    uint32_t subop = op->u.ctrl.subop;
-    struct oid *oidp = (struct oid *)(&op->u.ctrl.oid[0]);
+    uint32_t cmd = op->cmd;
+    struct oid *oidp = (struct oid *)(&op->oid[0]);
 
-    if ( xsm_tmem_control(XSM_PRIV) )
-        return -EPERM;
+    write_lock(&tmem_rwlock);
 
-    switch(subop)
+    switch (cmd)
     {
-    case TMEMC_THAW:
-    case TMEMC_FREEZE:
-    case TMEMC_DESTROY:
-        ret = tmemc_freeze_pools(op->u.ctrl.cli_id,subop);
+    case XEN_SYSCTL_TMEM_OP_THAW:
+    case XEN_SYSCTL_TMEM_OP_FREEZE:
+    case XEN_SYSCTL_TMEM_OP_DESTROY:
+        ret = tmemc_freeze_pools(op->cli_id, cmd);
         break;
-    case TMEMC_FLUSH:
-        ret = tmemc_flush_mem(op->u.ctrl.cli_id,op->u.ctrl.arg1);
+    case XEN_SYSCTL_TMEM_OP_FLUSH:
+        ret = tmemc_flush_mem(op->cli_id,op->arg1);
         break;
-    case TMEMC_LIST:
-        ret = tmemc_list(op->u.ctrl.cli_id,
-                         guest_handle_cast(op->u.ctrl.buf, char),
-                         op->u.ctrl.arg1,op->u.ctrl.arg2);
+    case XEN_SYSCTL_TMEM_OP_LIST:
+        ret = tmemc_list(op->cli_id,
+                         guest_handle_cast(op->buf, char), op->arg1, op->arg2);
         break;
-    case TMEMC_SET_WEIGHT:
-    case TMEMC_SET_CAP:
-    case TMEMC_SET_COMPRESS:
-        ret = tmemc_set_var(op->u.ctrl.cli_id,subop,op->u.ctrl.arg1);
+    case XEN_SYSCTL_TMEM_OP_SET_WEIGHT:
+    case XEN_SYSCTL_TMEM_OP_SET_CAP:
+    case XEN_SYSCTL_TMEM_OP_SET_COMPRESS:
+        ret = tmemc_set_var(op->cli_id, cmd, op->arg1);
         break;
-    case TMEMC_QUERY_FREEABLE_MB:
+    case XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB:
         ret = tmem_freeable_pages() >> (20 - PAGE_SHIFT);
         break;
-    case TMEMC_SAVE_BEGIN:
-    case TMEMC_RESTORE_BEGIN:
-    case TMEMC_SAVE_GET_VERSION:
-    case TMEMC_SAVE_GET_MAXPOOLS:
-    case TMEMC_SAVE_GET_CLIENT_WEIGHT:
-    case TMEMC_SAVE_GET_CLIENT_CAP:
-    case TMEMC_SAVE_GET_CLIENT_FLAGS:
-    case TMEMC_SAVE_GET_POOL_FLAGS:
-    case TMEMC_SAVE_GET_POOL_NPAGES:
-    case TMEMC_SAVE_GET_POOL_UUID:
-    case TMEMC_SAVE_END:
-        ret = tmemc_save_subop(op->u.ctrl.cli_id,pool_id,subop,
-                               guest_handle_cast(op->u.ctrl.buf, char),
-                               op->u.ctrl.arg1);
+    case XEN_SYSCTL_TMEM_OP_SAVE_BEGIN:
+    case XEN_SYSCTL_TMEM_OP_RESTORE_BEGIN:
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION:
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_MAXPOOLS:
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_WEIGHT:
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_CAP:
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_FLAGS:
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_FLAGS:
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_NPAGES:
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_UUID:
+    case XEN_SYSCTL_TMEM_OP_SAVE_END:
+        ret = tmemc_save_subop(op->cli_id, pool_id, cmd,
+                               guest_handle_cast(op->buf, char), op->arg1);
         break;
-    case TMEMC_SAVE_GET_NEXT_PAGE:
-        ret = tmemc_save_get_next_page(op->u.ctrl.cli_id, pool_id,
-                                       guest_handle_cast(op->u.ctrl.buf, char),
-                                       op->u.ctrl.arg1);
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_PAGE:
+        ret = tmemc_save_get_next_page(op->cli_id, pool_id,
+                                       guest_handle_cast(op->buf, char), op->arg1);
         break;
-    case TMEMC_SAVE_GET_NEXT_INV:
-        ret = tmemc_save_get_next_inv(op->u.ctrl.cli_id,
-                                      guest_handle_cast(op->u.ctrl.buf, char),
-                                      op->u.ctrl.arg1);
+    case XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_INV:
+        ret = tmemc_save_get_next_inv(op->cli_id,
+                                      guest_handle_cast(op->buf, char), op->arg1);
         break;
-    case TMEMC_RESTORE_PUT_PAGE:
-        ret = tmemc_restore_put_page(op->u.ctrl.cli_id,pool_id,
-                                     oidp, op->u.ctrl.arg2,
-                                     guest_handle_cast(op->u.ctrl.buf, char),
-                                     op->u.ctrl.arg1);
+    case XEN_SYSCTL_TMEM_OP_RESTORE_PUT_PAGE:
+        ret = tmemc_restore_put_page(op->cli_id, pool_id, oidp, op->arg2,
+                                     guest_handle_cast(op->buf, char), op->arg1);
         break;
-    case TMEMC_RESTORE_FLUSH_PAGE:
-        ret = tmemc_restore_flush_page(op->u.ctrl.cli_id,pool_id,
-                                       oidp, op->u.ctrl.arg2);
+    case XEN_SYSCTL_TMEM_OP_RESTORE_FLUSH_PAGE:
+        ret = tmemc_restore_flush_page(op->cli_id, pool_id, oidp, op->arg2);
         break;
     default:
         ret = -1;
     }
+
+    write_unlock(&tmem_rwlock);
+
     return ret;
 }
 
@@ -2666,9 +2662,9 @@ long do_tmem_op(tmem_cli_op_t uops)
     /* Acquire wirte lock for all command at first */
     write_lock(&tmem_rwlock);
 
-    if ( op.cmd == TMEM_CONTROL )
+    if ( op.cmd == TMEM_CONTROL_MOVED )
     {
-        rc = do_tmem_control(&op);
+        rc = -ENOSYS;
     }
     else if ( op.cmd == TMEM_AUTH )
     {
@@ -2788,7 +2784,7 @@ void tmem_destroy(void *v)
     write_unlock(&tmem_rwlock);
 }
 
-#define MAX_EVICTS 10  /* should be variable or set via TMEMC_ ?? */
+#define MAX_EVICTS 10  /* should be variable or set via XEN_SYSCTL_TMEM_OP_ ?? */
 void *tmem_relinquish_pages(unsigned int order, unsigned int memflags)
 {
     struct page_info *pfp;
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 58c9be2..21a6448 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -710,6 +710,48 @@ struct xen_sysctl_psr_cat_op {
 typedef struct xen_sysctl_psr_cat_op xen_sysctl_psr_cat_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cat_op_t);
 
+#define XEN_SYSCTL_TMEM_OP_ALL_CLIENTS 0xFFFFU
+
+#define XEN_SYSCTL_TMEM_OP_THAW                   0
+#define XEN_SYSCTL_TMEM_OP_FREEZE                 1
+#define XEN_SYSCTL_TMEM_OP_FLUSH                  2
+#define XEN_SYSCTL_TMEM_OP_DESTROY                3
+#define XEN_SYSCTL_TMEM_OP_LIST                   4
+#define XEN_SYSCTL_TMEM_OP_SET_WEIGHT             5
+#define XEN_SYSCTL_TMEM_OP_SET_CAP                6
+#define XEN_SYSCTL_TMEM_OP_SET_COMPRESS           7
+#define XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB      8
+#define XEN_SYSCTL_TMEM_OP_SAVE_BEGIN             10
+#define XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION       11
+#define XEN_SYSCTL_TMEM_OP_SAVE_GET_MAXPOOLS      12
+#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_WEIGHT 13
+#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_CAP    14
+#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_FLAGS  15
+#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_FLAGS    16
+#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_NPAGES   17
+#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_UUID     18
+#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_PAGE     19
+#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_INV      20
+#define XEN_SYSCTL_TMEM_OP_SAVE_END               21
+#define XEN_SYSCTL_TMEM_OP_RESTORE_BEGIN          30
+#define XEN_SYSCTL_TMEM_OP_RESTORE_PUT_PAGE       32
+#define XEN_SYSCTL_TMEM_OP_RESTORE_FLUSH_PAGE     33
+
+struct xen_sysctl_tmem_op {
+    uint32_t cmd;       /* IN: XEN_SYSCTL_TMEM_OP_* . */
+    int32_t pool_id;    /* IN: 0 by default unless _SAVE_*, RESTORE_* .*/
+    uint32_t cli_id;    /* IN: client id, 0 for XEN_SYSCTL_TMEM_QUERY_FREEABLE_MB
+                           for all others can be the domain id or
+                           XEN_SYSCTL_TMEM_OP_ALL_CLIENTS for all. */
+    uint32_t arg1;      /* IN: If not applicable to command use 0. */
+    uint32_t arg2;      /* IN: If not applicable to command use 0. */
+    uint8_t  pad[4];    /* Padding so structure is the same under 32 and 64. */
+    uint64_t oid[3];    /* IN: If not applicable to command use 0. */
+    XEN_GUEST_HANDLE_64(char) buf; /* IN/OUT: Buffer to save and restore ops. */
+};
+typedef struct xen_sysctl_tmem_op xen_sysctl_tmem_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_tmem_op_t);
+
 struct xen_sysctl {
     uint32_t cmd;
 #define XEN_SYSCTL_readconsole                    1
@@ -734,6 +776,7 @@ struct xen_sysctl {
 #define XEN_SYSCTL_psr_cmt_op                    21
 #define XEN_SYSCTL_pcitopoinfo                   22
 #define XEN_SYSCTL_psr_cat_op                    23
+#define XEN_SYSCTL_tmem_op                       24
     uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
     union {
         struct xen_sysctl_readconsole       readconsole;
@@ -758,6 +801,7 @@ struct xen_sysctl {
         struct xen_sysctl_coverage_op       coverage_op;
         struct xen_sysctl_psr_cmt_op        psr_cmt_op;
         struct xen_sysctl_psr_cat_op        psr_cat_op;
+        struct xen_sysctl_tmem_op           tmem_op;
         uint8_t                             pad[128];
     } u;
 };
diff --git a/xen/include/public/tmem.h b/xen/include/public/tmem.h
index 4fd2fc6..208f5a6 100644
--- a/xen/include/public/tmem.h
+++ b/xen/include/public/tmem.h
@@ -33,7 +33,7 @@
 #define TMEM_SPEC_VERSION          1
 
 /* Commands to HYPERVISOR_tmem_op() */
-#define TMEM_CONTROL               0
+#define TMEM_CONTROL_MOVED         0
 #define TMEM_NEW_POOL              1
 #define TMEM_DESTROY_POOL          2
 #define TMEM_PUT_PAGE              4
@@ -48,35 +48,9 @@
 #endif
 
 /* Privileged commands to HYPERVISOR_tmem_op() */
-#define TMEM_AUTH                 101 
+#define TMEM_AUTH                 101
 #define TMEM_RESTORE_NEW          102
 
-/* Subops for HYPERVISOR_tmem_op(TMEM_CONTROL) */
-#define TMEMC_THAW                   0
-#define TMEMC_FREEZE                 1
-#define TMEMC_FLUSH                  2
-#define TMEMC_DESTROY                3
-#define TMEMC_LIST                   4
-#define TMEMC_SET_WEIGHT             5
-#define TMEMC_SET_CAP                6
-#define TMEMC_SET_COMPRESS           7
-#define TMEMC_QUERY_FREEABLE_MB      8
-#define TMEMC_SAVE_BEGIN             10
-#define TMEMC_SAVE_GET_VERSION       11
-#define TMEMC_SAVE_GET_MAXPOOLS      12
-#define TMEMC_SAVE_GET_CLIENT_WEIGHT 13
-#define TMEMC_SAVE_GET_CLIENT_CAP    14
-#define TMEMC_SAVE_GET_CLIENT_FLAGS  15
-#define TMEMC_SAVE_GET_POOL_FLAGS    16
-#define TMEMC_SAVE_GET_POOL_NPAGES   17
-#define TMEMC_SAVE_GET_POOL_UUID     18
-#define TMEMC_SAVE_GET_NEXT_PAGE     19
-#define TMEMC_SAVE_GET_NEXT_INV      20
-#define TMEMC_SAVE_END               21
-#define TMEMC_RESTORE_BEGIN          30
-#define TMEMC_RESTORE_PUT_PAGE       32
-#define TMEMC_RESTORE_FLUSH_PAGE     33
-
 /* Bits for HYPERVISOR_tmem_op(TMEM_NEW_POOL) */
 #define TMEM_POOL_PERSIST          1
 #define TMEM_POOL_SHARED           2
@@ -110,16 +84,7 @@ struct tmem_op {
             uint32_t flags;
             uint32_t arg1;
         } creat; /* for cmd == TMEM_NEW_POOL, TMEM_AUTH, TMEM_RESTORE_NEW */
-        struct { 
-            uint32_t subop;
-            uint32_t cli_id;
-            uint32_t arg1;
-            uint32_t arg2;
-            uint64_t oid[3];
-            tmem_cli_va_t buf;
-        } ctrl; /* for cmd == TMEM_CONTROL */
         struct {
-            
             uint64_t oid[3];
             uint32_t index;
             uint32_t tmem_offset;
diff --git a/xen/include/xen/tmem.h b/xen/include/xen/tmem.h
index 5dbf9d5..32a542a 100644
--- a/xen/include/xen/tmem.h
+++ b/xen/include/xen/tmem.h
@@ -9,6 +9,9 @@
 #ifndef __XEN_TMEM_H__
 #define __XEN_TMEM_H__
 
+struct xen_sysctl_tmem_op;
+
+extern int tmem_control(struct xen_sysctl_tmem_op *op);
 extern void tmem_destroy(void *);
 extern void *tmem_relinquish_pages(unsigned int, unsigned int);
 extern unsigned long tmem_freeable_pages(void);
diff --git a/xen/include/xen/tmem_xen.h b/xen/include/xen/tmem_xen.h
index c0d6831..ca88461 100644
--- a/xen/include/xen/tmem_xen.h
+++ b/xen/include/xen/tmem_xen.h
@@ -297,7 +297,6 @@ static inline int tmem_get_tmemop_from_client(tmem_op_t *op, tmem_cli_op_t uops)
         switch ( cop.cmd )
         {
         case TMEM_NEW_POOL:   u = XLAT_tmem_op_u_creat; break;
-        case TMEM_CONTROL:    u = XLAT_tmem_op_u_ctrl;  break;
         case TMEM_AUTH:       u = XLAT_tmem_op_u_creat; break;
         case TMEM_RESTORE_NEW:u = XLAT_tmem_op_u_creat; break;
         default:              u = XLAT_tmem_op_u_gen ;  break;
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 7a4522e..cfad13c 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -761,6 +761,9 @@ static int flask_sysctl(int cmd)
     case XEN_SYSCTL_tbuf_op:
         return domain_has_xen(current->domain, XEN__TBUFCONTROL);
 
+    case XEN_SYSCTL_tmem_op:
+        return domain_has_xen(current->domain, XEN__TMEM_CONTROL);
+
     case XEN_SYSCTL_sched_id:
         return domain_has_xen(current->domain, XEN__GETSCHEDULER);
 
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v2 6/8] tmem: Remove the old tmem control XSM checks as it is part of sysctl hypercall.
  2015-08-27 11:01 [PATCH v2] Tmem bug-fixes and cleanups Konrad Rzeszutek Wilk
                   ` (4 preceding siblings ...)
  2015-08-27 11:02 ` [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl Konrad Rzeszutek Wilk
@ 2015-08-27 11:02 ` Konrad Rzeszutek Wilk
  2015-08-27 20:59   ` Daniel De Graaf
  2015-08-27 11:02 ` [PATCH v2 7/8] tmem: Remove extra spaces at end and some hard tabbing Konrad Rzeszutek Wilk
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 32+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-08-27 11:02 UTC (permalink / raw)
  To: xen-devel, jbeulich, andrew.cooper3, wei.liu2
  Cc: Daniel De Graaf, Konrad Rzeszutek Wilk

The sysctl is where the tmem control operations are done and the
XSM checks are done via there. The old mechanism (to check
for control tmem op XSM from do_tmem_op) is not needed anymore.

CC:  Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 xen/include/xsm/dummy.h             | 6 ------
 xen/include/xsm/xsm.h               | 6 ------
 xen/xsm/dummy.c                     | 1 -
 xen/xsm/flask/hooks.c               | 6 ------
 xen/xsm/flask/policy/access_vectors | 2 +-
 5 files changed, 1 insertion(+), 20 deletions(-)

diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index bbbfce7..9fe372c 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -427,12 +427,6 @@ static XSM_INLINE int xsm_tmem_op(XSM_DEFAULT_VOID)
     return xsm_default_action(action, current->domain, NULL);
 }
 
-static XSM_INLINE int xsm_tmem_control(XSM_DEFAULT_VOID)
-{
-    XSM_ASSERT_ACTION(XSM_PRIV);
-    return xsm_default_action(action, current->domain, NULL);
-}
-
 static XSM_INLINE long xsm_do_xsm_op(XEN_GUEST_HANDLE_PARAM(xsm_op_t) op)
 {
     return -ENOSYS;
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 3678a93..ba3caed 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -137,7 +137,6 @@ struct xsm_operations {
 
     int (*page_offline)(uint32_t cmd);
     int (*tmem_op)(void);
-    int (*tmem_control)(void);
 
     long (*do_xsm_op) (XEN_GUEST_HANDLE_PARAM(xsm_op_t) op);
 #ifdef CONFIG_COMPAT
@@ -557,11 +556,6 @@ static inline int xsm_tmem_op(xsm_default_t def)
     return xsm_ops->tmem_op();
 }
 
-static inline int xsm_tmem_control(xsm_default_t def)
-{
-    return xsm_ops->tmem_control();
-}
-
 static inline long xsm_do_xsm_op (XEN_GUEST_HANDLE_PARAM(xsm_op_t) op)
 {
     return xsm_ops->do_xsm_op(op);
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 21b1bf8..72eba40 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -112,7 +112,6 @@ void xsm_fixup_ops (struct xsm_operations *ops)
 
     set_to_dummy_if_null(ops, page_offline);
     set_to_dummy_if_null(ops, tmem_op);
-    set_to_dummy_if_null(ops, tmem_control);
     set_to_dummy_if_null(ops, hvm_param);
     set_to_dummy_if_null(ops, hvm_control);
     set_to_dummy_if_null(ops, hvm_param_nested);
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index cfad13c..5f5f181 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1132,11 +1132,6 @@ static inline int flask_tmem_op(void)
     return domain_has_xen(current->domain, XEN__TMEM_OP);
 }
 
-static inline int flask_tmem_control(void)
-{
-    return domain_has_xen(current->domain, XEN__TMEM_CONTROL);
-}
-
 static int flask_add_to_physmap(struct domain *d1, struct domain *d2)
 {
     return domain_has_perm(d1, d2, SECCLASS_MMU, MMU__PHYSMAP);
@@ -1696,7 +1691,6 @@ static struct xsm_operations flask_ops = {
 
     .page_offline = flask_page_offline,
     .tmem_op = flask_tmem_op,
-    .tmem_control = flask_tmem_control,
     .hvm_param = flask_hvm_param,
     .hvm_control = flask_hvm_param,
     .hvm_param_nested = flask_hvm_param_nested,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 71495fd..0aa68f8 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -69,7 +69,7 @@ class xen
     cpupool_op
 # tmem hypercall (any access)
     tmem_op
-# TMEM_CONTROL command of tmem hypercall
+# XEN_SYSCTL_tmem_op command of tmem (part of sysctl)
     tmem_control
 # XEN_SYSCTL_scheduler_op with XEN_DOMCTL_SCHEDOP_getinfo, XEN_SYSCTL_sched_id
     getscheduler
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v2 7/8] tmem: Remove extra spaces at end and some hard tabbing.
  2015-08-27 11:01 [PATCH v2] Tmem bug-fixes and cleanups Konrad Rzeszutek Wilk
                   ` (5 preceding siblings ...)
  2015-08-27 11:02 ` [PATCH v2 6/8] tmem: Remove the old tmem control XSM checks as it is part of sysctl hypercall Konrad Rzeszutek Wilk
@ 2015-08-27 11:02 ` Konrad Rzeszutek Wilk
  2015-08-27 13:19   ` Andrew Cooper
  2015-08-27 11:02 ` [PATCH v2 8/8] tmem: Spelling mistakes Konrad Rzeszutek Wilk
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 32+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-08-27 11:02 UTC (permalink / raw)
  To: xen-devel, jbeulich, andrew.cooper3, wei.liu2; +Cc: Konrad Rzeszutek Wilk

My editor marks these in red glowing red so removing them to
make it easier to focus on code.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 xen/common/tmem.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/xen/common/tmem.c b/xen/common/tmem.c
index 1c331ac..66d2852 100644
--- a/xen/common/tmem.c
+++ b/xen/common/tmem.c
@@ -111,7 +111,7 @@ struct tmem_pool {
     atomic_t pgp_count;
     int pgp_count_max;
     long obj_count;  /* atomicity depends on pool_rwlock held for write */
-    long obj_count_max;  
+    long obj_count_max;
     unsigned long objnode_count, objnode_count_max;
     uint64_t sum_life_cycles;
     uint64_t sum_evicted_cycles;
@@ -1092,7 +1092,7 @@ static int shared_pool_quit(struct tmem_pool *pool, domid_t cli_id)
 
     ASSERT(is_shared(pool));
     ASSERT(pool->client != NULL);
-    
+
     ASSERT_WRITELOCK(&tmem_rwlock);
     pool_destroy_objs(pool, cli_id);
     list_for_each_entry(sl,&pool->share_list, share_list)
@@ -1180,7 +1180,7 @@ static struct client *client_create(domid_t cli_id)
     }
     if ( !d->is_dying ) {
         d->tmem_client = client;
-	client->domain = d;
+        client->domain = d;
     }
     rcu_unlock_domain(d);
 
@@ -1229,7 +1229,7 @@ static bool_t client_over_quota(struct client *client)
     int total = _atomic_read(client_weight_total);
 
     ASSERT(client != NULL);
-    if ( (total == 0) || (client->weight == 0) || 
+    if ( (total == 0) || (client->weight == 0) ||
           (client->eph_count == 0) )
         return 0;
     return ( ((global_eph_count*100L) / client->eph_count ) >
@@ -1414,7 +1414,7 @@ static int do_tmem_put_compress(struct tmem_page_descriptor *pgp, xen_pfn_t cmfn
     void *dst, *p;
     size_t size;
     int ret = 0;
-    
+
     ASSERT(pgp != NULL);
     ASSERT(pgp->us.obj != NULL);
     ASSERT_SPINLOCK(&pgp->us.obj->obj_spinlock);
@@ -1559,7 +1559,7 @@ refind:
         {
             /* no puts allowed into a frozen pool (except dup puts) */
             if ( client->frozen )
-	        goto unlock_obj;
+	            goto unlock_obj;
         }
     }
     else
@@ -1572,10 +1572,10 @@ refind:
 
         write_lock(&pool->pool_rwlock);
         /*
-	 * Parallel callers may already allocated obj and inserted to obj_rb_root
-	 * before us.
-	 */
-        if (!obj_rb_insert(&pool->obj_rb_root[oid_hash(oidp)], obj))
+         * Parallel callers may already allocated obj and inserted to obj_rb_root
+         * before us.
+         */
+        if ( !obj_rb_insert(&pool->obj_rb_root[oid_hash(oidp)], obj) )
         {
             tmem_free(obj, pool);
             write_unlock(&pool->pool_rwlock);
@@ -1945,7 +1945,7 @@ static int do_tmem_new_pool(domid_t this_cli_id,
                      (client->shared_auth_uuid[i][1] == uuid_hi) )
                     break;
             if ( i == MAX_GLOBAL_SHARED_POOLS )
-	    {
+	        {
                 tmem_client_info("Shared auth failed, create non shared pool instead!\n");
                 pool->shared = 0;
                 goto out;
@@ -2107,7 +2107,7 @@ static int tmemc_list_client(struct client *c, tmem_cli_va_param_t buf,
              p->obj_count, p->obj_count_max,
              p->objnode_count, p->objnode_count_max,
              p->good_puts, p->puts,p->dup_puts_flushed, p->dup_puts_replaced,
-             p->no_mem_puts, 
+             p->no_mem_puts,
              p->found_gets, p->gets,
              p->flushs_found, p->flushs, p->flush_objs_found, p->flush_objs);
         if ( sum + n >= len )
@@ -2146,7 +2146,7 @@ static int tmemc_list_shared(tmem_cli_va_param_t buf, int off, uint32_t len,
              p->obj_count, p->obj_count_max,
              p->objnode_count, p->objnode_count_max,
              p->good_puts, p->puts,p->dup_puts_flushed, p->dup_puts_replaced,
-             p->no_mem_puts, 
+             p->no_mem_puts,
              p->found_gets, p->gets,
              p->flushs_found, p->flushs, p->flush_objs_found, p->flush_objs);
         if ( sum + n >= len )
@@ -2456,7 +2456,7 @@ static int tmemc_save_get_next_page(int cli_id, uint32_t pool_id,
         /* process the first one */
         pool->cur_pgp = pgp = list_entry((&pool->persistent_page_list)->next,
                          struct tmem_page_descriptor,us.pool_pers_pages);
-    } else if ( list_is_last(&pool->cur_pgp->us.pool_pers_pages, 
+    } else if ( list_is_last(&pool->cur_pgp->us.pool_pers_pages,
                              &pool->persistent_page_list) )
     {
         /* already processed the last one in the list */
@@ -2504,7 +2504,7 @@ static int tmemc_save_get_next_inv(int cli_id, tmem_cli_va_param_t buf,
         pgp = list_entry((&client->persistent_invalidated_list)->next,
                          struct tmem_page_descriptor,client_inv_pages);
         client->cur_pgp = pgp;
-    } else if ( list_is_last(&client->cur_pgp->client_inv_pages, 
+    } else if ( list_is_last(&client->cur_pgp->client_inv_pages,
                              &client->persistent_invalidated_list) )
     {
         client->cur_pgp = NULL;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v2 8/8] tmem: Spelling mistakes.
  2015-08-27 11:01 [PATCH v2] Tmem bug-fixes and cleanups Konrad Rzeszutek Wilk
                   ` (6 preceding siblings ...)
  2015-08-27 11:02 ` [PATCH v2 7/8] tmem: Remove extra spaces at end and some hard tabbing Konrad Rzeszutek Wilk
@ 2015-08-27 11:02 ` Konrad Rzeszutek Wilk
  2015-08-27 15:14   ` Jan Beulich
  2015-08-27 13:30 ` [PATCH v2] Tmem bug-fixes and cleanups Andrew Cooper
  2015-08-27 14:10 ` Wei Liu
  9 siblings, 1 reply; 32+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-08-27 11:02 UTC (permalink / raw)
  To: xen-devel, jbeulich, andrew.cooper3, wei.liu2; +Cc: Konrad Rzeszutek Wilk

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 xen/common/tmem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/common/tmem.c b/xen/common/tmem.c
index 66d2852..9bd75d8 100644
--- a/xen/common/tmem.c
+++ b/xen/common/tmem.c
@@ -2659,7 +2659,7 @@ long do_tmem_op(tmem_cli_op_t uops)
         return -EFAULT;
     }
 
-    /* Acquire wirte lock for all command at first */
+    /* Acquire write lock for all command at first */
     write_lock(&tmem_rwlock);
 
     if ( op.cmd == TMEM_CONTROL_MOVED )
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 4/8] tmem: Remove xc_tmem_control mystical arg3
  2015-08-27 11:01 ` [PATCH v2 4/8] tmem: Remove xc_tmem_control mystical arg3 Konrad Rzeszutek Wilk
@ 2015-08-27 11:15   ` Wei Liu
  2015-08-27 12:53   ` Andrew Cooper
  1 sibling, 0 replies; 32+ messages in thread
From: Wei Liu @ 2015-08-27 11:15 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, wei.liu2, jbeulich, andrew.cooper3

On Thu, Aug 27, 2015 at 07:01:59AM -0400, Konrad Rzeszutek Wilk wrote:
> It mentions it but it is never used. The hypercall interface
> knows nothing of this sort of thing either. Lets just remove it.
> 
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 1/8] tmem: Don't crash/hang/leak hypervisor when using shared pools within an guest.
  2015-08-27 11:01 ` [PATCH v2 1/8] tmem: Don't crash/hang/leak hypervisor when using shared pools within an guest Konrad Rzeszutek Wilk
@ 2015-08-27 11:21   ` Wei Liu
  0 siblings, 0 replies; 32+ messages in thread
From: Wei Liu @ 2015-08-27 11:21 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, wei.liu2, jbeulich, andrew.cooper3

On Thu, Aug 27, 2015 at 07:01:56AM -0400, Konrad Rzeszutek Wilk wrote:
[...]
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  xen/common/tmem.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/common/tmem.c b/xen/common/tmem.c
> index f2dc26e..572944e 100644
> --- a/xen/common/tmem.c
> +++ b/xen/common/tmem.c
> @@ -1037,6 +1037,9 @@ static int shared_pool_join(struct tmem_pool *pool, struct client *new_client)
>          tmem_client_info("adding new %s %d to shared pool owned by %s %d\n",
>                      tmem_client_str, new_client->cli_id, tmem_client_str,
>                      pool->client->cli_id);
> +    else if (pool->shared_count)

Coding style.

> +        tmem_client_info("inter-guest sharing of shared pool %s by client %d\n",
> +                         tmem_client_str, pool->client->cli_id);
>      ++pool->shared_count;
>      return 0;
>  }
> @@ -1056,7 +1059,10 @@ static void shared_pool_reassign(struct tmem_pool *pool)
>      }
>      old_client->pools[pool->pool_id] = NULL;
>      sl = list_entry(pool->share_list.next, struct share_list, share_list);
> -    ASSERT(sl->client != old_client);
> +    /*
> +     * The sl->client can be old_client if there are multiple shared pools
> +     * within an guest.
> +     */
>      pool->client = new_client = sl->client;
>      for (poolid = 0; poolid < MAX_POOLS_PER_DOMAIN; poolid++)
>          if (new_client->pools[poolid] == pool)
> @@ -1982,6 +1988,8 @@ static int do_tmem_new_pool(domid_t this_cli_id,
>          {
>              INIT_LIST_HEAD(&pool->share_list);
>              pool->shared_count = 0;
> +            if (shared_pool_join(pool, client))

Coding style.

> +                goto fail;
>              global_shared_pools[first_unused_s_poolid] = pool;
>          }
>      }
> -- 
> 2.1.0

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl.
  2015-08-27 11:02 ` [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl Konrad Rzeszutek Wilk
@ 2015-08-27 11:22   ` Wei Liu
  2015-08-27 13:12   ` Andrew Cooper
  2015-08-27 15:11   ` Jan Beulich
  2 siblings, 0 replies; 32+ messages in thread
From: Wei Liu @ 2015-08-27 11:22 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, wei.liu2, jbeulich, andrew.cooper3

On Thu, Aug 27, 2015 at 07:02:00AM -0400, Konrad Rzeszutek Wilk wrote:
> The operations are to be used by an control domain to set parameters,
> list pools, clients, and to be used during migration.
> 
> There is no need to have them in the tmem hypercall path.
> 
> This patch moves code without adding fixes - and in fact in
> some cases makes the parameters soo long that they hurt eyes - but
> that is for another patch.
> 
> Note that in regards to existing users:
> 
>  - Only the control domain could call it - which meant that if
>    a guest called it would get -EPERM, so we are OK there.
>    In practice no guests called this TMEM_CONTROL command.
> 
>  - The spec: https://oss.oracle.com/projects/tmem/dist/documentation/api/tmemspec-v001.pdf
>    mentions: "TBD [Not sure if this is really needed.]"
>    which is a carte blanche as any to do this!
> 
> Note: The XSM check is the same - we just move it from do_tmem_op
> to do_sysctl.
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Toolstack changes look mechanical:

Acked-by: Wei Liu <wei.liu2@citrix.com>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 4/8] tmem: Remove xc_tmem_control mystical arg3
  2015-08-27 11:01 ` [PATCH v2 4/8] tmem: Remove xc_tmem_control mystical arg3 Konrad Rzeszutek Wilk
  2015-08-27 11:15   ` Wei Liu
@ 2015-08-27 12:53   ` Andrew Cooper
  2015-08-27 18:38     ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 32+ messages in thread
From: Andrew Cooper @ 2015-08-27 12:53 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel, jbeulich, wei.liu2

On 27/08/15 12:01, Konrad Rzeszutek Wilk wrote:
> It mentions it but it is never used. The hypercall interface
> knows nothing of this sort of thing either. Lets just remove it.
>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

It would be nice if you could take the opportunity of changing every
caller to fix the style issues in affected code.  There are a huge
number of missing spaces in the parameter lists of calls to
xc_tmem_control()

~Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl.
  2015-08-27 11:02 ` [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl Konrad Rzeszutek Wilk
  2015-08-27 11:22   ` Wei Liu
@ 2015-08-27 13:12   ` Andrew Cooper
  2015-08-27 15:13     ` Jan Beulich
  2015-08-27 18:47     ` Konrad Rzeszutek Wilk
  2015-08-27 15:11   ` Jan Beulich
  2 siblings, 2 replies; 32+ messages in thread
From: Andrew Cooper @ 2015-08-27 13:12 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel, jbeulich, wei.liu2

On 27/08/15 12:02, Konrad Rzeszutek Wilk wrote:
> --- a/tools/python/xen/lowlevel/xc/xc.c
> +++ b/tools/python/xen/lowlevel/xc/xc.c
> @@ -1808,25 +1808,25 @@ static PyObject *pyxc_tmem_control(XcObject *self,
>                          &pool_id, &subop, &cli_id, &arg1, &arg2, &buf) )
>          return NULL;
>  
> -    if ( (subop == TMEMC_LIST) && (arg1 > 32768) )
> +    if ( (subop == XEN_SYSCTL_TMEM_OP_LIST) && (arg1 > 32768) )
>          arg1 = 32768;
>  
>      if ( (rc = xc_tmem_control(self->xc_handle, pool_id, subop, cli_id, arg1, arg2, buffer)) < 0 )
>          return Py_BuildValue("i", rc);
>  
>      switch (subop) {
> -        case TMEMC_LIST:
> +        case XEN_SYSCTL_TMEM_OP_LIST:
>              return Py_BuildValue("s", buffer);
> -        case TMEMC_FLUSH:
> +        case XEN_SYSCTL_TMEM_OP_FLUSH:
>              return Py_BuildValue("i", rc);
> -        case TMEMC_QUERY_FREEABLE_MB:
> +        case XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB:
>              return Py_BuildValue("i", rc);
> -        case TMEMC_THAW:
> -        case TMEMC_FREEZE:
> -        case TMEMC_DESTROY:
> -        case TMEMC_SET_WEIGHT:
> -        case TMEMC_SET_CAP:
> -        case TMEMC_SET_COMPRESS:
> +        case XEN_SYSCTL_TMEM_OP_THAW:
> +        case XEN_SYSCTL_TMEM_OP_FREEZE:
> +        case XEN_SYSCTL_TMEM_OP_DESTROY:
> +        case XEN_SYSCTL_TMEM_OP_SET_WEIGHT:
> +        case XEN_SYSCTL_TMEM_OP_SET_CAP:
> +        case XEN_SYSCTL_TMEM_OP_SET_COMPRESS:

Any case statements falling through into default like this can simply be
dropped.

> @@ -2555,77 +2556,72 @@ static int tmemc_restore_flush_page(int cli_id, uint32_t pool_id, struct oid *oi
>      return do_tmem_flush_page(pool,oidp,index);
>  }
>  
> -static int do_tmem_control(struct tmem_op *op)
> +int tmem_control(struct xen_sysctl_tmem_op *op)
>  {
>      int ret;
>      uint32_t pool_id = op->pool_id;
> -    uint32_t subop = op->u.ctrl.subop;
> -    struct oid *oidp = (struct oid *)(&op->u.ctrl.oid[0]);
> +    uint32_t cmd = op->cmd;
> +    struct oid *oidp = (struct oid *)(&op->oid[0]);

Please put a struct xen_sysctl_tmem_oid definition in sysctl.h and avoid
this typesystem abuse.

> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -710,6 +710,48 @@ struct xen_sysctl_psr_cat_op {
>  typedef struct xen_sysctl_psr_cat_op xen_sysctl_psr_cat_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cat_op_t);
>  
> +#define XEN_SYSCTL_TMEM_OP_ALL_CLIENTS 0xFFFFU
> +
> +#define XEN_SYSCTL_TMEM_OP_THAW                   0
> +#define XEN_SYSCTL_TMEM_OP_FREEZE                 1
> +#define XEN_SYSCTL_TMEM_OP_FLUSH                  2
> +#define XEN_SYSCTL_TMEM_OP_DESTROY                3
> +#define XEN_SYSCTL_TMEM_OP_LIST                   4
> +#define XEN_SYSCTL_TMEM_OP_SET_WEIGHT             5
> +#define XEN_SYSCTL_TMEM_OP_SET_CAP                6
> +#define XEN_SYSCTL_TMEM_OP_SET_COMPRESS           7
> +#define XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB      8
> +#define XEN_SYSCTL_TMEM_OP_SAVE_BEGIN             10
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION       11
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_MAXPOOLS      12
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_WEIGHT 13
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_CAP    14
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_FLAGS  15
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_FLAGS    16
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_NPAGES   17
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_UUID     18
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_PAGE     19
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_INV      20
> +#define XEN_SYSCTL_TMEM_OP_SAVE_END               21
> +#define XEN_SYSCTL_TMEM_OP_RESTORE_BEGIN          30
> +#define XEN_SYSCTL_TMEM_OP_RESTORE_PUT_PAGE       32
> +#define XEN_SYSCTL_TMEM_OP_RESTORE_FLUSH_PAGE     33
> +
> +struct xen_sysctl_tmem_op {
> +    uint32_t cmd;       /* IN: XEN_SYSCTL_TMEM_OP_* . */
> +    int32_t pool_id;    /* IN: 0 by default unless _SAVE_*, RESTORE_* .*/
> +    uint32_t cli_id;    /* IN: client id, 0 for XEN_SYSCTL_TMEM_QUERY_FREEABLE_MB
> +                           for all others can be the domain id or
> +                           XEN_SYSCTL_TMEM_OP_ALL_CLIENTS for all. */
> +    uint32_t arg1;      /* IN: If not applicable to command use 0. */
> +    uint32_t arg2;      /* IN: If not applicable to command use 0. */

Please can this interface be fixed as part of the move, even if it is in
subsequent patches for clarity.

Part of the issue with the XSA-17 security audit was that these args are
completely opaque.

> +    uint8_t  pad[4];    /* Padding so structure is the same under 32 and 64. */

probably better to use a uint32_t here.

> +    uint64_t oid[3];    /* IN: If not applicable to command use 0. */
> +    XEN_GUEST_HANDLE_64(char) buf; /* IN/OUT: Buffer to save and restore ops. */
> +};
> +typedef struct xen_sysctl_tmem_op xen_sysctl_tmem_op_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_tmem_op_t);
> +
>  struct xen_sysctl {
>      uint32_t cmd;
>  #define XEN_SYSCTL_readconsole                    1
> @@ -734,6 +776,7 @@ struct xen_sysctl {
>  #define XEN_SYSCTL_psr_cmt_op                    21
>  #define XEN_SYSCTL_pcitopoinfo                   22
>  #define XEN_SYSCTL_psr_cat_op                    23
> +#define XEN_SYSCTL_tmem_op                       24
>      uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
>      union {
>          struct xen_sysctl_readconsole       readconsole;
> @@ -758,6 +801,7 @@ struct xen_sysctl {
>          struct xen_sysctl_coverage_op       coverage_op;
>          struct xen_sysctl_psr_cmt_op        psr_cmt_op;
>          struct xen_sysctl_psr_cat_op        psr_cat_op;
> +        struct xen_sysctl_tmem_op           tmem_op;
>          uint8_t                             pad[128];
>      } u;
>  };
> diff --git a/xen/include/public/tmem.h b/xen/include/public/tmem.h
> index 4fd2fc6..208f5a6 100644
> --- a/xen/include/public/tmem.h
> +++ b/xen/include/public/tmem.h
> @@ -33,7 +33,7 @@
>  #define TMEM_SPEC_VERSION          1
>  
>  /* Commands to HYPERVISOR_tmem_op() */
> -#define TMEM_CONTROL               0
> +#define TMEM_CONTROL_MOVED         0

If anything, this should be deleted as opposed to being renamed.  The
_MOVED suffix is confusing, as TMEM_CONTROL_MOVED sounds like it could
be a plausible TMEM operation.

>  #define TMEM_NEW_POOL              1
>  #define TMEM_DESTROY_POOL          2
>  #define TMEM_PUT_PAGE              4
> @@ -48,35 +48,9 @@
>  #endif
>  
>  /* Privileged commands to HYPERVISOR_tmem_op() */
> -#define TMEM_AUTH                 101 
> +#define TMEM_AUTH                 101
>  #define TMEM_RESTORE_NEW          102
>  
> -/* Subops for HYPERVISOR_tmem_op(TMEM_CONTROL) */
> -#define TMEMC_THAW                   0
> -#define TMEMC_FREEZE                 1
> -#define TMEMC_FLUSH                  2
> -#define TMEMC_DESTROY                3
> -#define TMEMC_LIST                   4
> -#define TMEMC_SET_WEIGHT             5
> -#define TMEMC_SET_CAP                6
> -#define TMEMC_SET_COMPRESS           7
> -#define TMEMC_QUERY_FREEABLE_MB      8
> -#define TMEMC_SAVE_BEGIN             10
> -#define TMEMC_SAVE_GET_VERSION       11
> -#define TMEMC_SAVE_GET_MAXPOOLS      12
> -#define TMEMC_SAVE_GET_CLIENT_WEIGHT 13
> -#define TMEMC_SAVE_GET_CLIENT_CAP    14
> -#define TMEMC_SAVE_GET_CLIENT_FLAGS  15
> -#define TMEMC_SAVE_GET_POOL_FLAGS    16
> -#define TMEMC_SAVE_GET_POOL_NPAGES   17
> -#define TMEMC_SAVE_GET_POOL_UUID     18
> -#define TMEMC_SAVE_GET_NEXT_PAGE     19
> -#define TMEMC_SAVE_GET_NEXT_INV      20
> -#define TMEMC_SAVE_END               21
> -#define TMEMC_RESTORE_BEGIN          30
> -#define TMEMC_RESTORE_PUT_PAGE       32
> -#define TMEMC_RESTORE_FLUSH_PAGE     33
> -
>  /* Bits for HYPERVISOR_tmem_op(TMEM_NEW_POOL) */
>  #define TMEM_POOL_PERSIST          1
>  #define TMEM_POOL_SHARED           2
> @@ -110,16 +84,7 @@ struct tmem_op {
>              uint32_t flags;
>              uint32_t arg1;
>          } creat; /* for cmd == TMEM_NEW_POOL, TMEM_AUTH, TMEM_RESTORE_NEW */
> -        struct { 
> -            uint32_t subop;
> -            uint32_t cli_id;
> -            uint32_t arg1;
> -            uint32_t arg2;
> -            uint64_t oid[3];
> -            tmem_cli_va_t buf;
> -        } ctrl; /* for cmd == TMEM_CONTROL */
>          struct {
> -            
>              uint64_t oid[3];
>              uint32_t index;
>              uint32_t tmem_offset;
> diff --git a/xen/include/xen/tmem.h b/xen/include/xen/tmem.h
> index 5dbf9d5..32a542a 100644
> --- a/xen/include/xen/tmem.h
> +++ b/xen/include/xen/tmem.h
> @@ -9,6 +9,9 @@
>  #ifndef __XEN_TMEM_H__
>  #define __XEN_TMEM_H__
>  
> +struct xen_sysctl_tmem_op;

#include<public/sysctl.h> please.

~Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 7/8] tmem: Remove extra spaces at end and some hard tabbing.
  2015-08-27 11:02 ` [PATCH v2 7/8] tmem: Remove extra spaces at end and some hard tabbing Konrad Rzeszutek Wilk
@ 2015-08-27 13:19   ` Andrew Cooper
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Cooper @ 2015-08-27 13:19 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel, jbeulich, wei.liu2

On 27/08/15 12:02, Konrad Rzeszutek Wilk wrote:
> @@ -1559,7 +1559,7 @@ refind:
>          {
>              /* no puts allowed into a frozen pool (except dup puts) */
>              if ( client->frozen )
> -	        goto unlock_obj;
> +	            goto unlock_obj;

Need to lose 3 spaces here.

>          }
>      }
>      else
> @@ -1572,10 +1572,10 @@ refind:
>  
>          write_lock(&pool->pool_rwlock);
>          /*
> -	 * Parallel callers may already allocated obj and inserted to obj_rb_root
> -	 * before us.
> -	 */
> -        if (!obj_rb_insert(&pool->obj_rb_root[oid_hash(oidp)], obj))
> +         * Parallel callers may already allocated obj and inserted to obj_rb_root
> +         * before us.
> +         */
> +        if ( !obj_rb_insert(&pool->obj_rb_root[oid_hash(oidp)], obj) )
>          {
>              tmem_free(obj, pool);
>              write_unlock(&pool->pool_rwlock);
> @@ -1945,7 +1945,7 @@ static int do_tmem_new_pool(domid_t this_cli_id,
>                       (client->shared_auth_uuid[i][1] == uuid_hi) )
>                      break;
>              if ( i == MAX_GLOBAL_SHARED_POOLS )
> -	    {
> +	        {

And here.

~Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2] Tmem bug-fixes and cleanups.
  2015-08-27 11:01 [PATCH v2] Tmem bug-fixes and cleanups Konrad Rzeszutek Wilk
                   ` (7 preceding siblings ...)
  2015-08-27 11:02 ` [PATCH v2 8/8] tmem: Spelling mistakes Konrad Rzeszutek Wilk
@ 2015-08-27 13:30 ` Andrew Cooper
  2015-08-27 14:10 ` Wei Liu
  9 siblings, 0 replies; 32+ messages in thread
From: Andrew Cooper @ 2015-08-27 13:30 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel, jbeulich, wei.liu2

On 27/08/15 12:01, Konrad Rzeszutek Wilk wrote:
> Hey!
>
> At the Xenhackathon we spoke that the tmem code needs a bit of cleanups
> and simplification. One of the things that Andrew mentioned was that the
> TMEM_CONTROL should really be part of the sysctl hypercall. As I ventured
> this path I realized there were some other issues that need to be taken
> care of (like shared pools blowing up).
>
> This patchset has been tested with 32/64 guests, with a 64 hypervisor
> and 32 bit toolstack (and also with 64bit toolstack) with success.
>
> For fun I've also created an Linux module:
> http://xenbits.xen.org/gitweb/?p=xentesttools/bootstrap.git;a=blob;f=root_image/drivers/tmem_test/tmem_test.c
> that I will expand to cover in the future more interesting hypercall
> uses.
>
> Going forward the next step will be to move the 'tmem_control' function to
> its own file to simplify the code.
>
> The patches are also in my git tree:

Patches 1-4, 6-8 are definitely a net improvement, so Reviewed-by:
Andrew Cooper <andrew.cooper3@citrix.com>, with the proviso that I am
not knowlegable about tmem internals.

~Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2] Tmem bug-fixes and cleanups.
  2015-08-27 11:01 [PATCH v2] Tmem bug-fixes and cleanups Konrad Rzeszutek Wilk
                   ` (8 preceding siblings ...)
  2015-08-27 13:30 ` [PATCH v2] Tmem bug-fixes and cleanups Andrew Cooper
@ 2015-08-27 14:10 ` Wei Liu
  9 siblings, 0 replies; 32+ messages in thread
From: Wei Liu @ 2015-08-27 14:10 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, wei.liu2, jbeulich, andrew.cooper3

On Thu, Aug 27, 2015 at 07:01:55AM -0400, Konrad Rzeszutek Wilk wrote:
> Hey!
> 
> At the Xenhackathon we spoke that the tmem code needs a bit of cleanups
> and simplification. One of the things that Andrew mentioned was that the
> TMEM_CONTROL should really be part of the sysctl hypercall. As I ventured
> this path I realized there were some other issues that need to be taken
> care of (like shared pools blowing up).
> 
> This patchset has been tested with 32/64 guests, with a 64 hypervisor
> and 32 bit toolstack (and also with 64bit toolstack) with success.
> 
> For fun I've also created an Linux module:
> http://xenbits.xen.org/gitweb/?p=xentesttools/bootstrap.git;a=blob;f=root_image/drivers/tmem_test/tmem_test.c
> that I will expand to cover in the future more interesting hypercall
> uses.
> 
> Going forward the next step will be to move the 'tmem_control' function to
> its own file to simplify the code.
> 
> The patches are also in my git tree:
> 
> git://xenbits.xen.org/people/konradwilk/xen.git for-4.6/tmem.cleanups.v2
> 
>  tools/libxc/include/xenctrl.h          |   2 +-
>  tools/libxc/xc_tmem.c                  | 111 ++++++++++----------
>  tools/libxl/libxl.c                    |  22 ++--
>  tools/python/xen/lowlevel/xc/xc.c      |  27 +++--
>  tools/xenstat/libxenstat/src/xenstat.c |   6 +-
>  xen/common/sysctl.c                    |   7 +-
>  xen/common/tmem.c                      | 183 +++++++++++++++++----------------
>  xen/include/public/sysctl.h            |  44 ++++++++
>  xen/include/public/tmem.h              |  39 +------
>  xen/include/xen/tmem.h                 |   3 +
>  xen/include/xen/tmem_xen.h             |   1 -
>  xen/include/xsm/dummy.h                |   6 --
>  xen/include/xsm/xsm.h                  |   6 --
>  xen/xsm/dummy.c                        |   1 -
>  xen/xsm/flask/hooks.c                  |   9 +-
>  xen/xsm/flask/policy/access_vectors    |   2 +-
>  16 files changed, 237 insertions(+), 232 deletions(-)
> Konrad Rzeszutek Wilk (8):
>       tmem: Don't crash/hang/leak hypervisor when using shared pools within an guest.
>       tmem: Add ASSERT in obj_rb_insert for pool->rwlock lock.
>       tmem: remove in xc_tmem_control_oid duplicate set_xen_guest_handle call
>       tmem: Remove xc_tmem_control mystical arg3
>       tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl.
>       tmem: Remove the old tmem control XSM checks as it is part of sysctl hypercall.
>       tmem: Remove extra spaces at end and some hard tabbing.
>       tmem: Spelling mistakes.

This series is pretty isolated to TMEM. I'm in principle happy to let
this series go in:

Release-acked-by: Wei Liu <wei.liu2@citrix.com>

The only one that needs more careful thought is #5, I'm going to leave
that to HV maintainers.

Wei.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 2/8] tmem: Add ASSERT in obj_rb_insert for pool->rwlock lock.
  2015-08-27 11:01 ` [PATCH v2 2/8] tmem: Add ASSERT in obj_rb_insert for pool->rwlock lock Konrad Rzeszutek Wilk
@ 2015-08-27 14:55   ` Jan Beulich
  0 siblings, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2015-08-27 14:55 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: andrew.cooper3, wei.liu2, xen-devel

>>> On 27.08.15 at 13:01, <konrad.wilk@oracle.com> wrote:
> --- a/xen/common/tmem.c
> +++ b/xen/common/tmem.c
> @@ -915,6 +915,11 @@ static int obj_rb_insert(struct rb_root *root, struct 
> tmem_object_root *obj)
>  {
>      struct rb_node **new, *parent = NULL;
>      struct tmem_object_root *this;
> +    struct tmem_pool *pool;
> +
> +    pool = obj->pool;
> +    ASSERT(pool != NULL);
> +    ASSERT_WRITELOCK(&pool->pool_rwlock);

Considering the new variable is used only in ASSERT()s, I'd
recommend against its introduction. But it's your code ...

Jan

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl.
  2015-08-27 11:02 ` [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl Konrad Rzeszutek Wilk
  2015-08-27 11:22   ` Wei Liu
  2015-08-27 13:12   ` Andrew Cooper
@ 2015-08-27 15:11   ` Jan Beulich
  2015-08-27 18:43     ` Konrad Rzeszutek Wilk
  2 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2015-08-27 15:11 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: andrew.cooper3, wei.liu2, xen-devel

>>> On 27.08.15 at 13:02, <konrad.wilk@oracle.com> wrote:
> @@ -2666,9 +2662,9 @@ long do_tmem_op(tmem_cli_op_t uops)
>      /* Acquire wirte lock for all command at first */
>      write_lock(&tmem_rwlock);
>  
> -    if ( op.cmd == TMEM_CONTROL )
> +    if ( op.cmd == TMEM_CONTROL_MOVED )
>      {
> -        rc = do_tmem_control(&op);
> +        rc = -ENOSYS;

As in other similar cases I'd prefer -EOPNOTSUPP here to prevent
callers from implying the tmem hypercall as a whole is unimplemented.

> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -710,6 +710,48 @@ struct xen_sysctl_psr_cat_op {
>  typedef struct xen_sysctl_psr_cat_op xen_sysctl_psr_cat_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cat_op_t);
>  
> +#define XEN_SYSCTL_TMEM_OP_ALL_CLIENTS 0xFFFFU
> +
> +#define XEN_SYSCTL_TMEM_OP_THAW                   0
> +#define XEN_SYSCTL_TMEM_OP_FREEZE                 1
> +#define XEN_SYSCTL_TMEM_OP_FLUSH                  2
> +#define XEN_SYSCTL_TMEM_OP_DESTROY                3
> +#define XEN_SYSCTL_TMEM_OP_LIST                   4
> +#define XEN_SYSCTL_TMEM_OP_SET_WEIGHT             5
> +#define XEN_SYSCTL_TMEM_OP_SET_CAP                6
> +#define XEN_SYSCTL_TMEM_OP_SET_COMPRESS           7
> +#define XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB      8
> +#define XEN_SYSCTL_TMEM_OP_SAVE_BEGIN             10
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION       11
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_MAXPOOLS      12
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_WEIGHT 13
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_CAP    14
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_FLAGS  15
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_FLAGS    16
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_NPAGES   17
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_UUID     18
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_PAGE     19
> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_INV      20
> +#define XEN_SYSCTL_TMEM_OP_SAVE_END               21
> +#define XEN_SYSCTL_TMEM_OP_RESTORE_BEGIN          30
> +#define XEN_SYSCTL_TMEM_OP_RESTORE_PUT_PAGE       32
> +#define XEN_SYSCTL_TMEM_OP_RESTORE_FLUSH_PAGE     33

Perhaps better to have all these in tmem.h, to not clutter this
header?

> +struct xen_sysctl_tmem_op {
> +    uint32_t cmd;       /* IN: XEN_SYSCTL_TMEM_OP_* . */
> +    int32_t pool_id;    /* IN: 0 by default unless _SAVE_*, RESTORE_* .*/
> +    uint32_t cli_id;    /* IN: client id, 0 for XEN_SYSCTL_TMEM_QUERY_FREEABLE_MB
> +                           for all others can be the domain id or
> +                           XEN_SYSCTL_TMEM_OP_ALL_CLIENTS for all. */
> +    uint32_t arg1;      /* IN: If not applicable to command use 0. */
> +    uint32_t arg2;      /* IN: If not applicable to command use 0. */
> +    uint8_t  pad[4];    /* Padding so structure is the same under 32 and 64. */

uint32_t please. And despite this being an (easily changeable) sysctl,
verifying that it's zero on input would be nice.

> --- a/xen/include/public/tmem.h
> +++ b/xen/include/public/tmem.h
> @@ -33,7 +33,7 @@
>  #define TMEM_SPEC_VERSION          1
>  
>  /* Commands to HYPERVISOR_tmem_op() */
> -#define TMEM_CONTROL               0
> +#define TMEM_CONTROL_MOVED         0

Perhaps say where it moved in a brief comment?

> --- a/xen/xsm/flask/hooks.c
> +++ b/xen/xsm/flask/hooks.c
> @@ -761,6 +761,9 @@ static int flask_sysctl(int cmd)
>      case XEN_SYSCTL_tbuf_op:
>          return domain_has_xen(current->domain, XEN__TBUFCONTROL);
>  
> +    case XEN_SYSCTL_tmem_op:
> +        return domain_has_xen(current->domain, XEN__TMEM_CONTROL);
> +
>      case XEN_SYSCTL_sched_id:
>          return domain_has_xen(current->domain, XEN__GETSCHEDULER);

Hmm, these cases appear to be roughly sorted numerically, i.e.
yours would normally go at the end.

Despite the comments I see no reason for this (once adjusted where
needed) not to go in for 4.6.

Jan

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl.
  2015-08-27 13:12   ` Andrew Cooper
@ 2015-08-27 15:13     ` Jan Beulich
  2015-08-27 18:47     ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2015-08-27 15:13 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, wei.liu2

>>> On 27.08.15 at 15:12, <andrew.cooper3@citrix.com> wrote:
> On 27/08/15 12:02, Konrad Rzeszutek Wilk wrote:
>> --- a/xen/include/xen/tmem.h
>> +++ b/xen/include/xen/tmem.h
>> @@ -9,6 +9,9 @@
>>  #ifndef __XEN_TMEM_H__
>>  #define __XEN_TMEM_H__
>>  
>> +struct xen_sysctl_tmem_op;
> 
> #include<public/sysctl.h> please.

Why? Forward declarations like this help not including all headers
everywhere.

Jan

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 8/8] tmem: Spelling mistakes.
  2015-08-27 11:02 ` [PATCH v2 8/8] tmem: Spelling mistakes Konrad Rzeszutek Wilk
@ 2015-08-27 15:14   ` Jan Beulich
  0 siblings, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2015-08-27 15:14 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: andrew.cooper3, wei.liu2, xen-devel

>>> On 27.08.15 at 13:02, <konrad.wilk@oracle.com> wrote:
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  xen/common/tmem.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/common/tmem.c b/xen/common/tmem.c
> index 66d2852..9bd75d8 100644
> --- a/xen/common/tmem.c
> +++ b/xen/common/tmem.c
> @@ -2659,7 +2659,7 @@ long do_tmem_op(tmem_cli_op_t uops)
>          return -EFAULT;
>      }
>  
> -    /* Acquire wirte lock for all command at first */
> +    /* Acquire write lock for all command at first */

Mind adding the missing full stop at once?

Jan

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 4/8] tmem: Remove xc_tmem_control mystical arg3
  2015-08-27 12:53   ` Andrew Cooper
@ 2015-08-27 18:38     ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 32+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-08-27 18:38 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, wei.liu2, jbeulich

On Thu, Aug 27, 2015 at 01:53:17PM +0100, Andrew Cooper wrote:
> On 27/08/15 12:01, Konrad Rzeszutek Wilk wrote:
> > It mentions it but it is never used. The hypercall interface
> > knows nothing of this sort of thing either. Lets just remove it.
> >
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> 
> It would be nice if you could take the opportunity of changing every
> caller to fix the style issues in affected code.  There are a huge
> number of missing spaces in the parameter lists of calls to
> xc_tmem_control()

I was soo tempted to do it - but I've been hammered in the rule of
'one logical change per patch' so abstained from mission creep.

Do not fear - I will send out another patch that will fix this up :-)
> 
> ~Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl.
  2015-08-27 15:11   ` Jan Beulich
@ 2015-08-27 18:43     ` Konrad Rzeszutek Wilk
  2015-08-27 19:02       ` Andrew Cooper
  2015-08-28  7:03       ` Jan Beulich
  0 siblings, 2 replies; 32+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-08-27 18:43 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3, wei.liu2, xen-devel

On Thu, Aug 27, 2015 at 09:11:35AM -0600, Jan Beulich wrote:
> >>> On 27.08.15 at 13:02, <konrad.wilk@oracle.com> wrote:
> > @@ -2666,9 +2662,9 @@ long do_tmem_op(tmem_cli_op_t uops)
> >      /* Acquire wirte lock for all command at first */
> >      write_lock(&tmem_rwlock);
> >  
> > -    if ( op.cmd == TMEM_CONTROL )
> > +    if ( op.cmd == TMEM_CONTROL_MOVED )
> >      {
> > -        rc = do_tmem_control(&op);
> > +        rc = -ENOSYS;
> 
> As in other similar cases I'd prefer -EOPNOTSUPP here to prevent
> callers from implying the tmem hypercall as a whole is unimplemented.
> 
> > --- a/xen/include/public/sysctl.h
> > +++ b/xen/include/public/sysctl.h
> > @@ -710,6 +710,48 @@ struct xen_sysctl_psr_cat_op {
> >  typedef struct xen_sysctl_psr_cat_op xen_sysctl_psr_cat_op_t;
> >  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cat_op_t);
> >  
> > +#define XEN_SYSCTL_TMEM_OP_ALL_CLIENTS 0xFFFFU
> > +
> > +#define XEN_SYSCTL_TMEM_OP_THAW                   0
> > +#define XEN_SYSCTL_TMEM_OP_FREEZE                 1
> > +#define XEN_SYSCTL_TMEM_OP_FLUSH                  2
> > +#define XEN_SYSCTL_TMEM_OP_DESTROY                3
> > +#define XEN_SYSCTL_TMEM_OP_LIST                   4
> > +#define XEN_SYSCTL_TMEM_OP_SET_WEIGHT             5
> > +#define XEN_SYSCTL_TMEM_OP_SET_CAP                6
> > +#define XEN_SYSCTL_TMEM_OP_SET_COMPRESS           7
> > +#define XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB      8
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_BEGIN             10
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION       11
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_MAXPOOLS      12
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_WEIGHT 13
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_CAP    14
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_FLAGS  15
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_FLAGS    16
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_NPAGES   17
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_UUID     18
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_PAGE     19
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_INV      20
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_END               21
> > +#define XEN_SYSCTL_TMEM_OP_RESTORE_BEGIN          30
> > +#define XEN_SYSCTL_TMEM_OP_RESTORE_PUT_PAGE       32
> > +#define XEN_SYSCTL_TMEM_OP_RESTORE_FLUSH_PAGE     33
> 
> Perhaps better to have all these in tmem.h, to not clutter this
> header?

Yes and no. The other sysctl had their #defines for commands here so I figured I
would follow that rule. But I am OK keeping it in tmem.h and just do

/* For the 'cmd' values consule tmem.h. Look for XEN_SYSCTL_TMEM_OP_* */
> 
> > +struct xen_sysctl_tmem_op {
> > +    uint32_t cmd;       /* IN: XEN_SYSCTL_TMEM_OP_* . */
> > +    int32_t pool_id;    /* IN: 0 by default unless _SAVE_*, RESTORE_* .*/
> > +    uint32_t cli_id;    /* IN: client id, 0 for XEN_SYSCTL_TMEM_QUERY_FREEABLE_MB
> > +                           for all others can be the domain id or
> > +                           XEN_SYSCTL_TMEM_OP_ALL_CLIENTS for all. */
> > +    uint32_t arg1;      /* IN: If not applicable to command use 0. */
> > +    uint32_t arg2;      /* IN: If not applicable to command use 0. */
> > +    uint8_t  pad[4];    /* Padding so structure is the same under 32 and 64. */
> 
> uint32_t please. And despite this being an (easily changeable) sysctl,
> verifying that it's zero on input would be nice.

Yes! That was what I forgotten!
> 
> > --- a/xen/include/public/tmem.h
> > +++ b/xen/include/public/tmem.h
> > @@ -33,7 +33,7 @@
> >  #define TMEM_SPEC_VERSION          1
> >  
> >  /* Commands to HYPERVISOR_tmem_op() */
> > -#define TMEM_CONTROL               0
> > +#define TMEM_CONTROL_MOVED         0
> 
> Perhaps say where it moved in a brief comment?

Yes, good idea.

/* Deprecated. These operations are done now via XEN_SYSCTL_tmem_op
 * using the sysctl hypercall. */

> 
> > --- a/xen/xsm/flask/hooks.c
> > +++ b/xen/xsm/flask/hooks.c
> > @@ -761,6 +761,9 @@ static int flask_sysctl(int cmd)
> >      case XEN_SYSCTL_tbuf_op:
> >          return domain_has_xen(current->domain, XEN__TBUFCONTROL);
> >  
> > +    case XEN_SYSCTL_tmem_op:
> > +        return domain_has_xen(current->domain, XEN__TMEM_CONTROL);
> > +
> >      case XEN_SYSCTL_sched_id:
> >          return domain_has_xen(current->domain, XEN__GETSCHEDULER);
> 
> Hmm, these cases appear to be roughly sorted numerically, i.e.
> yours would normally go at the end.

I recall a comment from Andrew asking the newly introduced commands to
be in alphabetical order. But perhaps that was for domctl which is more
.. ah.. random?

Anyhow will make it in numerical order.
> 
> Despite the comments I see no reason for this (once adjusted where
> needed) not to go in for 4.6.

Thank you.
> 
> Jan
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl.
  2015-08-27 13:12   ` Andrew Cooper
  2015-08-27 15:13     ` Jan Beulich
@ 2015-08-27 18:47     ` Konrad Rzeszutek Wilk
  2015-08-27 18:58       ` Andrew Cooper
  1 sibling, 1 reply; 32+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-08-27 18:47 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, wei.liu2, jbeulich

On Thu, Aug 27, 2015 at 02:12:35PM +0100, Andrew Cooper wrote:
> On 27/08/15 12:02, Konrad Rzeszutek Wilk wrote:
> > --- a/tools/python/xen/lowlevel/xc/xc.c
> > +++ b/tools/python/xen/lowlevel/xc/xc.c
> > @@ -1808,25 +1808,25 @@ static PyObject *pyxc_tmem_control(XcObject *self,
> >                          &pool_id, &subop, &cli_id, &arg1, &arg2, &buf) )
> >          return NULL;
> >  
> > -    if ( (subop == TMEMC_LIST) && (arg1 > 32768) )
> > +    if ( (subop == XEN_SYSCTL_TMEM_OP_LIST) && (arg1 > 32768) )
> >          arg1 = 32768;
> >  
> >      if ( (rc = xc_tmem_control(self->xc_handle, pool_id, subop, cli_id, arg1, arg2, buffer)) < 0 )
> >          return Py_BuildValue("i", rc);
> >  
> >      switch (subop) {
> > -        case TMEMC_LIST:
> > +        case XEN_SYSCTL_TMEM_OP_LIST:
> >              return Py_BuildValue("s", buffer);
> > -        case TMEMC_FLUSH:
> > +        case XEN_SYSCTL_TMEM_OP_FLUSH:
> >              return Py_BuildValue("i", rc);
> > -        case TMEMC_QUERY_FREEABLE_MB:
> > +        case XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB:
> >              return Py_BuildValue("i", rc);
> > -        case TMEMC_THAW:
> > -        case TMEMC_FREEZE:
> > -        case TMEMC_DESTROY:
> > -        case TMEMC_SET_WEIGHT:
> > -        case TMEMC_SET_CAP:
> > -        case TMEMC_SET_COMPRESS:
> > +        case XEN_SYSCTL_TMEM_OP_THAW:
> > +        case XEN_SYSCTL_TMEM_OP_FREEZE:
> > +        case XEN_SYSCTL_TMEM_OP_DESTROY:
> > +        case XEN_SYSCTL_TMEM_OP_SET_WEIGHT:
> > +        case XEN_SYSCTL_TMEM_OP_SET_CAP:
> > +        case XEN_SYSCTL_TMEM_OP_SET_COMPRESS:
> 
> Any case statements falling through into default like this can simply be
> dropped.

Lets do that in another patch. This patch is just to move the operation
from one hypercall to another - with the minimum amount of changes.

I will gladly modify the case statements in subsequent patches.

> 
> > @@ -2555,77 +2556,72 @@ static int tmemc_restore_flush_page(int cli_id, uint32_t pool_id, struct oid *oi
> >      return do_tmem_flush_page(pool,oidp,index);
> >  }
> >  
> > -static int do_tmem_control(struct tmem_op *op)
> > +int tmem_control(struct xen_sysctl_tmem_op *op)
> >  {
> >      int ret;
> >      uint32_t pool_id = op->pool_id;
> > -    uint32_t subop = op->u.ctrl.subop;
> > -    struct oid *oidp = (struct oid *)(&op->u.ctrl.oid[0]);
> > +    uint32_t cmd = op->cmd;
> > +    struct oid *oidp = (struct oid *)(&op->oid[0]);
> 
> Please put a struct xen_sysctl_tmem_oid definition in sysctl.h and avoid
> this typesystem abuse.

Again, I tried to make this move as minimal as possible and not
introduce other sensible changes - and defer them to later patches.


> 
> > --- a/xen/include/public/sysctl.h
> > +++ b/xen/include/public/sysctl.h
> > @@ -710,6 +710,48 @@ struct xen_sysctl_psr_cat_op {
> >  typedef struct xen_sysctl_psr_cat_op xen_sysctl_psr_cat_op_t;
> >  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cat_op_t);
> >  
> > +#define XEN_SYSCTL_TMEM_OP_ALL_CLIENTS 0xFFFFU
> > +
> > +#define XEN_SYSCTL_TMEM_OP_THAW                   0
> > +#define XEN_SYSCTL_TMEM_OP_FREEZE                 1
> > +#define XEN_SYSCTL_TMEM_OP_FLUSH                  2
> > +#define XEN_SYSCTL_TMEM_OP_DESTROY                3
> > +#define XEN_SYSCTL_TMEM_OP_LIST                   4
> > +#define XEN_SYSCTL_TMEM_OP_SET_WEIGHT             5
> > +#define XEN_SYSCTL_TMEM_OP_SET_CAP                6
> > +#define XEN_SYSCTL_TMEM_OP_SET_COMPRESS           7
> > +#define XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB      8
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_BEGIN             10
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION       11
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_MAXPOOLS      12
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_WEIGHT 13
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_CAP    14
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_FLAGS  15
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_FLAGS    16
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_NPAGES   17
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_UUID     18
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_PAGE     19
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_INV      20
> > +#define XEN_SYSCTL_TMEM_OP_SAVE_END               21
> > +#define XEN_SYSCTL_TMEM_OP_RESTORE_BEGIN          30
> > +#define XEN_SYSCTL_TMEM_OP_RESTORE_PUT_PAGE       32
> > +#define XEN_SYSCTL_TMEM_OP_RESTORE_FLUSH_PAGE     33
> > +
> > +struct xen_sysctl_tmem_op {
> > +    uint32_t cmd;       /* IN: XEN_SYSCTL_TMEM_OP_* . */
> > +    int32_t pool_id;    /* IN: 0 by default unless _SAVE_*, RESTORE_* .*/
> > +    uint32_t cli_id;    /* IN: client id, 0 for XEN_SYSCTL_TMEM_QUERY_FREEABLE_MB
> > +                           for all others can be the domain id or
> > +                           XEN_SYSCTL_TMEM_OP_ALL_CLIENTS for all. */
> > +    uint32_t arg1;      /* IN: If not applicable to command use 0. */
> > +    uint32_t arg2;      /* IN: If not applicable to command use 0. */
> 
> Please can this interface be fixed as part of the move, even if it is in
> subsequent patches for clarity.

I will gladly fix this interface in further patches. By all means!
> 
> Part of the issue with the XSA-17 security audit was that these args are
> completely opaque.
> 
> > +    uint8_t  pad[4];    /* Padding so structure is the same under 32 and 64. */
> 
> probably better to use a uint32_t here.

Yes.
> 
> > +    uint64_t oid[3];    /* IN: If not applicable to command use 0. */
> > +    XEN_GUEST_HANDLE_64(char) buf; /* IN/OUT: Buffer to save and restore ops. */
> > +};
> > +typedef struct xen_sysctl_tmem_op xen_sysctl_tmem_op_t;
> > +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_tmem_op_t);
> > +
> >  struct xen_sysctl {
> >      uint32_t cmd;
> >  #define XEN_SYSCTL_readconsole                    1
> > @@ -734,6 +776,7 @@ struct xen_sysctl {
> >  #define XEN_SYSCTL_psr_cmt_op                    21
> >  #define XEN_SYSCTL_pcitopoinfo                   22
> >  #define XEN_SYSCTL_psr_cat_op                    23
> > +#define XEN_SYSCTL_tmem_op                       24
> >      uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
> >      union {
> >          struct xen_sysctl_readconsole       readconsole;
> > @@ -758,6 +801,7 @@ struct xen_sysctl {
> >          struct xen_sysctl_coverage_op       coverage_op;
> >          struct xen_sysctl_psr_cmt_op        psr_cmt_op;
> >          struct xen_sysctl_psr_cat_op        psr_cat_op;
> > +        struct xen_sysctl_tmem_op           tmem_op;
> >          uint8_t                             pad[128];
> >      } u;
> >  };
> > diff --git a/xen/include/public/tmem.h b/xen/include/public/tmem.h
> > index 4fd2fc6..208f5a6 100644
> > --- a/xen/include/public/tmem.h
> > +++ b/xen/include/public/tmem.h
> > @@ -33,7 +33,7 @@
> >  #define TMEM_SPEC_VERSION          1
> >  
> >  /* Commands to HYPERVISOR_tmem_op() */
> > -#define TMEM_CONTROL               0
> > +#define TMEM_CONTROL_MOVED         0
> 
> If anything, this should be deleted as opposed to being renamed.  The
> _MOVED suffix is confusing, as TMEM_CONTROL_MOVED sounds like it could
> be a plausible TMEM operation.

Ah, will do what Jan asked and just put a big comment. And keep the old
name (or maybe add 'DEPRECATED'?)
> 
> >  #define TMEM_NEW_POOL              1
> >  #define TMEM_DESTROY_POOL          2
> >  #define TMEM_PUT_PAGE              4
> > @@ -48,35 +48,9 @@
> >  #endif
> >  
> >  /* Privileged commands to HYPERVISOR_tmem_op() */
> > -#define TMEM_AUTH                 101 
> > +#define TMEM_AUTH                 101
> >  #define TMEM_RESTORE_NEW          102
> >  
> > -/* Subops for HYPERVISOR_tmem_op(TMEM_CONTROL) */
> > -#define TMEMC_THAW                   0
> > -#define TMEMC_FREEZE                 1
> > -#define TMEMC_FLUSH                  2
> > -#define TMEMC_DESTROY                3
> > -#define TMEMC_LIST                   4
> > -#define TMEMC_SET_WEIGHT             5
> > -#define TMEMC_SET_CAP                6
> > -#define TMEMC_SET_COMPRESS           7
> > -#define TMEMC_QUERY_FREEABLE_MB      8
> > -#define TMEMC_SAVE_BEGIN             10
> > -#define TMEMC_SAVE_GET_VERSION       11
> > -#define TMEMC_SAVE_GET_MAXPOOLS      12
> > -#define TMEMC_SAVE_GET_CLIENT_WEIGHT 13
> > -#define TMEMC_SAVE_GET_CLIENT_CAP    14
> > -#define TMEMC_SAVE_GET_CLIENT_FLAGS  15
> > -#define TMEMC_SAVE_GET_POOL_FLAGS    16
> > -#define TMEMC_SAVE_GET_POOL_NPAGES   17
> > -#define TMEMC_SAVE_GET_POOL_UUID     18
> > -#define TMEMC_SAVE_GET_NEXT_PAGE     19
> > -#define TMEMC_SAVE_GET_NEXT_INV      20
> > -#define TMEMC_SAVE_END               21
> > -#define TMEMC_RESTORE_BEGIN          30
> > -#define TMEMC_RESTORE_PUT_PAGE       32
> > -#define TMEMC_RESTORE_FLUSH_PAGE     33
> > -
> >  /* Bits for HYPERVISOR_tmem_op(TMEM_NEW_POOL) */
> >  #define TMEM_POOL_PERSIST          1
> >  #define TMEM_POOL_SHARED           2
> > @@ -110,16 +84,7 @@ struct tmem_op {
> >              uint32_t flags;
> >              uint32_t arg1;
> >          } creat; /* for cmd == TMEM_NEW_POOL, TMEM_AUTH, TMEM_RESTORE_NEW */
> > -        struct { 
> > -            uint32_t subop;
> > -            uint32_t cli_id;
> > -            uint32_t arg1;
> > -            uint32_t arg2;
> > -            uint64_t oid[3];
> > -            tmem_cli_va_t buf;
> > -        } ctrl; /* for cmd == TMEM_CONTROL */
> >          struct {
> > -            
> >              uint64_t oid[3];
> >              uint32_t index;
> >              uint32_t tmem_offset;
> > diff --git a/xen/include/xen/tmem.h b/xen/include/xen/tmem.h
> > index 5dbf9d5..32a542a 100644
> > --- a/xen/include/xen/tmem.h
> > +++ b/xen/include/xen/tmem.h
> > @@ -9,6 +9,9 @@
> >  #ifndef __XEN_TMEM_H__
> >  #define __XEN_TMEM_H__
> >  
> > +struct xen_sysctl_tmem_op;
> 
> #include<public/sysctl.h> please.

I tried that and it blew up with tons of other dependencies. This makes
it much simpler without adding extra #include.
> 
> ~Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl.
  2015-08-27 18:47     ` Konrad Rzeszutek Wilk
@ 2015-08-27 18:58       ` Andrew Cooper
  2015-08-27 19:20         ` Konrad Rzeszutek Wilk
  2015-08-28 14:15         ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 32+ messages in thread
From: Andrew Cooper @ 2015-08-27 18:58 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, wei.liu2, jbeulich

On 27/08/15 19:47, Konrad Rzeszutek Wilk wrote:
>
>>> --- a/xen/include/public/sysctl.h
>>> +++ b/xen/include/public/sysctl.h
>>> @@ -710,6 +710,48 @@ struct xen_sysctl_psr_cat_op {
>>>  typedef struct xen_sysctl_psr_cat_op xen_sysctl_psr_cat_op_t;
>>>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cat_op_t);
>>>  
>>> +#define XEN_SYSCTL_TMEM_OP_ALL_CLIENTS 0xFFFFU
>>> +
>>> +#define XEN_SYSCTL_TMEM_OP_THAW                   0
>>> +#define XEN_SYSCTL_TMEM_OP_FREEZE                 1
>>> +#define XEN_SYSCTL_TMEM_OP_FLUSH                  2
>>> +#define XEN_SYSCTL_TMEM_OP_DESTROY                3
>>> +#define XEN_SYSCTL_TMEM_OP_LIST                   4
>>> +#define XEN_SYSCTL_TMEM_OP_SET_WEIGHT             5
>>> +#define XEN_SYSCTL_TMEM_OP_SET_CAP                6
>>> +#define XEN_SYSCTL_TMEM_OP_SET_COMPRESS           7
>>> +#define XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB      8
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_BEGIN             10
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION       11
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_MAXPOOLS      12
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_WEIGHT 13
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_CAP    14
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_FLAGS  15
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_FLAGS    16
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_NPAGES   17
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_UUID     18
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_PAGE     19
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_INV      20
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_END               21
>>> +#define XEN_SYSCTL_TMEM_OP_RESTORE_BEGIN          30
>>> +#define XEN_SYSCTL_TMEM_OP_RESTORE_PUT_PAGE       32
>>> +#define XEN_SYSCTL_TMEM_OP_RESTORE_FLUSH_PAGE     33
>>> +
>>> +struct xen_sysctl_tmem_op {
>>> +    uint32_t cmd;       /* IN: XEN_SYSCTL_TMEM_OP_* . */
>>> +    int32_t pool_id;    /* IN: 0 by default unless _SAVE_*, RESTORE_* .*/
>>> +    uint32_t cli_id;    /* IN: client id, 0 for XEN_SYSCTL_TMEM_QUERY_FREEABLE_MB
>>> +                           for all others can be the domain id or
>>> +                           XEN_SYSCTL_TMEM_OP_ALL_CLIENTS for all. */
>>> +    uint32_t arg1;      /* IN: If not applicable to command use 0. */
>>> +    uint32_t arg2;      /* IN: If not applicable to command use 0. */
>> Please can this interface be fixed as part of the move, even if it is in
>> subsequent patches for clarity.
> I will gladly fix this interface in further patches. By all means!

What I wish to avoid is 4.6 releasing with the API in this state, as
adjusting valgrind and strace to compensate will be miserable.

The best solution would be to have this patch and the fixups adjacent in
the series, at which point the valgrind/strace adjustments can start
with the clean API for 4.6

>>> +    uint64_t oid[3];    /* IN: If not applicable to command use 0. */
>>> +    XEN_GUEST_HANDLE_64(char) buf; /* IN/OUT: Buffer to save and restore ops. */
>>> +};
>>> +typedef struct xen_sysctl_tmem_op xen_sysctl_tmem_op_t;
>>> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_tmem_op_t);
>>> +
>>>  struct xen_sysctl {
>>>      uint32_t cmd;
>>>  #define XEN_SYSCTL_readconsole                    1
>>> @@ -734,6 +776,7 @@ struct xen_sysctl {
>>>  #define XEN_SYSCTL_psr_cmt_op                    21
>>>  #define XEN_SYSCTL_pcitopoinfo                   22
>>>  #define XEN_SYSCTL_psr_cat_op                    23
>>> +#define XEN_SYSCTL_tmem_op                       24
>>>      uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
>>>      union {
>>>          struct xen_sysctl_readconsole       readconsole;
>>> @@ -758,6 +801,7 @@ struct xen_sysctl {
>>>          struct xen_sysctl_coverage_op       coverage_op;
>>>          struct xen_sysctl_psr_cmt_op        psr_cmt_op;
>>>          struct xen_sysctl_psr_cat_op        psr_cat_op;
>>> +        struct xen_sysctl_tmem_op           tmem_op;
>>>          uint8_t                             pad[128];
>>>      } u;
>>>  };
>>> diff --git a/xen/include/public/tmem.h b/xen/include/public/tmem.h
>>> index 4fd2fc6..208f5a6 100644
>>> --- a/xen/include/public/tmem.h
>>> +++ b/xen/include/public/tmem.h
>>> @@ -33,7 +33,7 @@
>>>  #define TMEM_SPEC_VERSION          1
>>>  
>>>  /* Commands to HYPERVISOR_tmem_op() */
>>> -#define TMEM_CONTROL               0
>>> +#define TMEM_CONTROL_MOVED         0
>> If anything, this should be deleted as opposed to being renamed.  The
>> _MOVED suffix is confusing, as TMEM_CONTROL_MOVED sounds like it could
>> be a plausible TMEM operation.
> Ah, will do what Jan asked and just put a big comment. And keep the old
> name (or maybe add 'DEPRECATED'?)

If you are going to the effort of renaming, then just delete.  Either
way will break the build of unsuspecting code, and the latter leaves us
with less junk.

A comment however will be useful in all cases.

>>>  #define TMEM_NEW_POOL              1
>>>  #define TMEM_DESTROY_POOL          2
>>>  #define TMEM_PUT_PAGE              4
>>> @@ -48,35 +48,9 @@
>>>  #endif
>>>  
>>>  /* Privileged commands to HYPERVISOR_tmem_op() */
>>> -#define TMEM_AUTH                 101 
>>> +#define TMEM_AUTH                 101
>>>  #define TMEM_RESTORE_NEW          102
>>>  
>>> -/* Subops for HYPERVISOR_tmem_op(TMEM_CONTROL) */
>>> -#define TMEMC_THAW                   0
>>> -#define TMEMC_FREEZE                 1
>>> -#define TMEMC_FLUSH                  2
>>> -#define TMEMC_DESTROY                3
>>> -#define TMEMC_LIST                   4
>>> -#define TMEMC_SET_WEIGHT             5
>>> -#define TMEMC_SET_CAP                6
>>> -#define TMEMC_SET_COMPRESS           7
>>> -#define TMEMC_QUERY_FREEABLE_MB      8
>>> -#define TMEMC_SAVE_BEGIN             10
>>> -#define TMEMC_SAVE_GET_VERSION       11
>>> -#define TMEMC_SAVE_GET_MAXPOOLS      12
>>> -#define TMEMC_SAVE_GET_CLIENT_WEIGHT 13
>>> -#define TMEMC_SAVE_GET_CLIENT_CAP    14
>>> -#define TMEMC_SAVE_GET_CLIENT_FLAGS  15
>>> -#define TMEMC_SAVE_GET_POOL_FLAGS    16
>>> -#define TMEMC_SAVE_GET_POOL_NPAGES   17
>>> -#define TMEMC_SAVE_GET_POOL_UUID     18
>>> -#define TMEMC_SAVE_GET_NEXT_PAGE     19
>>> -#define TMEMC_SAVE_GET_NEXT_INV      20
>>> -#define TMEMC_SAVE_END               21
>>> -#define TMEMC_RESTORE_BEGIN          30
>>> -#define TMEMC_RESTORE_PUT_PAGE       32
>>> -#define TMEMC_RESTORE_FLUSH_PAGE     33
>>> -
>>>  /* Bits for HYPERVISOR_tmem_op(TMEM_NEW_POOL) */
>>>  #define TMEM_POOL_PERSIST          1
>>>  #define TMEM_POOL_SHARED           2
>>> @@ -110,16 +84,7 @@ struct tmem_op {
>>>              uint32_t flags;
>>>              uint32_t arg1;
>>>          } creat; /* for cmd == TMEM_NEW_POOL, TMEM_AUTH, TMEM_RESTORE_NEW */
>>> -        struct { 
>>> -            uint32_t subop;
>>> -            uint32_t cli_id;
>>> -            uint32_t arg1;
>>> -            uint32_t arg2;
>>> -            uint64_t oid[3];
>>> -            tmem_cli_va_t buf;
>>> -        } ctrl; /* for cmd == TMEM_CONTROL */
>>>          struct {
>>> -            
>>>              uint64_t oid[3];
>>>              uint32_t index;
>>>              uint32_t tmem_offset;
>>> diff --git a/xen/include/xen/tmem.h b/xen/include/xen/tmem.h
>>> index 5dbf9d5..32a542a 100644
>>> --- a/xen/include/xen/tmem.h
>>> +++ b/xen/include/xen/tmem.h
>>> @@ -9,6 +9,9 @@
>>>  #ifndef __XEN_TMEM_H__
>>>  #define __XEN_TMEM_H__
>>>  
>>> +struct xen_sysctl_tmem_op;
>> #include<public/sysctl.h> please.
> I tried that and it blew up with tons of other dependencies. This makes
> it much simpler without adding extra #include.

Fair enough.

~Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl.
  2015-08-27 18:43     ` Konrad Rzeszutek Wilk
@ 2015-08-27 19:02       ` Andrew Cooper
  2015-08-28  7:03       ` Jan Beulich
  1 sibling, 0 replies; 32+ messages in thread
From: Andrew Cooper @ 2015-08-27 19:02 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, Jan Beulich; +Cc: xen-devel, wei.liu2

On 27/08/15 19:43, Konrad Rzeszutek Wilk wrote:
>
>>> --- a/xen/include/public/sysctl.h
>>> +++ b/xen/include/public/sysctl.h
>>> @@ -710,6 +710,48 @@ struct xen_sysctl_psr_cat_op {
>>>  typedef struct xen_sysctl_psr_cat_op xen_sysctl_psr_cat_op_t;
>>>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cat_op_t);
>>>  
>>> +#define XEN_SYSCTL_TMEM_OP_ALL_CLIENTS 0xFFFFU
>>> +
>>> +#define XEN_SYSCTL_TMEM_OP_THAW                   0
>>> +#define XEN_SYSCTL_TMEM_OP_FREEZE                 1
>>> +#define XEN_SYSCTL_TMEM_OP_FLUSH                  2
>>> +#define XEN_SYSCTL_TMEM_OP_DESTROY                3
>>> +#define XEN_SYSCTL_TMEM_OP_LIST                   4
>>> +#define XEN_SYSCTL_TMEM_OP_SET_WEIGHT             5
>>> +#define XEN_SYSCTL_TMEM_OP_SET_CAP                6
>>> +#define XEN_SYSCTL_TMEM_OP_SET_COMPRESS           7
>>> +#define XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB      8
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_BEGIN             10
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION       11
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_MAXPOOLS      12
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_WEIGHT 13
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_CAP    14
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_FLAGS  15
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_FLAGS    16
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_NPAGES   17
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_UUID     18
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_PAGE     19
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_INV      20
>>> +#define XEN_SYSCTL_TMEM_OP_SAVE_END               21
>>> +#define XEN_SYSCTL_TMEM_OP_RESTORE_BEGIN          30
>>> +#define XEN_SYSCTL_TMEM_OP_RESTORE_PUT_PAGE       32
>>> +#define XEN_SYSCTL_TMEM_OP_RESTORE_FLUSH_PAGE     33
>> Perhaps better to have all these in tmem.h, to not clutter this
>> header?
> Yes and no. The other sysctl had their #defines for commands here so I figured I
> would follow that rule. But I am OK keeping it in tmem.h and just do
>
> /* For the 'cmd' values consule tmem.h. Look for XEN_SYSCTL_TMEM_OP_* */

Better to stay consistent with sysctl.h.   Separating the structure and
the subops is unhelpful to people reading the code.

>>> +struct xen_sysctl_tmem_op {
>>> +    uint32_t cmd;       /* IN: XEN_SYSCTL_TMEM_OP_* . */
>>> +    int32_t pool_id;    /* IN: 0 by default unless _SAVE_*, RESTORE_* .*/
>>> +    uint32_t cli_id;    /* IN: client id, 0 for XEN_SYSCTL_TMEM_QUERY_FREEABLE_MB
>>> +                           for all others can be the domain id or
>>> +                           XEN_SYSCTL_TMEM_OP_ALL_CLIENTS for all. */
>>> +    uint32_t arg1;      /* IN: If not applicable to command use 0. */
>>> +    uint32_t arg2;      /* IN: If not applicable to command use 0. */
>>> +    uint8_t  pad[4];    /* Padding so structure is the same under 32 and 64. */
>> uint32_t please. And despite this being an (easily changeable) sysctl,
>> verifying that it's zero on input would be nice.
> Yes! That was what I forgotten!
>>> --- a/xen/include/public/tmem.h
>>> +++ b/xen/include/public/tmem.h
>>> @@ -33,7 +33,7 @@
>>>  #define TMEM_SPEC_VERSION          1
>>>  
>>>  /* Commands to HYPERVISOR_tmem_op() */
>>> -#define TMEM_CONTROL               0
>>> +#define TMEM_CONTROL_MOVED         0
>> Perhaps say where it moved in a brief comment?
> Yes, good idea.
>
> /* Deprecated. These operations are done now via XEN_SYSCTL_tmem_op
>  * using the sysctl hypercall. */

I hope "XEN_SYSCTL_tmem_op" is sufficient to imply "using the sysctl
hypercall" ;)

>
>>> --- a/xen/xsm/flask/hooks.c
>>> +++ b/xen/xsm/flask/hooks.c
>>> @@ -761,6 +761,9 @@ static int flask_sysctl(int cmd)
>>>      case XEN_SYSCTL_tbuf_op:
>>>          return domain_has_xen(current->domain, XEN__TBUFCONTROL);
>>>  
>>> +    case XEN_SYSCTL_tmem_op:
>>> +        return domain_has_xen(current->domain, XEN__TMEM_CONTROL);
>>> +
>>>      case XEN_SYSCTL_sched_id:
>>>          return domain_has_xen(current->domain, XEN__GETSCHEDULER);
>> Hmm, these cases appear to be roughly sorted numerically, i.e.
>> yours would normally go at the end.
> I recall a comment from Andrew asking the newly introduced commands to
> be in alphabetical order. But perhaps that was for domctl which is more
> .. ah.. random?

Really? that doesn't sound like me.  Apologies if it was.  (Alphabetic
for obj-y in a Makefile perhaps?)

Numeric would be far better here.

~Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl.
  2015-08-27 18:58       ` Andrew Cooper
@ 2015-08-27 19:20         ` Konrad Rzeszutek Wilk
  2015-08-28  8:31           ` Jan Beulich
  2015-08-28 14:15         ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 32+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-08-27 19:20 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, wei.liu2, jbeulich

> >>> +struct xen_sysctl_tmem_op {
> >>> +    uint32_t cmd;       /* IN: XEN_SYSCTL_TMEM_OP_* . */
> >>> +    int32_t pool_id;    /* IN: 0 by default unless _SAVE_*, RESTORE_* .*/
> >>> +    uint32_t cli_id;    /* IN: client id, 0 for XEN_SYSCTL_TMEM_QUERY_FREEABLE_MB
> >>> +                           for all others can be the domain id or
> >>> +                           XEN_SYSCTL_TMEM_OP_ALL_CLIENTS for all. */
> >>> +    uint32_t arg1;      /* IN: If not applicable to command use 0. */
> >>> +    uint32_t arg2;      /* IN: If not applicable to command use 0. */
> >> Please can this interface be fixed as part of the move, even if it is in
> >> subsequent patches for clarity.
> > I will gladly fix this interface in further patches. By all means!
> 
> What I wish to avoid is 4.6 releasing with the API in this state, as
> adjusting valgrind and strace to compensate will be miserable.

Right.
> 
> The best solution would be to have this patch and the fixups adjacent in
> the series, at which point the valgrind/strace adjustments can start
> with the clean API for 4.6

Yes. I shall post this patchset plus extra patches next week.

> 
> >>> +    uint64_t oid[3];    /* IN: If not applicable to command use 0. */
> >>> +    XEN_GUEST_HANDLE_64(char) buf; /* IN/OUT: Buffer to save and restore ops. */
> >>> +};
> >>> +typedef struct xen_sysctl_tmem_op xen_sysctl_tmem_op_t;
> >>> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_tmem_op_t);
> >>> +
> >>>  struct xen_sysctl {
> >>>      uint32_t cmd;
> >>>  #define XEN_SYSCTL_readconsole                    1
> >>> @@ -734,6 +776,7 @@ struct xen_sysctl {
> >>>  #define XEN_SYSCTL_psr_cmt_op                    21
> >>>  #define XEN_SYSCTL_pcitopoinfo                   22
> >>>  #define XEN_SYSCTL_psr_cat_op                    23
> >>> +#define XEN_SYSCTL_tmem_op                       24
> >>>      uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
> >>>      union {
> >>>          struct xen_sysctl_readconsole       readconsole;
> >>> @@ -758,6 +801,7 @@ struct xen_sysctl {
> >>>          struct xen_sysctl_coverage_op       coverage_op;
> >>>          struct xen_sysctl_psr_cmt_op        psr_cmt_op;
> >>>          struct xen_sysctl_psr_cat_op        psr_cat_op;
> >>> +        struct xen_sysctl_tmem_op           tmem_op;
> >>>          uint8_t                             pad[128];
> >>>      } u;
> >>>  };
> >>> diff --git a/xen/include/public/tmem.h b/xen/include/public/tmem.h
> >>> index 4fd2fc6..208f5a6 100644
> >>> --- a/xen/include/public/tmem.h
> >>> +++ b/xen/include/public/tmem.h
> >>> @@ -33,7 +33,7 @@
> >>>  #define TMEM_SPEC_VERSION          1
> >>>  
> >>>  /* Commands to HYPERVISOR_tmem_op() */
> >>> -#define TMEM_CONTROL               0
> >>> +#define TMEM_CONTROL_MOVED         0
> >> If anything, this should be deleted as opposed to being renamed.  The
> >> _MOVED suffix is confusing, as TMEM_CONTROL_MOVED sounds like it could
> >> be a plausible TMEM operation.
> > Ah, will do what Jan asked and just put a big comment. And keep the old
> > name (or maybe add 'DEPRECATED'?)
> 
> If you are going to the effort of renaming, then just delete.  Either
> way will break the build of unsuspecting code, and the latter leaves us
> with less junk.

It will break the tmem hypervisor code as it has a check for
TMEM_CONTROL (which per Jan suggestion should return -E.. something).

> 
> A comment however will be useful in all cases.

Aye.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 6/8] tmem: Remove the old tmem control XSM checks as it is part of sysctl hypercall.
  2015-08-27 11:02 ` [PATCH v2 6/8] tmem: Remove the old tmem control XSM checks as it is part of sysctl hypercall Konrad Rzeszutek Wilk
@ 2015-08-27 20:59   ` Daniel De Graaf
  0 siblings, 0 replies; 32+ messages in thread
From: Daniel De Graaf @ 2015-08-27 20:59 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel, jbeulich, andrew.cooper3,
	wei.liu2

On 27/08/15 07:02, Konrad Rzeszutek Wilk wrote:
> The sysctl is where the tmem control operations are done and the
> XSM checks are done via there. The old mechanism (to check
> for control tmem op XSM from do_tmem_op) is not needed anymore.
>
> CC:  Daniel De Graaf <dgdegra@tycho.nsa.gov>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Both this and patch 5 (which adds the sysctl check):
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl.
  2015-08-27 18:43     ` Konrad Rzeszutek Wilk
  2015-08-27 19:02       ` Andrew Cooper
@ 2015-08-28  7:03       ` Jan Beulich
  1 sibling, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2015-08-28  7:03 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: andrew.cooper3, wei.liu2, xen-devel

>>> On 27.08.15 at 20:43, <konrad.wilk@oracle.com> wrote:
> On Thu, Aug 27, 2015 at 09:11:35AM -0600, Jan Beulich wrote:
>> >>> On 27.08.15 at 13:02, <konrad.wilk@oracle.com> wrote:
>> > --- a/xen/include/public/sysctl.h
>> > +++ b/xen/include/public/sysctl.h
>> > @@ -710,6 +710,48 @@ struct xen_sysctl_psr_cat_op {
>> >  typedef struct xen_sysctl_psr_cat_op xen_sysctl_psr_cat_op_t;
>> >  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cat_op_t);
>> >  
>> > +#define XEN_SYSCTL_TMEM_OP_ALL_CLIENTS 0xFFFFU
>> > +
>> > +#define XEN_SYSCTL_TMEM_OP_THAW                   0
>> > +#define XEN_SYSCTL_TMEM_OP_FREEZE                 1
>> > +#define XEN_SYSCTL_TMEM_OP_FLUSH                  2
>> > +#define XEN_SYSCTL_TMEM_OP_DESTROY                3
>> > +#define XEN_SYSCTL_TMEM_OP_LIST                   4
>> > +#define XEN_SYSCTL_TMEM_OP_SET_WEIGHT             5
>> > +#define XEN_SYSCTL_TMEM_OP_SET_CAP                6
>> > +#define XEN_SYSCTL_TMEM_OP_SET_COMPRESS           7
>> > +#define XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB      8
>> > +#define XEN_SYSCTL_TMEM_OP_SAVE_BEGIN             10
>> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION       11
>> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_MAXPOOLS      12
>> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_WEIGHT 13
>> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_CAP    14
>> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_FLAGS  15
>> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_FLAGS    16
>> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_NPAGES   17
>> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_UUID     18
>> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_PAGE     19
>> > +#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_INV      20
>> > +#define XEN_SYSCTL_TMEM_OP_SAVE_END               21
>> > +#define XEN_SYSCTL_TMEM_OP_RESTORE_BEGIN          30
>> > +#define XEN_SYSCTL_TMEM_OP_RESTORE_PUT_PAGE       32
>> > +#define XEN_SYSCTL_TMEM_OP_RESTORE_FLUSH_PAGE     33
>> 
>> Perhaps better to have all these in tmem.h, to not clutter this
>> header?
> 
> Yes and no. The other sysctl had their #defines for commands here so I 
> figured I
> would follow that rule. But I am OK keeping it in tmem.h and just do
> 
> /* For the 'cmd' values consule tmem.h. Look for XEN_SYSCTL_TMEM_OP_* */

Having seen Andrew's reply, I'd like to clarify that I'm not insisting
on where these go - I just voiced my opinion without meaning for
it to stand in the way of this going in.

>> > --- a/xen/include/public/tmem.h
>> > +++ b/xen/include/public/tmem.h
>> > @@ -33,7 +33,7 @@
>> >  #define TMEM_SPEC_VERSION          1
>> >  
>> >  /* Commands to HYPERVISOR_tmem_op() */
>> > -#define TMEM_CONTROL               0
>> > +#define TMEM_CONTROL_MOVED         0
>> 
>> Perhaps say where it moved in a brief comment?
> 
> Yes, good idea.
> 
> /* Deprecated. These operations are done now via XEN_SYSCTL_tmem_op
>  * using the sysctl hypercall. */

It's not deprecated (or else it would still work), it's no longer
supported. After having written to original reply, I thought
of

#undef TMEM_CONTROL /* replaced by XEN_SYSCTL_tmem_op */

Jan

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl.
  2015-08-27 19:20         ` Konrad Rzeszutek Wilk
@ 2015-08-28  8:31           ` Jan Beulich
  0 siblings, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2015-08-28  8:31 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Andrew Cooper, wei.liu2, xen-devel

>>> On 27.08.15 at 21:20, <konrad.wilk@oracle.com> wrote:
>> >>> --- a/xen/include/public/tmem.h
>> >>> +++ b/xen/include/public/tmem.h
>> >>> @@ -33,7 +33,7 @@
>> >>>  #define TMEM_SPEC_VERSION          1
>> >>>  
>> >>>  /* Commands to HYPERVISOR_tmem_op() */
>> >>> -#define TMEM_CONTROL               0
>> >>> +#define TMEM_CONTROL_MOVED         0
>> >> If anything, this should be deleted as opposed to being renamed.  The
>> >> _MOVED suffix is confusing, as TMEM_CONTROL_MOVED sounds like it could
>> >> be a plausible TMEM operation.
>> > Ah, will do what Jan asked and just put a big comment. And keep the old
>> > name (or maybe add 'DEPRECATED'?)
>> 
>> If you are going to the effort of renaming, then just delete.  Either
>> way will break the build of unsuspecting code, and the latter leaves us
>> with less junk.
> 
> It will break the tmem hypervisor code as it has a check for
> TMEM_CONTROL (which per Jan suggestion should return -E.. something).

That could be taken care of by putting the #define inside
#ifdef __XEN__ (and do the earlier suggested #undef in the
#else).

Jan

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl.
  2015-08-27 18:58       ` Andrew Cooper
  2015-08-27 19:20         ` Konrad Rzeszutek Wilk
@ 2015-08-28 14:15         ` Konrad Rzeszutek Wilk
  2015-08-28 14:40           ` Jan Beulich
  1 sibling, 1 reply; 32+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-08-28 14:15 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, wei.liu2, jbeulich

[-- Attachment #1: Type: text/plain, Size: 7124 bytes --]

> >>> +struct xen_sysctl_tmem_op {
> >>> +    uint32_t cmd;       /* IN: XEN_SYSCTL_TMEM_OP_* . */
> >>> +    int32_t pool_id;    /* IN: 0 by default unless _SAVE_*, RESTORE_* .*/
> >>> +    uint32_t cli_id;    /* IN: client id, 0 for XEN_SYSCTL_TMEM_QUERY_FREEABLE_MB
> >>> +                           for all others can be the domain id or
> >>> +                           XEN_SYSCTL_TMEM_OP_ALL_CLIENTS for all. */
> >>> +    uint32_t arg1;      /* IN: If not applicable to command use 0. */
> >>> +    uint32_t arg2;      /* IN: If not applicable to command use 0. */
> >> Please can this interface be fixed as part of the move, even if it is in
> >> subsequent patches for clarity.
> > I will gladly fix this interface in further patches. By all means!
> 
> What I wish to avoid is 4.6 releasing with the API in this state, as
> adjusting valgrind and strace to compensate will be miserable.
> 
> The best solution would be to have this patch and the fixups adjacent in
> the series, at which point the valgrind/strace adjustments can start
> with the clean API for 4.6

See attached two patches. The first one makes the 'oid[3]' be a nice structure.

Then I decided to see if I can expand that to also be part of the
'tmem_op', which looked legit (as it is the same size and offset and
pahole wise it looks right).

But sadly the compat layer is not happy with me:


In file included from /home/konrad/ssd/konrad/xen/xen/include/xen/compat.h:12:0,
                 from /home/konrad/ssd/konrad/xen/xen/include/compat/xen.h:3,
                 from /home/konrad/ssd/konrad/xen/xen/include/xen/shared.h:6,
                 from /home/konrad/ssd/konrad/xen/xen/include/xen/sched.h:7,
                 from setup.c:5:
/home/konrad/ssd/konrad/xen/xen/include/xen/tmem_xen.h: In function ‘tmem_get_tmemop_from_client’:
/home/konrad/ssd/konrad/xen/xen/include/compat/xlat.h:935:26: error: incompatible types when assigning to type ‘xen_tmem_oid_t’ from type ‘compat_tmem_oid_t’
         (_d_)->u.gen.oid = (_s_)->u.gen.oid; \
                          ^
/home/konrad/ssd/konrad/xen/xen/include/xen/tmem_xen.h:306:9: note: in expansion of macro ‘XLAT_tmem_op’
         XLAT_tmem_op(op, &cop);
         ^
make[3]: *** [setup.o] Error 1
make[3]: *** Waiting for unfinished jobs....
In file included from /home/konrad/ssd/konrad/xen/xen/include/xen/compat.h:12:0,
                 from /home/konrad/ssd/konrad/xen/xen/include/compat/xen.h:3,
                 from /home/konrad/ssd/konrad/xen/xen/include/xen/shared.h:6,
                 from /home/konrad/ssd/konrad/xen/xen/include/xen/sched.h:7,
                 from memory.c:15:
/home/konrad/ssd/konrad/xen/xen/include/xen/tmem_xen.h: In function ‘tmem_get_tmemop_from_client’:
/home/konrad/ssd/konrad/xen/xen/include/compat/xlat.h:935:26: error: incompatible types when assigning to type ‘xen_tmem_oid_t’ from type ‘compat_tmem_oid_t’
         (_d_)->u.gen.oid = (_s_)->u.gen.oid; \
                          ^
/home/konrad/ssd/konrad/xen/xen/include/xen/tmem_xen.h:306:9: note: in expansion of macro ‘XLAT_tmem_op’
         XLAT_tmem_op(op, &cop);
         ^
make[3]: *** [memory.o] Error 1
make[3]: *** Waiting for unfinished jobs....
In file included from /home/konrad/ssd/konrad/xen/xen/include/xen/compat.h:12:0,
                 from /home/konrad/ssd/konrad/xen/xen/include/compat/xen.h:3,
                 from /home/konrad/ssd/konrad/xen/xen/include/xen/shared.h:6,
                 from /home/konrad/ssd/konrad/xen/xen/include/xen/sched.h:7,
                 from page_alloc.c:27:
/home/konrad/ssd/konrad/xen/xen/include/xen/tmem_xen.h: In function ‘tmem_get_tmemop_from_client’:
/home/konrad/ssd/konrad/xen/xen/include/compat/xlat.h:935:26: error: incompatible types when assigning to type ‘xen_tmem_oid_t’ from type ‘compat_tmem_oid_t’
         (_d_)->u.gen.oid = (_s_)->u.gen.oid; \
                          ^
/home/konrad/ssd/konrad/xen/xen/include/xen/tmem_xen.h:306:9: note: in expansion of macro ‘XLAT_tmem_op’
         XLAT_tmem_op(op, &cop);
         ^
make[3]: *** [page_alloc.o] Error 1
make[2]: *** [/home/konrad/ssd/konrad/xen/xen/common/built_in.o] Error 2
make[2]: *** Waiting for unfinished jobs....
make[2]: *** [/home/konrad/ssd/konrad/xen/xen/arch/x86/built_in.o] Error 2
make[1]: *** [/home/konrad/ssd/konrad/xen/xen/xen] Error 2
make: *** [build] Error 2

Which is right:


struct compat_tmem_oid {
    uint64_t oid[3];
};
typedef struct compat_tmem_oid compat_tmem_oid_t;

typedef COMPAT_HANDLE(char) tmem_cli_va_compat_t;
struct compat_tmem_op {
    uint32_t cmd;
    int32_t pool_id;
    union {
        struct {
            uint64_t uuid[2];
            uint32_t flags;
            uint32_t arg1;
        } creat;
        struct {

            compat_tmem_oid_t oid;   <====== We have 'struct tmem_oid' in the tmem_op.

            uint32_t index;
            uint32_t tmem_offset;
            uint32_t pfn_offset;
            uint32_t len;
            compat_pfn_t cmfn;
        } gen;
    } u;
};
typedef struct compat_tmem_op tmem_op_compat_t;
DEFINE_COMPAT_HANDLE(tmem_op_compat_t);
#define XLAT_tmem_op(_d_, _s_) do { \
    (_d_)->cmd = (_s_)->cmd; \
    (_d_)->pool_id = (_s_)->pool_id; \
    switch (u) { \
    case XLAT_tmem_op_u_creat: \
        { \
            unsigned int i0; \
            for (i0 = 0; i0 <  2; ++i0) { \
                (_d_)->u.creat.uuid[i0] = (_s_)->u.creat.uuid[i0]; \
            } \
        } \
        (_d_)->u.creat.flags = (_s_)->u.creat.flags; \
        (_d_)->u.creat.arg1 = (_s_)->u.creat.arg1; \
        break; \
    case XLAT_tmem_op_u_gen: \
        (_d_)->u.gen.oid = (_s_)->u.gen.oid; \
        (_d_)->u.gen.index = (_s_)->u.gen.index; \
        (_d_)->u.gen.tmem_offset = (_s_)->u.gen.tmem_offset; \
        (_d_)->u.gen.pfn_offset = (_s_)->u.gen.pfn_offset; \
        (_d_)->u.gen.len = (_s_)->u.gen.len; \
        (_d_)->u.gen.cmfn = (_s_)->u.gen.cmfn; \
        break; \
    } \
} while (0)


static inline int tmem_get_tmemop_from_client(tmem_op_t *op, tmem_cli_op_t uops)
{
#ifdef CONFIG_COMPAT
    if ( has_hvm_container_vcpu(current) ?
         hvm_guest_x86_mode(current) != 8 :
         is_pv_32bit_vcpu(current) )
    {
        int rc;
        enum XLAT_tmem_op_u u;
        tmem_op_compat_t cop;

        rc = copy_from_guest(&cop, guest_handle_cast(uops, void), 1);
        if ( rc )
            return rc;
        switch ( cop.cmd )
        {
        case TMEM_NEW_POOL:   u = XLAT_tmem_op_u_creat; break;
        case TMEM_AUTH:       u = XLAT_tmem_op_u_creat; break;
        case TMEM_RESTORE_NEW:u = XLAT_tmem_op_u_creat; break;
        default:              u = XLAT_tmem_op_u_gen ;  break;
        }
#define XLAT_tmem_op_HNDL_u_ctrl_buf(_d_, _s_) \
        guest_from_compat_handle((_d_)->u.ctrl.buf, (_s_)->u.ctrl.buf)
        XLAT_tmem_op(op, &cop);
#undef XLAT_tmem_op_HNDL_u_ctrl_buf
        return 0;
    }
#endif
    return copy_from_guest(op, uops, 1);
}

[-- Attachment #2: 0001-tmem-Make-the-uint64_t-oid-3-a-proper-structure-xen_.patch --]
[-- Type: text/plain, Size: 9093 bytes --]

>From 93918a423a1f78af16f4b19cec30014b5448abcb Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Fri, 28 Aug 2015 10:02:33 -0400
Subject: [PATCH 1/2] tmem: Make the uint64_t oid[3] a proper structure:
 xen_sysctl_tmem_oid

And use it almost everywhere. It is easy to use it for the
sysctl since the hypervisor and toolstack are intertwined.

But for the tmem hypercall we need to be dilligient.

We also move some of the parameters on functions to be within
the right location.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 xen/common/tmem.c           | 57 ++++++++++++++++++++++++---------------------
 xen/include/public/sysctl.h |  7 +++++-
 2 files changed, 37 insertions(+), 27 deletions(-)

diff --git a/xen/common/tmem.c b/xen/common/tmem.c
index b62b56e..3a1345a 100644
--- a/xen/common/tmem.c
+++ b/xen/common/tmem.c
@@ -125,12 +125,8 @@ struct tmem_pool {
 #define is_persistent(_p)  (_p->persistent)
 #define is_shared(_p)      (_p->shared)
 
-struct oid {
-    uint64_t oid[3];
-};
-
 struct tmem_object_root {
-    struct oid oid;
+    struct xen_sysctl_tmem_oid oid;
     struct rb_node rb_tree_node; /* protected by pool->pool_rwlock */
     unsigned long objnode_count; /* atomicity depends on obj_spinlock */
     long pgp_count; /* atomicity depends on obj_spinlock */
@@ -158,7 +154,7 @@ struct tmem_page_descriptor {
             };
             struct tmem_object_root *obj;
         } us;
-        struct oid inv_oid;  /* used for invalid list only */
+        struct xen_sysctl_tmem_oid inv_oid;  /* used for invalid list only */
     };
     pagesize_t size; /* 0 == PAGE_SIZE (pfp), -1 == data invalid,
                     else compressed data (cdata) */
@@ -815,7 +811,8 @@ static void rtn_free(struct radix_tree_node *rtn, void *arg)
 
 /************ POOL OBJECT COLLECTION MANIPULATION ROUTINES *******************/
 
-static int oid_compare(struct oid *left, struct oid *right)
+static int oid_compare(struct xen_sysctl_tmem_oid *left,
+                       struct xen_sysctl_tmem_oid *right)
 {
     if ( left->oid[2] == right->oid[2] )
     {
@@ -839,19 +836,20 @@ static int oid_compare(struct oid *left, struct oid *right)
         return 1;
 }
 
-static void oid_set_invalid(struct oid *oidp)
+static void oid_set_invalid(struct xen_sysctl_tmem_oid *oidp)
 {
     oidp->oid[0] = oidp->oid[1] = oidp->oid[2] = -1UL;
 }
 
-static unsigned oid_hash(struct oid *oidp)
+static unsigned oid_hash(struct xen_sysctl_tmem_oid *oidp)
 {
     return (tmem_hash(oidp->oid[0] ^ oidp->oid[1] ^ oidp->oid[2],
                      BITS_PER_LONG) & OBJ_HASH_BUCKETS_MASK);
 }
 
 /* searches for object==oid in pool, returns locked object if found */
-static struct tmem_object_root * obj_find(struct tmem_pool *pool, struct oid *oidp)
+static struct tmem_object_root * obj_find(struct tmem_pool *pool,
+                                          struct xen_sysctl_tmem_oid *oidp)
 {
     struct rb_node *node;
     struct tmem_object_root *obj;
@@ -887,7 +885,7 @@ restart_find:
 static void obj_free(struct tmem_object_root *obj)
 {
     struct tmem_pool *pool;
-    struct oid old_oid;
+    struct xen_sysctl_tmem_oid old_oid;
 
     ASSERT_SPINLOCK(&obj->obj_spinlock);
     ASSERT(obj != NULL);
@@ -946,7 +944,8 @@ static int obj_rb_insert(struct rb_root *root, struct tmem_object_root *obj)
  * allocate, initialize, and insert an tmem_object_root
  * (should be called only if find failed)
  */
-static struct tmem_object_root * obj_alloc(struct tmem_pool *pool, struct oid *oidp)
+static struct tmem_object_root * obj_alloc(struct tmem_pool *pool,
+                                           struct xen_sysctl_tmem_oid *oidp)
 {
     struct tmem_object_root *obj;
 
@@ -1531,8 +1530,8 @@ cleanup:
 }
 
 static int do_tmem_put(struct tmem_pool *pool,
-              struct oid *oidp, uint32_t index,
-              xen_pfn_t cmfn, tmem_cli_va_param_t clibuf)
+                       struct xen_sysctl_tmem_oid *oidp, uint32_t index,
+                       xen_pfn_t cmfn, tmem_cli_va_param_t clibuf)
 {
     struct tmem_object_root *obj = NULL;
     struct tmem_page_descriptor *pgp = NULL;
@@ -1696,8 +1695,9 @@ unlock_obj:
     return ret;
 }
 
-static int do_tmem_get(struct tmem_pool *pool, struct oid *oidp, uint32_t index,
-              xen_pfn_t cmfn, tmem_cli_va_param_t clibuf)
+static int do_tmem_get(struct tmem_pool *pool,
+                       struct xen_sysctl_tmem_oid *oidp, uint32_t index,
+                       xen_pfn_t cmfn, tmem_cli_va_param_t clibuf)
 {
     struct tmem_object_root *obj;
     struct tmem_page_descriptor *pgp;
@@ -1774,7 +1774,8 @@ bad_copy:
     return rc;
 }
 
-static int do_tmem_flush_page(struct tmem_pool *pool, struct oid *oidp, uint32_t index)
+static int do_tmem_flush_page(struct tmem_pool *pool,
+                              struct xen_sysctl_tmem_oid *oidp, uint32_t index)
 {
     struct tmem_object_root *obj;
     struct tmem_page_descriptor *pgp;
@@ -1807,7 +1808,8 @@ out:
         return 1;
 }
 
-static int do_tmem_flush_object(struct tmem_pool *pool, struct oid *oidp)
+static int do_tmem_flush_object(struct tmem_pool *pool,
+                                struct xen_sysctl_tmem_oid *oidp)
 {
     struct tmem_object_root *obj;
 
@@ -2432,7 +2434,7 @@ static int tmemc_save_get_next_page(int cli_id, uint32_t pool_id,
     struct tmem_pool *pool = (client == NULL || pool_id >= MAX_POOLS_PER_DOMAIN)
                    ? NULL : client->pools[pool_id];
     struct tmem_page_descriptor *pgp;
-    struct oid oid;
+    struct xen_sysctl_tmem_oid oid;
     int ret = 0;
     struct tmem_handle h;
 
@@ -2525,8 +2527,10 @@ out:
     return ret;
 }
 
-static int tmemc_restore_put_page(int cli_id, uint32_t pool_id, struct oid *oidp,
-                      uint32_t index, tmem_cli_va_param_t buf, uint32_t bufsize)
+static int tmemc_restore_put_page(int cli_id, uint32_t pool_id,
+                                  struct xen_sysctl_tmem_oid *oidp,
+                                  uint32_t index, tmem_cli_va_param_t buf,
+                                  uint32_t bufsize)
 {
     struct client *client = tmem_client_from_cli_id(cli_id);
     struct tmem_pool *pool = (client == NULL || pool_id >= MAX_POOLS_PER_DOMAIN)
@@ -2542,8 +2546,9 @@ static int tmemc_restore_put_page(int cli_id, uint32_t pool_id, struct oid *oidp
     return do_tmem_put(pool, oidp, index, 0, buf);
 }
 
-static int tmemc_restore_flush_page(int cli_id, uint32_t pool_id, struct oid *oidp,
-                        uint32_t index)
+static int tmemc_restore_flush_page(int cli_id, uint32_t pool_id,
+                                    struct xen_sysctl_tmem_oid *oidp,
+                                    uint32_t index)
 {
     struct client *client = tmem_client_from_cli_id(cli_id);
     struct tmem_pool *pool = (client == NULL || pool_id >= MAX_POOLS_PER_DOMAIN)
@@ -2559,7 +2564,7 @@ int tmem_control(struct xen_sysctl_tmem_op *op)
     int ret;
     uint32_t pool_id = op->pool_id;
     uint32_t cmd = op->cmd;
-    struct oid *oidp = (struct oid *)(&op->oid[0]);
+    struct xen_sysctl_tmem_oid *oidp = &op->oid;
 
     if ( op->pad != 0 )
         return -EINVAL;
@@ -2633,7 +2638,7 @@ long do_tmem_op(tmem_cli_op_t uops)
     struct tmem_op op;
     struct client *client = current->domain->tmem_client;
     struct tmem_pool *pool = NULL;
-    struct oid *oidp;
+    struct xen_sysctl_tmem_oid *oidp;
     int rc = 0;
     bool_t succ_get = 0, succ_put = 0;
     bool_t non_succ_get = 0, non_succ_put = 0;
@@ -2714,7 +2719,7 @@ long do_tmem_op(tmem_cli_op_t uops)
             write_unlock(&tmem_rwlock);
             read_lock(&tmem_rwlock);
 
-            oidp = (struct oid *)&op.u.gen.oid[0];
+            oidp = (struct xen_sysctl_tmem_oid *)&op.u.gen.oid[0];
             switch ( op.cmd )
             {
             case TMEM_NEW_POOL:
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index c1566e6..ee5b33f 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -737,6 +737,11 @@ DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cat_op_t);
 #define XEN_SYSCTL_TMEM_OP_RESTORE_PUT_PAGE       32
 #define XEN_SYSCTL_TMEM_OP_RESTORE_FLUSH_PAGE     33
 
+struct xen_sysctl_tmem_oid {
+    uint64_t oid[3];
+};
+typedef struct xen_sysctl_tmem_oid xen_sysctl_tmem_oid_t;
+
 struct xen_sysctl_tmem_op {
     uint32_t cmd;       /* IN: XEN_SYSCTL_TMEM_OP_* . */
     int32_t pool_id;    /* IN: 0 by default unless _SAVE_*, RESTORE_* .*/
@@ -746,7 +751,7 @@ struct xen_sysctl_tmem_op {
     uint32_t arg1;      /* IN: If not applicable to command use 0. */
     uint32_t arg2;      /* IN: If not applicable to command use 0. */
     uint32_t pad;       /* Padding so structure is the same under 32 and 64. */
-    uint64_t oid[3];    /* IN: If not applicable to command use 0. */
+    xen_sysctl_tmem_oid_t oid;     /* IN: If not applicable to command use 0. */
     XEN_GUEST_HANDLE_64(char) buf; /* IN/OUT: Buffer to save and restore ops. */
 };
 typedef struct xen_sysctl_tmem_op xen_sysctl_tmem_op_t;
-- 
2.1.0


[-- Attachment #3: 0002-tmem-Use-struct-xen_tmem_oid-for-all-every-user.patch --]
[-- Type: text/plain, Size: 10529 bytes --]

>From 35c2063457b440cf20359bb04aeb0a0981aa7388 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Fri, 28 Aug 2015 10:03:22 -0400
Subject: [PATCH 2/2] tmem: Use 'struct xen_tmem_oid' for all every user.

Patch "tmem: Make the uint64_t oid[3] a proper structure:
xen_sysctl_tmem_oid" converted the sysctl API to use an
proper structure. But it did not do it for the tmem hypercall.

This expands that and converts the tmem hypercall. For this
to work properly we change the name to 'struct xen_tmem_oid'
and use it everywhere. To deflect the #include usage of tmem.h
in sysctl.h or vice-versa we define it in both places and
use an #define to keep compiler happy.

We had to expand the compat layer to copy now each entry.

The layout (and size) of this structure in memory for the
'struct tmem_op' (so guest facing) is the same! Verified
via pahole.

*TODO*: Test with 32/64 bit guests.

--- /tmp/old    2015-08-27 16:34:00.535638730 -0400
+++ /tmp/new    2015-08-27 16:34:10.447705328 -0400
@@ -8,7 +8,7 @@
                        uint32_t   arg1;                 /*    28     4 */
                } creat;                                 /*          24 */
                struct {
-                       uint64_t   oid[3];               /*     8    24 */
+                       xen_tmem_oid_t oid;              /*     8    24 */
                        uint32_t   index;                /*    32     4 */
                        uint32_t   tmem_offset;          /*    36     4 */
                        uint32_t   pfn_offset;           /*    40     4 */

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 xen/common/tmem.c           | 38 +++++++++++++++++++-------------------
 xen/include/public/sysctl.h |  9 ++++++---
 xen/include/public/tmem.h   | 13 ++++++++++++-
 3 files changed, 37 insertions(+), 23 deletions(-)

diff --git a/xen/common/tmem.c b/xen/common/tmem.c
index 3a1345a..bd228b6 100644
--- a/xen/common/tmem.c
+++ b/xen/common/tmem.c
@@ -126,7 +126,7 @@ struct tmem_pool {
 #define is_shared(_p)      (_p->shared)
 
 struct tmem_object_root {
-    struct xen_sysctl_tmem_oid oid;
+    struct xen_tmem_oid oid;
     struct rb_node rb_tree_node; /* protected by pool->pool_rwlock */
     unsigned long objnode_count; /* atomicity depends on obj_spinlock */
     long pgp_count; /* atomicity depends on obj_spinlock */
@@ -154,7 +154,7 @@ struct tmem_page_descriptor {
             };
             struct tmem_object_root *obj;
         } us;
-        struct xen_sysctl_tmem_oid inv_oid;  /* used for invalid list only */
+        struct xen_tmem_oid inv_oid;  /* used for invalid list only */
     };
     pagesize_t size; /* 0 == PAGE_SIZE (pfp), -1 == data invalid,
                     else compressed data (cdata) */
@@ -811,8 +811,8 @@ static void rtn_free(struct radix_tree_node *rtn, void *arg)
 
 /************ POOL OBJECT COLLECTION MANIPULATION ROUTINES *******************/
 
-static int oid_compare(struct xen_sysctl_tmem_oid *left,
-                       struct xen_sysctl_tmem_oid *right)
+static int oid_compare(struct xen_tmem_oid *left,
+                       struct xen_tmem_oid *right)
 {
     if ( left->oid[2] == right->oid[2] )
     {
@@ -836,12 +836,12 @@ static int oid_compare(struct xen_sysctl_tmem_oid *left,
         return 1;
 }
 
-static void oid_set_invalid(struct xen_sysctl_tmem_oid *oidp)
+static void oid_set_invalid(struct xen_tmem_oid *oidp)
 {
     oidp->oid[0] = oidp->oid[1] = oidp->oid[2] = -1UL;
 }
 
-static unsigned oid_hash(struct xen_sysctl_tmem_oid *oidp)
+static unsigned oid_hash(struct xen_tmem_oid *oidp)
 {
     return (tmem_hash(oidp->oid[0] ^ oidp->oid[1] ^ oidp->oid[2],
                      BITS_PER_LONG) & OBJ_HASH_BUCKETS_MASK);
@@ -849,7 +849,7 @@ static unsigned oid_hash(struct xen_sysctl_tmem_oid *oidp)
 
 /* searches for object==oid in pool, returns locked object if found */
 static struct tmem_object_root * obj_find(struct tmem_pool *pool,
-                                          struct xen_sysctl_tmem_oid *oidp)
+                                          struct xen_tmem_oid *oidp)
 {
     struct rb_node *node;
     struct tmem_object_root *obj;
@@ -885,7 +885,7 @@ restart_find:
 static void obj_free(struct tmem_object_root *obj)
 {
     struct tmem_pool *pool;
-    struct xen_sysctl_tmem_oid old_oid;
+    struct xen_tmem_oid old_oid;
 
     ASSERT_SPINLOCK(&obj->obj_spinlock);
     ASSERT(obj != NULL);
@@ -945,7 +945,7 @@ static int obj_rb_insert(struct rb_root *root, struct tmem_object_root *obj)
  * (should be called only if find failed)
  */
 static struct tmem_object_root * obj_alloc(struct tmem_pool *pool,
-                                           struct xen_sysctl_tmem_oid *oidp)
+                                           struct xen_tmem_oid *oidp)
 {
     struct tmem_object_root *obj;
 
@@ -1530,7 +1530,7 @@ cleanup:
 }
 
 static int do_tmem_put(struct tmem_pool *pool,
-                       struct xen_sysctl_tmem_oid *oidp, uint32_t index,
+                       struct xen_tmem_oid *oidp, uint32_t index,
                        xen_pfn_t cmfn, tmem_cli_va_param_t clibuf)
 {
     struct tmem_object_root *obj = NULL;
@@ -1696,7 +1696,7 @@ unlock_obj:
 }
 
 static int do_tmem_get(struct tmem_pool *pool,
-                       struct xen_sysctl_tmem_oid *oidp, uint32_t index,
+                       struct xen_tmem_oid *oidp, uint32_t index,
                        xen_pfn_t cmfn, tmem_cli_va_param_t clibuf)
 {
     struct tmem_object_root *obj;
@@ -1775,7 +1775,7 @@ bad_copy:
 }
 
 static int do_tmem_flush_page(struct tmem_pool *pool,
-                              struct xen_sysctl_tmem_oid *oidp, uint32_t index)
+                              struct xen_tmem_oid *oidp, uint32_t index)
 {
     struct tmem_object_root *obj;
     struct tmem_page_descriptor *pgp;
@@ -1809,7 +1809,7 @@ out:
 }
 
 static int do_tmem_flush_object(struct tmem_pool *pool,
-                                struct xen_sysctl_tmem_oid *oidp)
+                                struct xen_tmem_oid *oidp)
 {
     struct tmem_object_root *obj;
 
@@ -2434,7 +2434,7 @@ static int tmemc_save_get_next_page(int cli_id, uint32_t pool_id,
     struct tmem_pool *pool = (client == NULL || pool_id >= MAX_POOLS_PER_DOMAIN)
                    ? NULL : client->pools[pool_id];
     struct tmem_page_descriptor *pgp;
-    struct xen_sysctl_tmem_oid oid;
+    struct xen_tmem_oid oid;
     int ret = 0;
     struct tmem_handle h;
 
@@ -2528,7 +2528,7 @@ out:
 }
 
 static int tmemc_restore_put_page(int cli_id, uint32_t pool_id,
-                                  struct xen_sysctl_tmem_oid *oidp,
+                                  struct xen_tmem_oid *oidp,
                                   uint32_t index, tmem_cli_va_param_t buf,
                                   uint32_t bufsize)
 {
@@ -2547,7 +2547,7 @@ static int tmemc_restore_put_page(int cli_id, uint32_t pool_id,
 }
 
 static int tmemc_restore_flush_page(int cli_id, uint32_t pool_id,
-                                    struct xen_sysctl_tmem_oid *oidp,
+                                    struct xen_tmem_oid *oidp,
                                     uint32_t index)
 {
     struct client *client = tmem_client_from_cli_id(cli_id);
@@ -2564,7 +2564,7 @@ int tmem_control(struct xen_sysctl_tmem_op *op)
     int ret;
     uint32_t pool_id = op->pool_id;
     uint32_t cmd = op->cmd;
-    struct xen_sysctl_tmem_oid *oidp = &op->oid;
+    struct xen_tmem_oid *oidp = &op->oid;
 
     if ( op->pad != 0 )
         return -EINVAL;
@@ -2638,7 +2638,7 @@ long do_tmem_op(tmem_cli_op_t uops)
     struct tmem_op op;
     struct client *client = current->domain->tmem_client;
     struct tmem_pool *pool = NULL;
-    struct xen_sysctl_tmem_oid *oidp;
+    struct xen_tmem_oid *oidp;
     int rc = 0;
     bool_t succ_get = 0, succ_put = 0;
     bool_t non_succ_get = 0, non_succ_put = 0;
@@ -2719,7 +2719,7 @@ long do_tmem_op(tmem_cli_op_t uops)
             write_unlock(&tmem_rwlock);
             read_lock(&tmem_rwlock);
 
-            oidp = (struct xen_sysctl_tmem_oid *)&op.u.gen.oid[0];
+            oidp = &op.u.gen.oid;
             switch ( op.cmd )
             {
             case TMEM_NEW_POOL:
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index ee5b33f..e0bdda0 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -737,10 +737,13 @@ DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cat_op_t);
 #define XEN_SYSCTL_TMEM_OP_RESTORE_PUT_PAGE       32
 #define XEN_SYSCTL_TMEM_OP_RESTORE_FLUSH_PAGE     33
 
-struct xen_sysctl_tmem_oid {
+#ifndef TMEM_OID_DEFINED
+#define TMEM_OID_DEFINED
+struct xen_tmem_oid {
     uint64_t oid[3];
 };
-typedef struct xen_sysctl_tmem_oid xen_sysctl_tmem_oid_t;
+typedef struct xen_tmem_oid xen_tmem_oid_t;
+#endif
 
 struct xen_sysctl_tmem_op {
     uint32_t cmd;       /* IN: XEN_SYSCTL_TMEM_OP_* . */
@@ -751,7 +754,7 @@ struct xen_sysctl_tmem_op {
     uint32_t arg1;      /* IN: If not applicable to command use 0. */
     uint32_t arg2;      /* IN: If not applicable to command use 0. */
     uint32_t pad;       /* Padding so structure is the same under 32 and 64. */
-    xen_sysctl_tmem_oid_t oid;     /* IN: If not applicable to command use 0. */
+    xen_tmem_oid_t oid; /* IN: If not applicable to command use 0. */
     XEN_GUEST_HANDLE_64(char) buf; /* IN/OUT: Buffer to save and restore ops. */
 };
 typedef struct xen_sysctl_tmem_op xen_sysctl_tmem_op_t;
diff --git a/xen/include/public/tmem.h b/xen/include/public/tmem.h
index e4ee704..24ffa58 100644
--- a/xen/include/public/tmem.h
+++ b/xen/include/public/tmem.h
@@ -73,6 +73,13 @@
 #define EFROZEN                 1000
 #define EEMPTY                  1001
 
+#ifndef TMEM_OID_DEFINED
+#define TMEM_OID_DEFINED
+struct xen_tmem_oid {
+    uint64_t oid[3];
+};
+typedef struct xen_tmem_oid xen_tmem_oid_t;
+#endif
 
 #ifndef __ASSEMBLY__
 #if __XEN_INTERFACE_VERSION__ < 0x00040400
@@ -89,7 +96,11 @@ struct tmem_op {
             uint32_t arg1;
         } creat; /* for cmd == TMEM_NEW_POOL, TMEM_AUTH, TMEM_RESTORE_NEW */
         struct {
+#if __XEN_INTERFACE_VERSION__ < 0x00040600
             uint64_t oid[3];
+#else
+            xen_tmem_oid_t oid;
+#endif
             uint32_t index;
             uint32_t tmem_offset;
             uint32_t pfn_offset;
@@ -104,7 +115,7 @@ DEFINE_XEN_GUEST_HANDLE(tmem_op_t);
 struct tmem_handle {
     uint32_t pool_id;
     uint32_t index;
-    uint64_t oid[3];
+    xen_tmem_oid_t oid;
 };
 #endif
 
-- 
2.1.0


[-- Attachment #4: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl.
  2015-08-28 14:15         ` Konrad Rzeszutek Wilk
@ 2015-08-28 14:40           ` Jan Beulich
  0 siblings, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2015-08-28 14:40 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Andrew Cooper, wei.liu2, xen-devel

>>> On 28.08.15 at 16:15, <konrad.wilk@oracle.com> wrote:
> Then I decided to see if I can expand that to also be part of the
> 'tmem_op', which looked legit (as it is the same size and offset and
> pahole wise it looks right).
> 
> But sadly the compat layer is not happy with me:
> 
> 
> In file included from 
> /home/konrad/ssd/konrad/xen/xen/include/xen/compat.h:12:0,
>                  from 
> /home/konrad/ssd/konrad/xen/xen/include/compat/xen.h:3,
>                  from 
> /home/konrad/ssd/konrad/xen/xen/include/xen/shared.h:6,
>                  from /home/konrad/ssd/konrad/xen/xen/include/xen/sched.h:7,
>                  from setup.c:5:
> /home/konrad/ssd/konrad/xen/xen/include/xen/tmem_xen.h: In function 
> ‘tmem_get_tmemop_from_client’:
> /home/konrad/ssd/konrad/xen/xen/include/compat/xlat.h:935:26: error: 
> incompatible types when assigning to type ‘xen_tmem_oid_t’ from type 
> ‘compat_tmem_oid_t’
>          (_d_)->u.gen.oid = (_s_)->u.gen.oid; \

I think the main missing piece is the addition of the new structure
to include/xlat.lst. See xen_processor_performance and its
sub-structure struct xen_pct_register (which generally wouldn't
need translation) for reference. (Even if two structures are laid
out identically you still can't assign one to the other by means
other than a member-by-member copy.)

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2015-08-28 14:40 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-08-27 11:01 [PATCH v2] Tmem bug-fixes and cleanups Konrad Rzeszutek Wilk
2015-08-27 11:01 ` [PATCH v2 1/8] tmem: Don't crash/hang/leak hypervisor when using shared pools within an guest Konrad Rzeszutek Wilk
2015-08-27 11:21   ` Wei Liu
2015-08-27 11:01 ` [PATCH v2 2/8] tmem: Add ASSERT in obj_rb_insert for pool->rwlock lock Konrad Rzeszutek Wilk
2015-08-27 14:55   ` Jan Beulich
2015-08-27 11:01 ` [PATCH v2 3/8] tmem: remove in xc_tmem_control_oid duplicate set_xen_guest_handle call Konrad Rzeszutek Wilk
2015-08-27 11:01 ` [PATCH v2 4/8] tmem: Remove xc_tmem_control mystical arg3 Konrad Rzeszutek Wilk
2015-08-27 11:15   ` Wei Liu
2015-08-27 12:53   ` Andrew Cooper
2015-08-27 18:38     ` Konrad Rzeszutek Wilk
2015-08-27 11:02 ` [PATCH v2 5/8] tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl Konrad Rzeszutek Wilk
2015-08-27 11:22   ` Wei Liu
2015-08-27 13:12   ` Andrew Cooper
2015-08-27 15:13     ` Jan Beulich
2015-08-27 18:47     ` Konrad Rzeszutek Wilk
2015-08-27 18:58       ` Andrew Cooper
2015-08-27 19:20         ` Konrad Rzeszutek Wilk
2015-08-28  8:31           ` Jan Beulich
2015-08-28 14:15         ` Konrad Rzeszutek Wilk
2015-08-28 14:40           ` Jan Beulich
2015-08-27 15:11   ` Jan Beulich
2015-08-27 18:43     ` Konrad Rzeszutek Wilk
2015-08-27 19:02       ` Andrew Cooper
2015-08-28  7:03       ` Jan Beulich
2015-08-27 11:02 ` [PATCH v2 6/8] tmem: Remove the old tmem control XSM checks as it is part of sysctl hypercall Konrad Rzeszutek Wilk
2015-08-27 20:59   ` Daniel De Graaf
2015-08-27 11:02 ` [PATCH v2 7/8] tmem: Remove extra spaces at end and some hard tabbing Konrad Rzeszutek Wilk
2015-08-27 13:19   ` Andrew Cooper
2015-08-27 11:02 ` [PATCH v2 8/8] tmem: Spelling mistakes Konrad Rzeszutek Wilk
2015-08-27 15:14   ` Jan Beulich
2015-08-27 13:30 ` [PATCH v2] Tmem bug-fixes and cleanups Andrew Cooper
2015-08-27 14:10 ` Wei Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).