xen-devel.lists.xenproject.org archive mirror
* [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature
@ 2013-12-05  9:38 Dongxiao Xu
  2013-12-05  9:38 ` [PATCH v6 1/7] x86: detect and initialize Cache QoS Monitoring feature Dongxiao Xu
                   ` (7 more replies)
  0 siblings, 8 replies; 33+ messages in thread
From: Dongxiao Xu @ 2013-12-05  9:38 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, JBeulich, dgdegra

Changes from v5:
 - Address comments from Dario Faggioli, including:
   * Define a new libxl_cqminfo structure to avoid reference of xc
     structure in libxl functions.
   * Use LOGE() instead of the LIBXL__LOG() functions.

Changes from v4:
 - When comparing the xl cqm parameter, use strcmp instead of strncmp;
   otherwise "xl pqos-attach cqmabcd domid" would be considered a valid
   command line.
 - Address comments from Andrew Cooper, including:
   * Adjust the pqos parameter parsing function.
   * Modify the pqos related documentation.
   * Add a check for opt_cqm_max_rmid in initialization code.
   * Do not IPI a CPU that is on the same socket as the current CPU.
 - Address comments from Dario Faggioli, including:
   * Fix a typo in the exported symbols.
   * Return correct libxl error code for qos related functions.
   * Abstract the error printing logic into a function.
 - Address comments from Daniel De Graaf, including:
   * Add a return value for the pqos related check.
 - Address comments from Konrad Rzeszutek Wilk, including:
   * Modify the GPLv2 related file header, remove the address.

Changes from v3:
 - Use structure to better organize CQM related global variables.
 - Address comments from Andrew Cooper, including:
   * Remove the domain creation flag for CQM RMID allocation.
   * Adjust the boot parameter format, use custom_param().
   * Add documentation for the newly added boot parameter.
   * Change QoS type flag to be uint64_t.
   * Initialize the per socket cpu bitmap at system boot time.
   * Remove get_cqm_avail() function.
   * Misc of format changes.
 - Address comments from Daniel De Graaf, including:
   * Use avc_current_has_perm() for XEN2__PQOS_OP that belongs to SECCLASS_XEN2.

Changes from v2:
 - Address comments from Andrew Cooper, including:
   * Merging tools stack changes into one patch.
   * Reduce the IPI number to one per socket.
   * Change structures for CQM data exchange between tools and Xen.
   * Misc of format/variable/function name changes.
 - Address comments from Konrad Rzeszutek Wilk, including:
   * Simplify the error printing logic.
   * Add an xsm check for the newly added hypercalls.

Changes from v1:
 - Address comments from Andrew Cooper, including:
   * Change function names, e.g., alloc_cqm_rmid(), system_supports_cqm(), etc.
   * Change some structure element order to save packing cost.
   * Correct some function's return value.
   * Some programming style changes.
   * ...

Future generations of Intel Xeon processors may offer a monitoring capability
in each logical processor to measure specific quality-of-service metrics,
for example Cache QoS Monitoring (CQM) to obtain L3 cache occupancy.
For detailed information, please refer to Intel SDM chapter 17.14.

Cache QoS Monitoring provides a layer of abstraction between applications and
logical processors through the use of Resource Monitoring IDs (RMIDs).
In the Xen design, each guest in the system can be assigned an RMID
independently, while RMID=0 is reserved for domains that do not enable the CQM
service. When any of the domain's vcpus is scheduled on a logical processor,
the domain's RMID is activated by programming the value into one specific MSR,
and when the vcpu is scheduled out, RMID=0 is programmed into that MSR.
The Cache QoS hardware tracks cache utilization of memory accesses according to
the RMIDs and reports monitored data via a counter register. With this solution,
we can learn how much L3 cache is used by a certain guest.

To attach the CQM service to a certain guest, two approaches are provided:
1) Create the guest with "pqos_cqm=1" set in the configuration file.
2) Use "xl pqos-attach cqm domid" for a running guest.

To detach the CQM service from a guest, users can:
1) Use "xl pqos-detach cqm domid" for a running guest.
2) Destroy the guest, which also detaches the CQM service.

To get the L3 cache usage, users can use the command:
$ xl pqos-list cqm (domid)

The data below is just an example showing how the CQM related data is exposed
to the end user.

[root@localhost]# xl pqos-list cqm
Name               ID  SocketID        L3C_Usage       SocketID        L3C_Usage
Domain-0            0         0         20127744              1         25231360
ExampleHVMDomain    1         0          3211264              1         10551296

RMID count    56        RMID available    53

Dongxiao Xu (7):
  x86: detect and initialize Cache QoS Monitoring feature
  x86: dynamically attach/detach CQM service for a guest
  x86: initialize per socket cpu map
  x86: collect CQM information from all sockets
  x86: enable CQM monitoring for each domain RMID
  xsm: add platform QoS related xsm policies
  tools: enable Cache QoS Monitoring feature for libxl/libxc

 docs/misc/xen-command-line.markdown          |    7 +
 tools/flask/policy/policy/modules/xen/xen.if |    2 +-
 tools/flask/policy/policy/modules/xen/xen.te |    5 +-
 tools/libxc/xc_domain.c                      |   47 +++++
 tools/libxc/xenctrl.h                        |   11 ++
 tools/libxl/Makefile                         |    3 +-
 tools/libxl/libxl.h                          |    6 +
 tools/libxl/libxl_pqos.c                     |  163 +++++++++++++++++
 tools/libxl/libxl_types.idl                  |   13 ++
 tools/libxl/xl.h                             |    3 +
 tools/libxl/xl_cmdimpl.c                     |  133 ++++++++++++++
 tools/libxl/xl_cmdtable.c                    |   15 ++
 xen/arch/x86/Makefile                        |    1 +
 xen/arch/x86/cpu/intel.c                     |    6 +
 xen/arch/x86/domain.c                        |    8 +
 xen/arch/x86/domctl.c                        |   40 +++++
 xen/arch/x86/pqos.c                          |  242 ++++++++++++++++++++++++++
 xen/arch/x86/setup.c                         |    3 +
 xen/arch/x86/smp.c                           |    7 +-
 xen/arch/x86/smpboot.c                       |   19 +-
 xen/arch/x86/sysctl.c                        |   65 +++++++
 xen/include/asm-x86/cpufeature.h             |    1 +
 xen/include/asm-x86/domain.h                 |    2 +
 xen/include/asm-x86/msr-index.h              |    5 +
 xen/include/asm-x86/pqos.h                   |   51 ++++++
 xen/include/asm-x86/smp.h                    |    2 +
 xen/include/public/domctl.h                  |   20 +++
 xen/include/public/sysctl.h                  |   10 ++
 xen/include/xen/cpumask.h                    |    1 +
 xen/xsm/flask/hooks.c                        |    8 +
 xen/xsm/flask/policy/access_vectors          |   17 +-
 31 files changed, 907 insertions(+), 9 deletions(-)
 create mode 100644 tools/libxl/libxl_pqos.c
 create mode 100644 xen/arch/x86/pqos.c
 create mode 100644 xen/include/asm-x86/pqos.h

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v6 1/7] x86: detect and initialize Cache QoS Monitoring feature
  2013-12-05  9:38 [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature Dongxiao Xu
@ 2013-12-05  9:38 ` Dongxiao Xu
  2013-12-05 10:54   ` Andrew Cooper
  2014-01-20 13:00   ` Jan Beulich
  2013-12-05  9:38 ` [PATCH v6 2/7] x86: dynamically attach/detach CQM service for a guest Dongxiao Xu
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 33+ messages in thread
From: Dongxiao Xu @ 2013-12-05  9:38 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, JBeulich, dgdegra

Detect the platform QoS feature status and enumerate the resource types,
one of which is monitoring of L3 cache occupancy.

Also introduce a Xen command line parameter to control the
QoS feature status.

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
---
 docs/misc/xen-command-line.markdown |    7 ++
 xen/arch/x86/Makefile               |    1 +
 xen/arch/x86/cpu/intel.c            |    6 ++
 xen/arch/x86/pqos.c                 |  130 +++++++++++++++++++++++++++++++++++
 xen/arch/x86/setup.c                |    3 +
 xen/include/asm-x86/cpufeature.h    |    1 +
 xen/include/asm-x86/pqos.h          |   38 ++++++++++
 7 files changed, 186 insertions(+)
 create mode 100644 xen/arch/x86/pqos.c
 create mode 100644 xen/include/asm-x86/pqos.h

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 15aa404..7751ffe 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -770,6 +770,13 @@ This option can be specified more than once (up to 8 times at present).
 ### ple\_window
 > `= <integer>`
 
+### pqos (Intel)
+> `= List of ( <boolean> | cqm:<boolean> | cqm_max_rmid:<integer> )`
+
+> Default: `pqos=1,cqm:1,cqm_max_rmid:255`
+
+Configure platform QoS services.
+
 ### reboot
 > `= t[riple] | k[bd] | n[o] [, [w]arm | [c]old]`
 
diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index d502bdf..54962e0 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -58,6 +58,7 @@ obj-y += crash.o
 obj-y += tboot.o
 obj-y += hpet.o
 obj-y += xstate.o
+obj-y += pqos.o
 
 obj-$(crash_debug) += gdbstub.o
 
diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
index 27fe762..f0d83ea 100644
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -230,6 +230,12 @@ static void __devinit init_intel(struct cpuinfo_x86 *c)
 	     ( c->cpuid_level >= 0x00000006 ) &&
 	     ( cpuid_eax(0x00000006) & (1u<<2) ) )
 		set_bit(X86_FEATURE_ARAT, c->x86_capability);
+
+	/* Check platform QoS monitoring capability */
+	if ((c->cpuid_level >= 0x00000007) &&
+	    (cpuid_ebx(0x00000007) & (1u<<12)))
+		set_bit(X86_FEATURE_QOSM, c->x86_capability);
+
 }
 
 static struct cpu_dev intel_cpu_dev __cpuinitdata = {
diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c
new file mode 100644
index 0000000..d78048a
--- /dev/null
+++ b/xen/arch/x86/pqos.c
@@ -0,0 +1,130 @@
+/*
+ * pqos.c: Platform QoS related service for guest.
+ *
+ * Copyright (c) 2013, Intel Corporation
+ * Author: Jiongxi Li  <jiongxi.li@intel.com>
+ * Author: Dongxiao Xu <dongxiao.xu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+#include <asm/processor.h>
+#include <xen/init.h>
+#include <asm/pqos.h>
+
+static bool_t __initdata opt_pqos = 1;
+static bool_t __initdata opt_cqm = 1;
+static unsigned int __initdata opt_cqm_max_rmid = 255;
+
+static void __init parse_pqos_param(char *s)
+{
+    char *ss, *val_str;
+    int val;
+
+    do {
+        ss = strchr(s, ',');
+        if ( ss )
+            *ss = '\0';
+
+        val = parse_bool(s);
+        if ( val >= 0 )
+            opt_pqos = val;
+        else
+        {
+            val = !!strncmp(s, "no-", 3);
+            if ( !val )
+                s += 3;
+
+            val_str = strchr(s, ':');
+            if ( val_str )
+                *val_str++ = '\0';
+
+            if ( val_str && !strcmp(s, "cqm") &&
+                 (val = parse_bool(val_str)) >= 0 )
+                opt_cqm = val;
+            else if ( val_str && !strcmp(s, "cqm_max_rmid") )
+                opt_cqm_max_rmid = simple_strtoul(val_str, NULL, 0);
+        }
+
+        s = ss + 1;
+    } while ( ss );
+}
+
+custom_param("pqos", parse_pqos_param);
+
+struct pqos_cqm *cqm;
+
+static void __init init_cqm(void)
+{
+    unsigned int rmid;
+    unsigned int eax, edx;
+
+    if ( !opt_cqm_max_rmid )
+        return;
+
+    cqm = xzalloc(struct pqos_cqm);
+    if ( !cqm )
+        return;
+
+    cpuid_count(0xf, 1, &eax, &cqm->upscaling_factor, &cqm->max_rmid, &edx);
+    if ( !(edx & QOS_MONITOR_EVTID_L3) )
+    {
+        xfree(cqm);
+        return;
+    }
+
+    cqm->min_rmid = 1;
+    cqm->max_rmid = min(opt_cqm_max_rmid, cqm->max_rmid);
+
+    cqm->rmid_to_dom = xmalloc_array(domid_t, cqm->max_rmid + 1);
+    if ( !cqm->rmid_to_dom )
+    {
+        xfree(cqm);
+        return;
+    }
+
+    /* Reserve RMID 0 for all domains not being monitored */
+    cqm->rmid_to_dom[0] = DOMID_XEN;
+    for ( rmid = cqm->min_rmid; rmid <= cqm->max_rmid; rmid++ )
+        cqm->rmid_to_dom[rmid] = DOMID_INVALID;
+
+    printk(XENLOG_INFO "Cache QoS Monitoring Enabled.\n");
+}
+
+static void __init init_qos_monitor(void)
+{
+    unsigned int qm_features;
+    unsigned int eax, ebx, ecx;
+
+    if ( !(boot_cpu_has(X86_FEATURE_QOSM)) )
+        return;
+
+    cpuid_count(0xf, 0, &eax, &ebx, &ecx, &qm_features);
+
+    if ( opt_cqm && (qm_features & QOS_MONITOR_TYPE_L3) )
+        init_cqm();
+}
+
+void __init init_platform_qos(void)
+{
+    if ( !opt_pqos )
+        return;
+
+    init_qos_monitor();
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 5bf4ee0..95418e4 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -48,6 +48,7 @@
 #include <asm/setup.h>
 #include <xen/cpu.h>
 #include <asm/nmi.h>
+#include <asm/pqos.h>
 
 /* opt_nosmp: If true, secondary processors are ignored. */
 static bool_t __initdata opt_nosmp;
@@ -1402,6 +1403,8 @@ void __init __start_xen(unsigned long mbi_p)
 
     domain_unpause_by_systemcontroller(dom0);
 
+    init_platform_qos();
+
     reset_stack_and_jump(init_done);
 }
 
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 1cfaf94..ca59668 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -147,6 +147,7 @@
 #define X86_FEATURE_ERMS	(7*32+ 9) /* Enhanced REP MOVSB/STOSB */
 #define X86_FEATURE_INVPCID	(7*32+10) /* Invalidate Process Context ID */
 #define X86_FEATURE_RTM 	(7*32+11) /* Restricted Transactional Memory */
+#define X86_FEATURE_QOSM	(7*32+12) /* Platform QoS monitoring capability */
 #define X86_FEATURE_NO_FPU_SEL 	(7*32+13) /* FPU CS/DS stored as zero */
 #define X86_FEATURE_SMAP	(7*32+20) /* Supervisor Mode Access Prevention */
 
diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h
new file mode 100644
index 0000000..da925db
--- /dev/null
+++ b/xen/include/asm-x86/pqos.h
@@ -0,0 +1,38 @@
+/*
+ * pqos.h: Platform QoS related service for guest.
+ *
+ * Copyright (c) 2013, Intel Corporation
+ * Author: Jiongxi Li  <jiongxi.li@intel.com>
+ * Author: Dongxiao Xu <dongxiao.xu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+#ifndef ASM_PQOS_H
+#define ASM_PQOS_H
+
+#include <public/xen.h>
+
+/* QoS Resource Type Enumeration */
+#define QOS_MONITOR_TYPE_L3            0x2
+
+/* QoS Monitoring Event ID */
+#define QOS_MONITOR_EVTID_L3           0x1
+
+struct pqos_cqm {
+    unsigned int min_rmid;
+    unsigned int max_rmid;
+    unsigned int upscaling_factor;
+    domid_t *rmid_to_dom;
+};
+extern struct pqos_cqm *cqm;
+
+void init_platform_qos(void);
+
+#endif
-- 
1.7.9.5


* [PATCH v6 2/7] x86: dynamically attach/detach CQM service for a guest
  2013-12-05  9:38 [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature Dongxiao Xu
  2013-12-05  9:38 ` [PATCH v6 1/7] x86: detect and initialize Cache QoS Monitoring feature Dongxiao Xu
@ 2013-12-05  9:38 ` Dongxiao Xu
  2014-01-20 13:13   ` Jan Beulich
  2013-12-05  9:38 ` [PATCH v6 3/7] x86: initialize per socket cpu map Dongxiao Xu
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 33+ messages in thread
From: Dongxiao Xu @ 2013-12-05  9:38 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, JBeulich, dgdegra

Add hypervisor side support for dynamically attaching and detaching the
CQM service for a certain guest.

When the CQM service is attached to a guest, the system allocates an
RMID for it. When the service is detached or the guest is shut down,
the RMID is reclaimed for future use.

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
---
 xen/arch/x86/domain.c        |    3 +++
 xen/arch/x86/domctl.c        |   40 +++++++++++++++++++++++++++++++++
 xen/arch/x86/pqos.c          |   51 ++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/domain.h |    2 ++
 xen/include/asm-x86/pqos.h   |    5 +++++
 xen/include/public/domctl.h  |   11 +++++++++
 6 files changed, 112 insertions(+)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index a3868f9..90e52a2 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -60,6 +60,7 @@
 #include <xen/numa.h>
 #include <xen/iommu.h>
 #include <compat/vcpu.h>
+#include <asm/pqos.h>
 
 DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
 DEFINE_PER_CPU(unsigned long, cr4);
@@ -612,6 +613,8 @@ void arch_domain_destroy(struct domain *d)
 
     free_xenheap_page(d->shared_info);
     cleanup_domain_irq_mapping(d);
+
+    free_cqm_rmid(d);
 }
 
 unsigned long pv_guest_cr4_fixup(const struct vcpu *v, unsigned long guest_cr4)
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index f7e4586..7007990 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -35,6 +35,7 @@
 #include <asm/mem_sharing.h>
 #include <asm/xstate.h>
 #include <asm/debugger.h>
+#include <asm/pqos.h>
 
 static int gdbsx_guest_mem_io(
     domid_t domid, struct xen_domctl_gdbsx_memio *iop)
@@ -1223,6 +1224,45 @@ long arch_do_domctl(
     }
     break;
 
+    case XEN_DOMCTL_attach_pqos:
+    {
+        if ( domctl->u.qos_type.flags & XEN_DOMCTL_pqos_cqm )
+        {
+            if ( !system_supports_cqm() )
+                ret = -ENODEV;
+            else if ( d->arch.pqos_cqm_rmid > 0 )
+                ret = -EEXIST;
+            else
+            {
+                ret = alloc_cqm_rmid(d);
+                if ( ret < 0 )
+                    ret = -EUSERS;
+            }
+        }
+        else
+            ret = -EINVAL;
+    }
+    break;
+
+    case XEN_DOMCTL_detach_pqos:
+    {
+        if ( domctl->u.qos_type.flags & XEN_DOMCTL_pqos_cqm )
+        {
+            if ( !system_supports_cqm() )
+                ret = -ENODEV;
+            else if ( d->arch.pqos_cqm_rmid > 0 )
+            {
+                free_cqm_rmid(d);
+                ret = 0;
+            }
+            else
+                ret = -ENOENT;
+        }
+        else
+            ret = -EINVAL;
+    }
+    break;
+
     default:
         ret = iommu_do_domctl(domctl, d, u_domctl);
         break;
diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c
index d78048a..67d733e 100644
--- a/xen/arch/x86/pqos.c
+++ b/xen/arch/x86/pqos.c
@@ -16,6 +16,7 @@
  */
 #include <asm/processor.h>
 #include <xen/init.h>
+#include <xen/spinlock.h>
 #include <asm/pqos.h>
 
 static bool_t __initdata opt_pqos = 1;
@@ -59,6 +60,7 @@ static void __init parse_pqos_param(char *s)
 custom_param("pqos", parse_pqos_param);
 
 struct pqos_cqm *cqm;
+static DEFINE_SPINLOCK(cqm_lock);
 
 static void __init init_cqm(void)
 {
@@ -119,6 +121,55 @@ void __init init_platform_qos(void)
     init_qos_monitor();
 }
 
+bool_t system_supports_cqm(void)
+{
+    return !!cqm;
+}
+
+int alloc_cqm_rmid(struct domain *d)
+{
+    int rc = 0;
+    unsigned int rmid;
+    unsigned long flags;
+
+    ASSERT(system_supports_cqm());
+
+    spin_lock_irqsave(&cqm_lock, flags);
+    for ( rmid = cqm->min_rmid; rmid <= cqm->max_rmid; rmid++ )
+    {
+        if ( cqm->rmid_to_dom[rmid] != DOMID_INVALID)
+            continue;
+
+        cqm->rmid_to_dom[rmid] = d->domain_id;
+        break;
+    }
+    spin_unlock_irqrestore(&cqm_lock, flags);
+
+    /* No CQM RMID available, assign RMID=0 by default */
+    if ( rmid > cqm->max_rmid )
+    {
+        rmid = 0;
+        rc = -1;
+    }
+
+    d->arch.pqos_cqm_rmid = rmid;
+
+    return rc;
+}
+
+void free_cqm_rmid(struct domain *d)
+{
+    unsigned int rmid = d->arch.pqos_cqm_rmid;
+
+    /* We do not free system reserved "RMID=0" */
+    if ( rmid == 0 )
+        return;
+
+    cqm->rmid_to_dom[rmid] = DOMID_INVALID;
+
+    d->arch.pqos_cqm_rmid = 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 9d39061..9487251 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -313,6 +313,8 @@ struct arch_domain
     spinlock_t e820_lock;
     struct e820entry *e820;
     unsigned int nr_e820;
+
+    unsigned int pqos_cqm_rmid;       /* CQM RMID assigned to the domain */
 } __cacheline_aligned;
 
 #define has_arch_pdevs(d)    (!list_empty(&(d)->arch.pdev_list))
diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h
index da925db..16c4882 100644
--- a/xen/include/asm-x86/pqos.h
+++ b/xen/include/asm-x86/pqos.h
@@ -16,6 +16,7 @@
  */
 #ifndef ASM_PQOS_H
 #define ASM_PQOS_H
+#include <xen/sched.h>
 
 #include <public/xen.h>
 
@@ -35,4 +36,8 @@ extern struct pqos_cqm *cqm;
 
 void init_platform_qos(void);
 
+bool_t system_supports_cqm(void);
+int alloc_cqm_rmid(struct domain *d);
+void free_cqm_rmid(struct domain *d);
+
 #endif
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 01a3652..d53e216 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -869,6 +869,14 @@ struct xen_domctl_set_max_evtchn {
 typedef struct xen_domctl_set_max_evtchn xen_domctl_set_max_evtchn_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_set_max_evtchn_t);
 
+struct xen_domctl_qos_type {
+#define _XEN_DOMCTL_pqos_cqm      0
+#define XEN_DOMCTL_pqos_cqm       (1U<<_XEN_DOMCTL_pqos_cqm)
+    uint64_t flags;
+};
+typedef struct xen_domctl_qos_type xen_domctl_qos_type_t;
+DEFINE_XEN_GUEST_HANDLE(xen_domctl_qos_type_t);
+
 struct xen_domctl {
     uint32_t cmd;
 #define XEN_DOMCTL_createdomain                   1
@@ -938,6 +946,8 @@ struct xen_domctl {
 #define XEN_DOMCTL_setnodeaffinity               68
 #define XEN_DOMCTL_getnodeaffinity               69
 #define XEN_DOMCTL_set_max_evtchn                70
+#define XEN_DOMCTL_attach_pqos                   71
+#define XEN_DOMCTL_detach_pqos                   72
 #define XEN_DOMCTL_gdbsx_guestmemio            1000
 #define XEN_DOMCTL_gdbsx_pausevcpu             1001
 #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
@@ -998,6 +1008,7 @@ struct xen_domctl {
         struct xen_domctl_set_broken_page_p2m set_broken_page_p2m;
         struct xen_domctl_gdbsx_pauseunp_vcpu gdbsx_pauseunp_vcpu;
         struct xen_domctl_gdbsx_domstatus   gdbsx_domstatus;
+        struct xen_domctl_qos_type          qos_type;
         uint8_t                             pad[128];
     } u;
 };
-- 
1.7.9.5


* [PATCH v6 3/7] x86: initialize per socket cpu map
  2013-12-05  9:38 [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature Dongxiao Xu
  2013-12-05  9:38 ` [PATCH v6 1/7] x86: detect and initialize Cache QoS Monitoring feature Dongxiao Xu
  2013-12-05  9:38 ` [PATCH v6 2/7] x86: dynamically attach/detach CQM service for a guest Dongxiao Xu
@ 2013-12-05  9:38 ` Dongxiao Xu
  2014-01-20 13:26   ` Jan Beulich
  2013-12-05  9:38 ` [PATCH v6 4/7] x86: collect CQM information from all sockets Dongxiao Xu
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 33+ messages in thread
From: Dongxiao Xu @ 2013-12-05  9:38 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, JBeulich, dgdegra

For each socket in the system, we create a separate bitmap to tag its
related CPUs. This per-socket bitmap is initialized at system start-up,
and adjusted when a CPU is dynamically brought online/offline.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
---
 xen/arch/x86/smp.c        |    7 ++++++-
 xen/arch/x86/smpboot.c    |   19 +++++++++++++++++--
 xen/include/asm-x86/smp.h |    2 ++
 xen/include/xen/cpumask.h |    1 +
 4 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c
index 0433f30..7959447 100644
--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -283,6 +283,9 @@ void smp_send_call_function_mask(const cpumask_t *mask)
 
 void __stop_this_cpu(void)
 {
+    int cpu = smp_processor_id();
+    int socket = cpu_to_socket(cpu);
+
     ASSERT(!local_irq_is_enabled());
 
     disable_local_APIC();
@@ -296,7 +299,9 @@ void __stop_this_cpu(void)
     clts();
     asm volatile ( "fninit" );
 
-    cpumask_clear_cpu(smp_processor_id(), &cpu_online_map);
+    cpumask_clear_cpu(cpu, &cpu_online_map);
+    if ( socket >= 0 )
+        cpumask_clear_cpu(cpu, &socket_cpu_map[socket]);
 }
 
 static void stop_this_cpu(void *dummy)
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 9f81c7b..bfebc2b 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -59,6 +59,9 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
 cpumask_t cpu_online_map __read_mostly;
 EXPORT_SYMBOL(cpu_online_map);
 
+cpumask_t socket_cpu_map[MAX_NUM_SOCKETS] __read_mostly;
+EXPORT_SYMBOL(socket_cpu_map);
+
 struct cpuinfo_x86 cpu_data[NR_CPUS];
 
 u32 x86_cpu_to_apicid[NR_CPUS] __read_mostly =
@@ -319,6 +322,7 @@ void start_secondary(void *unused)
      * want to limit the things done here to the most necessary things.
      */
     unsigned int cpu = booting_cpu;
+    int socket;
 
     set_processor_id(cpu);
     set_current(idle_vcpu[cpu]);
@@ -381,6 +385,9 @@ void start_secondary(void *unused)
     cpumask_set_cpu(cpu, &cpu_online_map);
     unlock_vector_lock();
 
+    if ( (socket = cpu_to_socket(cpu)) >= 0 )
+        cpumask_set_cpu(cpu, &socket_cpu_map[socket]);
+
     init_percpu_time();
 
     /* We can take interrupts now: we're officially "up". */
@@ -788,8 +795,13 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 
 void __init smp_prepare_boot_cpu(void)
 {
-    cpumask_set_cpu(smp_processor_id(), &cpu_online_map);
-    cpumask_set_cpu(smp_processor_id(), &cpu_present_map);
+    int cpu = smp_processor_id();
+    int socket = cpu_to_socket(cpu);
+
+    cpumask_set_cpu(cpu, &cpu_online_map);
+    cpumask_set_cpu(cpu, &cpu_present_map);
+    if ( socket >= 0 )
+        cpumask_set_cpu(cpu, &socket_cpu_map[socket]);
 }
 
 static void
@@ -819,6 +831,7 @@ remove_siblinginfo(int cpu)
 void __cpu_disable(void)
 {
     int cpu = smp_processor_id();
+    int socket = cpu_to_socket(cpu);
 
     set_cpu_state(CPU_STATE_DYING);
 
@@ -836,6 +849,8 @@ void __cpu_disable(void)
     /* It's now safe to remove this processor from the online map */
     cpumask_clear_cpu(cpu, cpupool0->cpu_valid);
     cpumask_clear_cpu(cpu, &cpu_online_map);
+    if ( socket >= 0 )
+        cpumask_clear_cpu(cpu, &socket_cpu_map[socket]);
     fixup_irqs();
 
     if ( cpu_disable_scheduler(cpu) )
diff --git a/xen/include/asm-x86/smp.h b/xen/include/asm-x86/smp.h
index 81f8610..f47fa1b 100644
--- a/xen/include/asm-x86/smp.h
+++ b/xen/include/asm-x86/smp.h
@@ -57,6 +57,8 @@ int hard_smp_processor_id(void);
 
 void __stop_this_cpu(void);
 
+#define MAX_NUM_SOCKETS 256
+
 #endif /* !__ASSEMBLY__ */
 
 #endif
diff --git a/xen/include/xen/cpumask.h b/xen/include/xen/cpumask.h
index 850b4a2..883a52a 100644
--- a/xen/include/xen/cpumask.h
+++ b/xen/include/xen/cpumask.h
@@ -419,6 +419,7 @@ static inline void free_cpumask_var(cpumask_var_t mask)
 extern cpumask_t cpu_possible_map;
 extern cpumask_t cpu_online_map;
 extern cpumask_t cpu_present_map;
+extern cpumask_t socket_cpu_map[];
 
 #if NR_CPUS > 1
 #define num_online_cpus()	cpumask_weight(&cpu_online_map)
-- 
1.7.9.5


* [PATCH v6 4/7] x86: collect CQM information from all sockets
  2013-12-05  9:38 [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature Dongxiao Xu
                   ` (2 preceding siblings ...)
  2013-12-05  9:38 ` [PATCH v6 3/7] x86: initialize per socket cpu map Dongxiao Xu
@ 2013-12-05  9:38 ` Dongxiao Xu
  2014-01-20 13:52   ` Jan Beulich
  2013-12-05  9:38 ` [PATCH v6 5/7] x86: enable CQM monitoring for each domain RMID Dongxiao Xu
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 33+ messages in thread
From: Dongxiao Xu @ 2013-12-05  9:38 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, JBeulich, dgdegra

Collect CQM information (L3 cache occupancy) from all sockets.
Upper layer applications can parse the data structure to get
information about a guest's L3 cache occupancy on certain sockets.

Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
---
 xen/arch/x86/pqos.c             |   49 +++++++++++++++++++++++++++++
 xen/arch/x86/sysctl.c           |   65 +++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/msr-index.h |    4 +++
 xen/include/asm-x86/pqos.h      |    7 +++++
 xen/include/public/domctl.h     |    9 ++++++
 xen/include/public/sysctl.h     |   10 ++++++
 6 files changed, 144 insertions(+)

diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c
index 67d733e..f585c10 100644
--- a/xen/arch/x86/pqos.c
+++ b/xen/arch/x86/pqos.c
@@ -15,6 +15,7 @@
  * more details.
  */
 #include <asm/processor.h>
+#include <asm/msr.h>
 #include <xen/init.h>
 #include <xen/spinlock.h>
 #include <asm/pqos.h>
@@ -126,6 +127,12 @@ bool_t system_supports_cqm(void)
     return !!cqm;
 }
 
+unsigned int get_cqm_count(void)
+{
+    ASSERT(system_supports_cqm());
+    return cqm->max_rmid + 1;
+}
+
 int alloc_cqm_rmid(struct domain *d)
 {
     int rc = 0;
@@ -170,6 +177,48 @@ void free_cqm_rmid(struct domain *d)
     d->arch.pqos_cqm_rmid = 0;
 }
 
+static void read_cqm_data(void *arg)
+{
+    uint64_t cqm_data;
+    unsigned int rmid;
+    int socket = cpu_to_socket(smp_processor_id());
+    struct xen_socket_cqmdata *data = arg;
+    unsigned long flags, i;
+
+    ASSERT(system_supports_cqm());
+
+    if ( socket < 0 )
+        return;
+
+    spin_lock_irqsave(&cqm_lock, flags);
+    for ( rmid = cqm->min_rmid; rmid <= cqm->max_rmid; rmid++ )
+    {
+        if ( cqm->rmid_to_dom[rmid] == DOMID_INVALID )
+            continue;
+
+        wrmsr(MSR_IA32_QOSEVTSEL, QOS_MONITOR_EVTID_L3, rmid);
+        rdmsrl(MSR_IA32_QMC, cqm_data);
+
+        i = socket * (cqm->max_rmid + 1) + rmid;
+        data[i].valid = !(cqm_data & IA32_QM_CTR_ERROR_MASK);
+        if ( data[i].valid )
+        {
+            data[i].l3c_occupancy = cqm_data * cqm->upscaling_factor;
+            data[i].socket = socket;
+            data[i].domid = cqm->rmid_to_dom[rmid];
+        }
+    }
+    spin_unlock_irqrestore(&cqm_lock, flags);
+}
+
+void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct xen_socket_cqmdata *data)
+{
+    /* Read CQM data in current CPU */
+    read_cqm_data(data);
+    /* Issue IPI to other CPUs to read CQM data */
+    on_selected_cpus(cpu_cqmdata_map, read_cqm_data, data, 1);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index 15d4b91..4dcf221 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -28,6 +28,7 @@
 #include <xen/nodemask.h>
 #include <xen/cpu.h>
 #include <xsm/xsm.h>
+#include <asm/pqos.h>
 
 #define get_xen_guest_handle(val, hnd)  do { val = (hnd).p; } while (0)
 
@@ -66,6 +67,24 @@ void arch_do_physinfo(xen_sysctl_physinfo_t *pi)
         pi->capabilities |= XEN_SYSCTL_PHYSCAP_hvm_directio;
 }
 
+/* Select one random CPU for each socket. Current CPU's socket is excluded */
+static void select_socket_cpu(cpumask_t *cpu_bitmap)
+{
+    int i;
+    unsigned int cpu;
+    unsigned int socket_curr = cpu_to_socket(smp_processor_id());
+
+    cpumask_clear(cpu_bitmap);
+    for ( i = 0; i < MAX_NUM_SOCKETS; i++ )
+    {
+        if ( i == socket_curr )
+            continue;
+        cpu = cpumask_any(&socket_cpu_map[i]);
+        if ( cpu < nr_cpu_ids )
+            cpumask_set_cpu(cpu, cpu_bitmap);
+    }
+}
+
 long arch_do_sysctl(
     struct xen_sysctl *sysctl, XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
 {
@@ -101,6 +120,52 @@ long arch_do_sysctl(
     }
     break;
 
+    case XEN_SYSCTL_getcqminfo:
+    {
+        struct xen_socket_cqmdata *info;
+        uint32_t num_sockets;
+        uint32_t num_rmids;
+        cpumask_t cpu_cqmdata_map;
+
+        if ( !system_supports_cqm() )
+        {
+            ret = -ENODEV;
+            break;
+        }
+
+        select_socket_cpu(&cpu_cqmdata_map);
+
+        num_sockets = min((unsigned int)cpumask_weight(&cpu_cqmdata_map) + 1,
+                          sysctl->u.getcqminfo.num_sockets);
+        num_rmids = get_cqm_count();
+        info = xzalloc_array(struct xen_socket_cqmdata,
+                             num_rmids * num_sockets);
+        if ( !info )
+        {
+            ret = -ENOMEM;
+            break;
+        }
+
+        get_cqm_info(&cpu_cqmdata_map, info);
+
+        if ( copy_to_guest_offset(sysctl->u.getcqminfo.buffer,
+                                  0, info, num_rmids * num_sockets) )
+        {
+            ret = -EFAULT;
+            xfree(info);
+            break;
+        }
+
+        sysctl->u.getcqminfo.num_rmids = num_rmids;
+        sysctl->u.getcqminfo.num_sockets = num_sockets;
+
+        if ( copy_to_guest(u_sysctl, sysctl, 1) )
+            ret = -EFAULT;
+
+        xfree(info);
+    }
+    break;
+
     default:
         ret = -ENOSYS;
         break;
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index e597a28..46ef165 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -488,4 +488,8 @@
 /* Geode defined MSRs */
 #define MSR_GEODE_BUSCONT_CONF0		0x00001900
 
+/* Platform QoS register */
+#define MSR_IA32_QOSEVTSEL             0x00000c8d
+#define MSR_IA32_QMC                   0x00000c8e
+
 #endif /* __ASM_MSR_INDEX_H */
diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h
index 16c4882..b1298b9 100644
--- a/xen/include/asm-x86/pqos.h
+++ b/xen/include/asm-x86/pqos.h
@@ -17,6 +17,8 @@
 #ifndef ASM_PQOS_H
 #define ASM_PQOS_H
 #include <xen/sched.h>
+#include <xen/cpumask.h>
+#include <public/domctl.h>
 
 #include <public/xen.h>
 
@@ -34,10 +36,15 @@ struct pqos_cqm {
 };
 extern struct pqos_cqm *cqm;
 
+/* IA32_QM_CTR */
+#define IA32_QM_CTR_ERROR_MASK         (0x3ul << 62)
+
 void init_platform_qos(void);
 
 bool_t system_supports_cqm(void);
 int alloc_cqm_rmid(struct domain *d);
 void free_cqm_rmid(struct domain *d);
+unsigned int get_cqm_count(void);
+void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct xen_socket_cqmdata *data);
 
 #endif
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index d53e216..563aeaf 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -877,6 +877,15 @@ struct xen_domctl_qos_type {
 typedef struct xen_domctl_qos_type xen_domctl_qos_type_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_qos_type_t);
 
+struct xen_socket_cqmdata {
+    uint64_t l3c_occupancy;
+    uint32_t socket;
+    domid_t  domid;
+    uint8_t  valid;
+};
+typedef struct xen_socket_cqmdata xen_socket_cqmdata_t;
+DEFINE_XEN_GUEST_HANDLE(xen_socket_cqmdata_t);
+
 struct xen_domctl {
     uint32_t cmd;
 #define XEN_DOMCTL_createdomain                   1
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 8437d31..cf240a7 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -632,6 +632,14 @@ struct xen_sysctl_coverage_op {
 typedef struct xen_sysctl_coverage_op xen_sysctl_coverage_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_coverage_op_t);
 
+/* XEN_SYSCTL_getcqminfo */
+struct xen_sysctl_getcqminfo {
+    XEN_GUEST_HANDLE_64(xen_socket_cqmdata_t) buffer; /* OUT */
+    uint32_t              num_sockets;    /* IN/OUT */
+    uint32_t              num_rmids;       /* OUT */
+};
+typedef struct xen_sysctl_getcqminfo xen_sysctl_getcqminfo_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_getcqminfo_t);
 
 struct xen_sysctl {
     uint32_t cmd;
@@ -654,6 +662,7 @@ struct xen_sysctl {
 #define XEN_SYSCTL_cpupool_op                    18
 #define XEN_SYSCTL_scheduler_op                  19
 #define XEN_SYSCTL_coverage_op                   20
+#define XEN_SYSCTL_getcqminfo                    21
     uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
     union {
         struct xen_sysctl_readconsole       readconsole;
@@ -675,6 +684,7 @@ struct xen_sysctl {
         struct xen_sysctl_cpupool_op        cpupool_op;
         struct xen_sysctl_scheduler_op      scheduler_op;
         struct xen_sysctl_coverage_op       coverage_op;
+        struct xen_sysctl_getcqminfo        getcqminfo;
         uint8_t                             pad[128];
     } u;
 };
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v6 5/7] x86: enable CQM monitoring for each domain RMID
  2013-12-05  9:38 [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature Dongxiao Xu
                   ` (3 preceding siblings ...)
  2013-12-05  9:38 ` [PATCH v6 4/7] x86: collect CQM information from all sockets Dongxiao Xu
@ 2013-12-05  9:38 ` Dongxiao Xu
  2014-01-20 13:58   ` Jan Beulich
  2013-12-05  9:38 ` [PATCH v6 6/7] xsm: add platform QoS related xsm policies Dongxiao Xu
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 33+ messages in thread
From: Dongxiao Xu @ 2013-12-05  9:38 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, JBeulich, dgdegra

If the CQM service is attached to a domain, its associated RMID is
written to the hardware for monitoring whenever one of the domain's
vcpus is scheduled in. When the vcpu is scheduled out, RMID 0 (reserved
by the system) is restored.

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
---
 xen/arch/x86/domain.c           |    5 +++++
 xen/arch/x86/pqos.c             |   12 ++++++++++++
 xen/include/asm-x86/msr-index.h |    1 +
 xen/include/asm-x86/pqos.h      |    1 +
 4 files changed, 19 insertions(+)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 90e52a2..79bc83c 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1366,6 +1366,8 @@ static void __context_switch(void)
     {
         memcpy(&p->arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES);
         vcpu_save_fpu(p);
+        if ( system_supports_cqm() )
+            cqm_assoc_rmid(0);
         p->arch.ctxt_switch_from(p);
     }
 
@@ -1390,6 +1392,9 @@ static void __context_switch(void)
         }
         vcpu_restore_fpu_eager(n);
         n->arch.ctxt_switch_to(n);
+
+        if ( system_supports_cqm() && n->domain->arch.pqos_cqm_rmid > 0 )
+            cqm_assoc_rmid(n->domain->arch.pqos_cqm_rmid);
     }
 
     gdt = !is_pv_32on64_vcpu(n) ? per_cpu(gdt_table, cpu) :
diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c
index f585c10..c51d285 100644
--- a/xen/arch/x86/pqos.c
+++ b/xen/arch/x86/pqos.c
@@ -60,6 +60,8 @@ static void __init parse_pqos_param(char *s)
 
 custom_param("pqos", parse_pqos_param);
 
+static uint64_t rmid_mask;
+
 struct pqos_cqm *cqm;
 static DEFINE_SPINLOCK(cqm_lock);
 
@@ -110,6 +112,8 @@ static void __init init_qos_monitor(void)
 
     cpuid_count(0xf, 0, &eax, &ebx, &ecx, &qm_features);
 
+    rmid_mask = ~(~0ull << get_count_order(ebx));
+
     if ( opt_cqm && (qm_features & QOS_MONITOR_TYPE_L3) )
         init_cqm();
 }
@@ -219,6 +223,14 @@ void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct xen_socket_cqmdata *data)
     on_selected_cpus(cpu_cqmdata_map, read_cqm_data, data, 1);
 }
 
+void cqm_assoc_rmid(unsigned int rmid)
+{
+    uint64_t val;
+
+    rdmsrl(MSR_IA32_PQR_ASSOC, val);
+    wrmsrl(MSR_IA32_PQR_ASSOC, (val & ~(rmid_mask)) | (rmid & rmid_mask));
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 46ef165..45f4918 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -491,5 +491,6 @@
 /* Platform QoS register */
 #define MSR_IA32_QOSEVTSEL             0x00000c8d
 #define MSR_IA32_QMC                   0x00000c8e
+#define MSR_IA32_PQR_ASSOC             0x00000c8f
 
 #endif /* __ASM_MSR_INDEX_H */
diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h
index b1298b9..5d53b7e 100644
--- a/xen/include/asm-x86/pqos.h
+++ b/xen/include/asm-x86/pqos.h
@@ -46,5 +46,6 @@ int alloc_cqm_rmid(struct domain *d);
 void free_cqm_rmid(struct domain *d);
 unsigned int get_cqm_count(void);
 void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct xen_socket_cqmdata *data);
+void cqm_assoc_rmid(unsigned int rmid);
 
 #endif
-- 
1.7.9.5


* [PATCH v6 6/7] xsm: add platform QoS related xsm policies
  2013-12-05  9:38 [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature Dongxiao Xu
                   ` (4 preceding siblings ...)
  2013-12-05  9:38 ` [PATCH v6 5/7] x86: enable CQM monitoring for each domain RMID Dongxiao Xu
@ 2013-12-05  9:38 ` Dongxiao Xu
  2013-12-05  9:38 ` [PATCH v6 7/7] tools: enable Cache QoS Monitoring feature for libxl/libxc Dongxiao Xu
  2013-12-07  5:09 ` [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature Xu, Dongxiao
  7 siblings, 0 replies; 33+ messages in thread
From: Dongxiao Xu @ 2013-12-05  9:38 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, JBeulich, dgdegra

Add XSM policies for the pqos attach/detach and get-CQM-info
hypercalls.

Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
---
 tools/flask/policy/policy/modules/xen/xen.if |    2 +-
 tools/flask/policy/policy/modules/xen/xen.te |    5 ++++-
 xen/xsm/flask/hooks.c                        |    8 ++++++++
 xen/xsm/flask/policy/access_vectors          |   17 ++++++++++++++---
 4 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/tools/flask/policy/policy/modules/xen/xen.if b/tools/flask/policy/policy/modules/xen/xen.if
index dedc035..1f683af 100644
--- a/tools/flask/policy/policy/modules/xen/xen.if
+++ b/tools/flask/policy/policy/modules/xen/xen.if
@@ -49,7 +49,7 @@ define(`create_domain_common', `
 			getdomaininfo hypercall setvcpucontext setextvcpucontext
 			getscheduler getvcpuinfo getvcpuextstate getaddrsize
 			getaffinity setaffinity };
-	allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim  set_max_evtchn };
+	allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim set_max_evtchn pqos_op };
 	allow $1 $2:security check_context;
 	allow $1 $2:shadow enable;
 	allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op };
diff --git a/tools/flask/policy/policy/modules/xen/xen.te b/tools/flask/policy/policy/modules/xen/xen.te
index bb59fe8..115fcfe 100644
--- a/tools/flask/policy/policy/modules/xen/xen.te
+++ b/tools/flask/policy/policy/modules/xen/xen.te
@@ -64,6 +64,9 @@ allow dom0_t xen_t:xen {
 	getidle debug getcpuinfo heap pm_op mca_op lockprof cpupool_op tmem_op
 	tmem_control getscheduler setscheduler
 };
+allow dom0_t xen_t:xen2 {
+	pqos_op
+};
 allow dom0_t xen_t:mmu memorymap;
 
 # Allow dom0 to use these domctls on itself. For domctls acting on other
@@ -76,7 +79,7 @@ allow dom0_t dom0_t:domain {
 	getpodtarget setpodtarget set_misc_info set_virq_handler
 };
 allow dom0_t dom0_t:domain2 {
-	set_cpuid gettsc settsc setscheduler set_max_evtchn
+	set_cpuid gettsc settsc setscheduler set_max_evtchn pqos_op
 };
 allow dom0_t dom0_t:resource { add remove };
 
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index b1e2593..ec065b0 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -730,6 +730,10 @@ static int flask_domctl(struct domain *d, int cmd)
     case XEN_DOMCTL_set_max_evtchn:
         return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__SET_MAX_EVTCHN);
 
+    case XEN_DOMCTL_attach_pqos:
+    case XEN_DOMCTL_detach_pqos:
+        return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__PQOS_OP);
+
     default:
         printk("flask_domctl: Unknown op %d\n", cmd);
         return -EPERM;
@@ -785,6 +789,10 @@ static int flask_sysctl(int cmd)
     case XEN_SYSCTL_numainfo:
         return domain_has_xen(current->domain, XEN__PHYSINFO);
 
+    case XEN_SYSCTL_getcqminfo:
+        return avc_current_has_perm(SECINITSID_XEN, SECCLASS_XEN2,
+                                    XEN2__PQOS_OP, NULL);
+
     default:
         printk("flask_sysctl: Unknown op %d\n", cmd);
         return -EPERM;
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 1fbe241..91af8b2 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -3,9 +3,9 @@
 #
 # class class_name { permission_name ... }
 
-# Class xen consists of dom0-only operations dealing with the hypervisor itself.
-# Unless otherwise specified, the source is the domain executing the hypercall,
-# and the target is the xen initial sid (type xen_t).
+# Classes xen and xen2 consist of dom0-only operations dealing with the
+# hypervisor itself. Unless otherwise specified, the source is the domain
+# executing the hypercall, and the target is the xen initial sid (type xen_t).
 class xen
 {
 # XENPF_settime
@@ -75,6 +75,14 @@ class xen
     setscheduler
 }
 
+# This is a continuation of class xen, since only 32 permissions can be
+# defined per class
+class xen2
+{
+# XEN_SYSCTL_getcqminfo
+    pqos_op
+}
+
 # Classes domain and domain2 consist of operations that a domain performs on
 # another domain or on itself.  Unless otherwise specified, the source is the
 # domain executing the hypercall, and the target is the domain being operated on
@@ -196,6 +204,9 @@ class domain2
     setclaim
 # XEN_DOMCTL_set_max_evtchn
     set_max_evtchn
+# XEN_DOMCTL_attach_pqos
+# XEN_DOMCTL_detach_pqos
+    pqos_op
 }
 
 # Similar to class domain, but primarily contains domctls related to HVM domains
-- 
1.7.9.5


* [PATCH v6 7/7] tools: enable Cache QoS Monitoring feature for libxl/libxc
  2013-12-05  9:38 [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature Dongxiao Xu
                   ` (5 preceding siblings ...)
  2013-12-05  9:38 ` [PATCH v6 6/7] xsm: add platform QoS related xsm policies Dongxiao Xu
@ 2013-12-05  9:38 ` Dongxiao Xu
  2013-12-07  5:09 ` [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature Xu, Dongxiao
  7 siblings, 0 replies; 33+ messages in thread
From: Dongxiao Xu @ 2013-12-05  9:38 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, JBeulich, dgdegra

Introduce two new xl commands to attach/detach the CQM service for a guest:
$ xl pqos-attach cqm domid
$ xl pqos-detach cqm domid

Introduce one new xl command to retrieve guest CQM information:
$ xl pqos-list cqm [domid]

Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
---
 tools/libxc/xc_domain.c     |   47 +++++++++++++
 tools/libxc/xenctrl.h       |   11 +++
 tools/libxl/Makefile        |    3 +-
 tools/libxl/libxl.h         |    6 ++
 tools/libxl/libxl_pqos.c    |  163 +++++++++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_types.idl |   13 ++++
 tools/libxl/xl.h            |    3 +
 tools/libxl/xl_cmdimpl.c    |  133 +++++++++++++++++++++++++++++++++++
 tools/libxl/xl_cmdtable.c   |   15 ++++
 9 files changed, 393 insertions(+), 1 deletion(-)
 create mode 100644 tools/libxl/libxl_pqos.c

diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 1ccafc5..769fae3 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -1776,6 +1776,53 @@ int xc_domain_set_max_evtchn(xc_interface *xch, uint32_t domid,
     return do_domctl(xch, &domctl);
 }
 
+int xc_domain_pqos_attach(xc_interface *xch, uint32_t domid, uint64_t flags)
+{
+    DECLARE_DOMCTL;
+    domctl.cmd = XEN_DOMCTL_attach_pqos;
+    domctl.domain = (domid_t)domid;
+    domctl.u.qos_type.flags = flags;
+    return do_domctl(xch, &domctl);
+}
+
+int xc_domain_pqos_detach(xc_interface *xch, uint32_t domid, uint64_t flags)
+{
+    DECLARE_DOMCTL;
+    domctl.cmd = XEN_DOMCTL_detach_pqos;
+    domctl.domain = (domid_t)domid;
+    domctl.u.qos_type.flags = flags;
+    return do_domctl(xch, &domctl);
+}
+
+int xc_domain_getcqminfolist(xc_interface *xch, sysctl_cqminfo_t *info)
+{
+    int ret = 0;
+    xen_socket_cqmdata_t *data = info->cqmdata;
+    DECLARE_SYSCTL;
+
+    DECLARE_HYPERCALL_BOUNCE(data,
+        info->num_rmids * info->num_sockets * sizeof(*data),
+        XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+
+    if ( xc_hypercall_bounce_pre(xch, data) )
+        return -1;
+
+    sysctl.cmd = XEN_SYSCTL_getcqminfo;
+    set_xen_guest_handle(sysctl.u.getcqminfo.buffer, data);
+
+    if ( xc_sysctl(xch, &sysctl) < 0 )
+        ret = -1;
+    else
+    {
+        info->num_sockets = sysctl.u.getcqminfo.num_sockets;
+        info->num_rmids = sysctl.u.getcqminfo.num_rmids;
+    }
+
+    xc_hypercall_bounce_post(xch, data);
+
+    return ret;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
index 4ac6b8a..a16f72e 100644
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -2395,4 +2395,15 @@ int xc_kexec_load(xc_interface *xch, uint8_t type, uint16_t arch,
  */
 int xc_kexec_unload(xc_interface *xch, int type);
 
+struct xc_sysctl_getcqminfo
+{
+    uint32_t num_rmids;
+    uint32_t num_sockets;
+    xen_socket_cqmdata_t *cqmdata;
+};
+typedef struct xc_sysctl_getcqminfo sysctl_cqminfo_t;
+
+int xc_domain_pqos_attach(xc_interface *xch, uint32_t domid, uint64_t flags);
+int xc_domain_pqos_detach(xc_interface *xch, uint32_t domid, uint64_t flags);
+int xc_domain_getcqminfolist(xc_interface *xch, sysctl_cqminfo_t *info);
 #endif /* XENCTRL_H */
diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index cf214bb..35f0b97 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -74,7 +74,8 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
 			libxl_internal.o libxl_utils.o libxl_uuid.o \
 			libxl_json.o libxl_aoutils.o libxl_numa.o \
 			libxl_save_callout.o _libxl_save_msgs_callout.o \
-			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
+			libxl_qmp.o libxl_event.o libxl_fork.o libxl_pqos.o \
+			$(LIBXL_OBJS-y)
 LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
 
 $(LIBXL_OBJS): CFLAGS += $(CFLAGS_LIBXL) -include $(XEN_ROOT)/tools/config.h
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index c7dceda..d2da424 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -1051,6 +1051,12 @@ int libxl_flask_getenforce(libxl_ctx *ctx);
 int libxl_flask_setenforce(libxl_ctx *ctx, int mode);
 int libxl_flask_loadpolicy(libxl_ctx *ctx, void *policy, uint32_t size);
 
+int libxl_pqos_attach(libxl_ctx *ctx, uint32_t domid, const char * qos_type);
+int libxl_pqos_detach(libxl_ctx *ctx, uint32_t domid, const char * qos_type);
+libxl_cqminfo * libxl_get_cqm_info(libxl_ctx *ctx,
+                                   uint32_t first_domain,
+                                   unsigned int num_domains);
+
 /* misc */
 
 /* Each of these sets or clears the flag according to whether the
diff --git a/tools/libxl/libxl_pqos.c b/tools/libxl/libxl_pqos.c
new file mode 100644
index 0000000..f831007
--- /dev/null
+++ b/tools/libxl/libxl_pqos.c
@@ -0,0 +1,163 @@
+/*
+ * Copyright (C) 2013      Intel Corporation
+ * Author Jiongxi Li <jiongxi.li@intel.com>
+ * Author Dongxiao Xu <dongxiao.xu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+#include "libxl_internal.h"
+
+static const char * const msg[] = {
+    [EINVAL] = "invalid QoS resource type! Supported types: \"cqm\"",
+    [ENODEV] = "CQM is not supported in this system.",
+    [EEXIST] = "CQM is already attached to this domain.",
+    [ENOENT] = "CQM is not attached to this domain.",
+    [EUSERS] = "there is no free CQM RMID available.",
+    [ESRCH]  = "is this Domain ID valid?",
+};
+
+static void libxl_pqos_err_msg(libxl_ctx *ctx, int err)
+{
+    GC_INIT(ctx);
+
+    switch (err) {
+    case EINVAL:
+    case ENODEV:
+    case EEXIST:
+    case EUSERS:
+    case ESRCH:
+    case ENOENT:
+        LOGE(ERROR, "%s", msg[err]);
+        break;
+    default:
+        LOGE(ERROR, "errno: %d", err);
+    }
+
+    GC_FREE;
+}
+
+static int libxl_pqos_type2flags(const char * qos_type, uint64_t *flags)
+{
+    int rc = 0;
+
+    if (!strcmp(qos_type, "cqm"))
+        *flags |= XEN_DOMCTL_pqos_cqm;
+    else
+        rc = -1;
+
+    return rc;
+}
+
+int libxl_pqos_attach(libxl_ctx *ctx, uint32_t domid, const char * qos_type)
+{
+    int rc;
+    uint64_t flags = 0;
+
+    rc = libxl_pqos_type2flags(qos_type, &flags);
+    if (rc < 0) {
+        libxl_pqos_err_msg(ctx, EINVAL);
+        return ERROR_FAIL;
+    }
+
+    rc = xc_domain_pqos_attach(ctx->xch, domid, flags);
+    if (rc < 0) {
+        libxl_pqos_err_msg(ctx, errno);
+        return ERROR_FAIL;
+    }
+
+    return 0;
+}
+
+int libxl_pqos_detach(libxl_ctx *ctx, uint32_t domid, const char * qos_type)
+{
+    int rc;
+    uint64_t flags = 0;
+
+    rc = libxl_pqos_type2flags(qos_type, &flags);
+    if (rc < 0) {
+        libxl_pqos_err_msg(ctx, EINVAL);
+        return ERROR_FAIL;
+    }
+
+    rc = xc_domain_pqos_detach(ctx->xch, domid, flags);
+    if (rc < 0) {
+        libxl_pqos_err_msg(ctx, errno);
+        return ERROR_FAIL;
+    }
+
+    return 0;
+}
+
+libxl_cqminfo * libxl_get_cqm_info(libxl_ctx *ctx,
+                                   uint32_t first_domain,
+                                   unsigned int num_domains)
+{
+    libxl_cqminfo *xlinfo;
+    int ret;
+    sysctl_cqminfo_t xcinfo;
+    uint32_t num_rmids = 256;
+    uint32_t num_sockets = 256;
+    GC_INIT(ctx);
+
+    xlinfo = malloc(sizeof(libxl_cqminfo));
+    if (!xlinfo) {
+        LOGE(ERROR, "allocating cqm xlinfo");
+        return NULL;
+    }
+
+    xlinfo->cqmdata = calloc(num_rmids * num_sockets,
+                             sizeof(libxl_cqmdata));
+    if (!xlinfo->cqmdata) {
+        LOGE(ERROR, "allocating cqm xlinfo->cqmdata");
+        free(xlinfo);
+        return NULL;
+    }
+
+    xcinfo.num_rmids = num_rmids;
+    xcinfo.num_sockets = num_sockets;
+    xcinfo.cqmdata = calloc(num_rmids * num_sockets,
+                            sizeof(xen_socket_cqmdata_t));
+    if (!xcinfo.cqmdata) {
+        LOGE(ERROR, "allocating cqm xcinfo->cqmdata");
+        free(xlinfo->cqmdata);
+        free(xlinfo);
+        return NULL;
+    }
+
+    ret = xc_domain_getcqminfolist(ctx->xch, &xcinfo);
+    if (ret < 0) {
+        LOGE(ERROR, "getting domain cqm info list");
+        free(xcinfo.cqmdata);
+        free(xlinfo->cqmdata);
+        free(xlinfo);
+        return NULL;
+    }
+
+    memcpy(xlinfo->cqmdata, xcinfo.cqmdata,
+           sizeof(libxl_cqmdata) * num_rmids * num_sockets);
+    xlinfo->num_rmids = xcinfo.num_rmids;
+    xlinfo->num_sockets = xcinfo.num_sockets;
+
+    free(xcinfo.cqmdata);
+
+    GC_FREE;
+    return xlinfo;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index de5bac3..9ecbc2b 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -593,3 +593,16 @@ libxl_event = Struct("event",[
                                  ])),
            ("domain_create_console_available", Struct(None, [])),
            ]))])
+
+libxl_cqmdata = Struct("cqmdata",[
+    ("l3c_occupancy",   uint64),
+    ("socket",          uint32),
+    ("domid",           uint16),
+    ("valid",           uint8),
+    ])
+
+libxl_cqminfo = Struct("cqminfo", [
+    ("num_rmids",    uint32),
+    ("num_sockets",  uint32),
+    ("cqmdata",      Array(libxl_cqmdata, "num_items"))
+    ])
diff --git a/tools/libxl/xl.h b/tools/libxl/xl.h
index e005c39..994d3be 100644
--- a/tools/libxl/xl.h
+++ b/tools/libxl/xl.h
@@ -105,6 +105,9 @@ int main_getenforce(int argc, char **argv);
 int main_setenforce(int argc, char **argv);
 int main_loadpolicy(int argc, char **argv);
 int main_remus(int argc, char **argv);
+int main_pqosattach(int argc, char **argv);
+int main_pqosdetach(int argc, char **argv);
+int main_pqoslist(int argc, char **argv);
 
 void help(const char *command);
 
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 8690ec7..19bcf25 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -7193,6 +7193,139 @@ int main_remus(int argc, char **argv)
     return -ERROR_FAIL;
 }
 
+int main_pqosattach(int argc, char **argv)
+{
+    uint32_t domid;
+    int opt, rc;
+    const char *qos_type = NULL;
+
+    SWITCH_FOREACH_OPT(opt, "", NULL, "pqos-attach", 2) {
+        /* No options */
+    }
+
+    qos_type = argv[optind];
+    domid = find_domain(argv[optind + 1]);
+
+    rc = libxl_pqos_attach(ctx, domid, qos_type);
+
+    return rc;
+}
+
+int main_pqosdetach(int argc, char **argv)
+{
+    uint32_t domid;
+    int opt, rc;
+    const char *qos_type = NULL;
+
+    SWITCH_FOREACH_OPT(opt, "", NULL, "pqos-detach", 2) {
+        /* No options */
+    }
+
+    qos_type = argv[optind];
+    domid = find_domain(argv[optind + 1]);
+
+    rc = libxl_pqos_detach(ctx, domid, qos_type);
+
+    return rc;
+}
+
+static void print_cqm_info(const libxl_cqminfo *info, uint32_t first_domain,
+                           unsigned int num_domains)
+{
+    unsigned long i, j, k;
+    libxl_cqmdata *cqmdata;
+    char *domname;
+    int print_header;
+    int print_newdom;
+    int cqm_domains = 0;
+
+    if (info->num_rmids == 0)
+        printf("System doesn't support CQM.\n");
+    else {
+        print_header = 1;
+        for (i = first_domain; i < (first_domain + num_domains); i++) {
+            print_newdom = 1;
+            for (j = 0; j < (info->num_rmids * info->num_sockets); j++) {
+                cqmdata = info->cqmdata + j;
+                if (!cqmdata->valid || cqmdata->domid != i)
+                    continue;
+
+                if (print_header) {
+                    printf("Name                                        ID");
+                    for (k = 0; k < info->num_sockets; k++)
+                        printf("\tSocketID\tL3C_Usage");
+                    print_header = 0;
+                }
+
+                if (print_newdom) {
+                    domname = libxl_domid_to_name(ctx, cqmdata->domid);
+                    printf("\n%-40s %5d", domname, cqmdata->domid);
+                    free(domname);
+                    print_newdom = 0;
+                    cqm_domains++;
+                }
+                printf("%10u %16lu     ", cqmdata->socket, cqmdata->l3c_occupancy);
+            }
+        }
+        if (!cqm_domains)
+            printf("No RMID is assigned to domains.\n");
+        else
+            printf("\n");
+        printf("\nRMID count %5d\tRMID available %5d\n",
+               info->num_rmids, info->num_rmids - cqm_domains - 1);
+    }
+}
+
+int main_pqoslist(int argc, char **argv)
+{
+    int opt;
+    const char *qos_type = NULL;
+
+    SWITCH_FOREACH_OPT(opt, "", NULL, "pqos-list", 1) {
+        /* No options */
+    }
+
+    qos_type = argv[optind];
+
+    if (!strcmp(qos_type, "cqm")) {
+        uint32_t first_domain;
+        unsigned int num_domains;
+        libxl_cqminfo *info;
+
+        if (optind + 1 >= argc) {
+            first_domain = 0;
+            num_domains = 1024;
+        } else if (optind + 1 == argc - 1) {
+            first_domain = find_domain(argv[optind + 1]);
+            num_domains = 1;
+            if (!libxl_domid_to_name(ctx, first_domain)) {
+                fprintf(stderr, "Invalid domain id: %d.\n", first_domain);
+                return 1;
+            }
+        } else {
+            help("pqos-list");
+            return 2;
+        }
+
+        info = libxl_get_cqm_info(ctx, first_domain, num_domains);
+        if (!info) {
+            fprintf(stderr, "Failed to get domain CQM info.\n");
+            return 1;
+        }
+
+        print_cqm_info(info, first_domain, num_domains);
+
+        free(info->cqmdata);
+        free(info);
+    } else {
+        fprintf(stderr, "QoS resource type supported is: cqm.\n");
+        help("pqos-list");
+        return 2;
+    }
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c
index 326a660..6ced416 100644
--- a/tools/libxl/xl_cmdtable.c
+++ b/tools/libxl/xl_cmdtable.c
@@ -488,6 +488,21 @@ struct cmd_spec cmd_table[] = {
       "                        of the domain."
 
     },
+    { "pqos-attach",
+      &main_pqosattach, 0, 1,
+      "Allocate and map qos resource",
+      "<Resource> <Domain>",
+    },
+    { "pqos-detach",
+      &main_pqosdetach, 0, 1,
      "Relinquish qos resource",
+      "<Resource> <Domain>",
+    },
+    { "pqos-list",
+      &main_pqoslist, 0, 0,
+      "List qos information about all/some domains",
+      "<Resource> [Domain]",
+    },
 };
 
 int cmdtable_len = sizeof(cmd_table)/sizeof(struct cmd_spec);
-- 
1.7.9.5


* Re: [PATCH v6 1/7] x86: detect and initialize Cache QoS Monitoring feature
  2013-12-05  9:38 ` [PATCH v6 1/7] x86: detect and initialize Cache QoS Monitoring feature Dongxiao Xu
@ 2013-12-05 10:54   ` Andrew Cooper
  2013-12-06  8:29     ` Jan Beulich
  2014-01-20 13:00   ` Jan Beulich
  1 sibling, 1 reply; 33+ messages in thread
From: Andrew Cooper @ 2013-12-05 10:54 UTC (permalink / raw)
  To: Dongxiao Xu
  Cc: keir, Ian.Campbell, stefano.stabellini, dario.faggioli,
	Ian.Jackson, xen-devel, JBeulich, dgdegra

On 05/12/13 09:38, Dongxiao Xu wrote:
> Detect platform QoS feature status and enumerate the resource types,
> one of which is to monitor the L3 cache occupancy.
>
> Also introduce a Xen grub command line parameter to control the
> QoS feature status.
>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> ---
>  docs/misc/xen-command-line.markdown |    7 ++
>  xen/arch/x86/Makefile               |    1 +
>  xen/arch/x86/cpu/intel.c            |    6 ++
>  xen/arch/x86/pqos.c                 |  130 +++++++++++++++++++++++++++++++++++
>  xen/arch/x86/setup.c                |    3 +
>  xen/include/asm-x86/cpufeature.h    |    1 +
>  xen/include/asm-x86/pqos.h          |   38 ++++++++++
>  7 files changed, 186 insertions(+)
>  create mode 100644 xen/arch/x86/pqos.c
>  create mode 100644 xen/include/asm-x86/pqos.h
>
> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> index 15aa404..7751ffe 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -770,6 +770,13 @@ This option can be specified more than once (up to 8 times at present).
>  ### ple\_window
>  > `= <integer>`
>  
> +### pqos (Intel)
> +> `= List of ( <boolean> | cqm:<boolean> | cqm_max_rmid:<integer> )`
> +
> +> Default: `pqos=1,cqm:1,cqm_max_rmid:255`
> +
> +Configure platform QoS services.
> +
>  ### reboot
>  > `= t[riple] | k[bd] | n[o] [, [w]arm | [c]old]`
>  
> diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
> index d502bdf..54962e0 100644
> --- a/xen/arch/x86/Makefile
> +++ b/xen/arch/x86/Makefile
> @@ -58,6 +58,7 @@ obj-y += crash.o
>  obj-y += tboot.o
>  obj-y += hpet.o
>  obj-y += xstate.o
> +obj-y += pqos.o
>  
>  obj-$(crash_debug) += gdbstub.o
>  
> diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
> index 27fe762..f0d83ea 100644
> --- a/xen/arch/x86/cpu/intel.c
> +++ b/xen/arch/x86/cpu/intel.c
> @@ -230,6 +230,12 @@ static void __devinit init_intel(struct cpuinfo_x86 *c)
>  	     ( c->cpuid_level >= 0x00000006 ) &&
>  	     ( cpuid_eax(0x00000006) & (1u<<2) ) )
>  		set_bit(X86_FEATURE_ARAT, c->x86_capability);
> +
> +	/* Check platform QoS monitoring capability */
> +	if ((c->cpuid_level >= 0x00000007) &&
> +	    (cpuid_ebx(0x00000007) & (1u<<12)))
> +		set_bit(X86_FEATURE_QOSM, c->x86_capability);
> +
>  }
>  
>  static struct cpu_dev intel_cpu_dev __cpuinitdata = {
> diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c
> new file mode 100644
> index 0000000..d78048a
> --- /dev/null
> +++ b/xen/arch/x86/pqos.c
> @@ -0,0 +1,130 @@
> +/*
> + * pqos.c: Platform QoS related service for guest.
> + *
> + * Copyright (c) 2013, Intel Corporation
> + * Author: Jiongxi Li  <jiongxi.li@intel.com>
> + * Author: Dongxiao Xu <dongxiao.xu@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + */
> +#include <asm/processor.h>
> +#include <xen/init.h>
> +#include <asm/pqos.h>
> +
> +static bool_t __initdata opt_pqos = 1;
> +static bool_t __initdata opt_cqm = 1;
> +static unsigned int __initdata opt_cqm_max_rmid = 255;
> +
> +static void __init parse_pqos_param(char *s)
> +{
> +    char *ss, *val_str;
> +    int val;
> +
> +    do {
> +        ss = strchr(s, ',');
> +        if ( ss )
> +            *ss = '\0';
> +
> +        val = parse_bool(s);
> +        if ( val >= 0 )
> +            opt_pqos = val;
> +        else
> +        {
> +            val = !!strncmp(s, "no-", 3);
> +            if ( !val )
> +                s += 3;
> +
> +            val_str = strchr(s, ':');
> +            if ( val_str )
> +                *val_str++ = '\0';
> +
> +            if ( val_str && !strcmp(s, "cqm") &&
> +                 (val = parse_bool(val_str)) >= 0 )
> +                opt_cqm = val;
> +            else if ( val_str && !strcmp(s, "cqm_max_rmid") )
> +                opt_cqm_max_rmid = simple_strtoul(val_str, NULL, 0);
> +        }
> +
> +        s = ss + 1;
> +    } while ( ss );
> +}
> +
> +custom_param("pqos", parse_pqos_param);
> +
> +struct pqos_cqm *cqm;

This is not static, so does not live in the BSS.  It needs initialising
to NULL, so system_supports_cqm() from the following patch doesn't
erroneously report true before init_cqm() is called.

~Andrew

> +
> +static void __init init_cqm(void)
> +{
> +    unsigned int rmid;
> +    unsigned int eax, edx;
> +
> +    if ( !opt_cqm_max_rmid )
> +        return;
> +
> +    cqm = xzalloc(struct pqos_cqm);
> +    if ( !cqm )
> +        return;
> +
> +    cpuid_count(0xf, 1, &eax, &cqm->upscaling_factor, &cqm->max_rmid, &edx);
> +    if ( !(edx & QOS_MONITOR_EVTID_L3) )
> +    {
> +        xfree(cqm);
> +        return;
> +    }
> +
> +    cqm->min_rmid = 1;
> +    cqm->max_rmid = min(opt_cqm_max_rmid, cqm->max_rmid);
> +
> +    cqm->rmid_to_dom = xmalloc_array(domid_t, cqm->max_rmid + 1);
> +    if ( !cqm->rmid_to_dom )
> +    {
> +        xfree(cqm);
> +        return;
> +    }
> +
> +    /* Reserve RMID 0 for all domains not being monitored */
> +    cqm->rmid_to_dom[0] = DOMID_XEN;
> +    for ( rmid = cqm->min_rmid; rmid <= cqm->max_rmid; rmid++ )
> +        cqm->rmid_to_dom[rmid] = DOMID_INVALID;
> +
> +    printk(XENLOG_INFO "Cache QoS Monitoring Enabled.\n");
> +}
> +
> +static void __init init_qos_monitor(void)
> +{
> +    unsigned int qm_features;
> +    unsigned int eax, ebx, ecx;
> +
> +    if ( !(boot_cpu_has(X86_FEATURE_QOSM)) )
> +        return;
> +
> +    cpuid_count(0xf, 0, &eax, &ebx, &ecx, &qm_features);
> +
> +    if ( opt_cqm && (qm_features & QOS_MONITOR_TYPE_L3) )
> +        init_cqm();
> +}
> +
> +void __init init_platform_qos(void)
> +{
> +    if ( !opt_pqos )
> +        return;
> +
> +    init_qos_monitor();
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index 5bf4ee0..95418e4 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -48,6 +48,7 @@
>  #include <asm/setup.h>
>  #include <xen/cpu.h>
>  #include <asm/nmi.h>
> +#include <asm/pqos.h>
>  
>  /* opt_nosmp: If true, secondary processors are ignored. */
>  static bool_t __initdata opt_nosmp;
> @@ -1402,6 +1403,8 @@ void __init __start_xen(unsigned long mbi_p)
>  
>      domain_unpause_by_systemcontroller(dom0);
>  
> +    init_platform_qos();
> +
>      reset_stack_and_jump(init_done);
>  }
>  
> diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
> index 1cfaf94..ca59668 100644
> --- a/xen/include/asm-x86/cpufeature.h
> +++ b/xen/include/asm-x86/cpufeature.h
> @@ -147,6 +147,7 @@
>  #define X86_FEATURE_ERMS	(7*32+ 9) /* Enhanced REP MOVSB/STOSB */
>  #define X86_FEATURE_INVPCID	(7*32+10) /* Invalidate Process Context ID */
>  #define X86_FEATURE_RTM 	(7*32+11) /* Restricted Transactional Memory */
> +#define X86_FEATURE_QOSM	(7*32+12) /* Platform QoS monitoring capability */
>  #define X86_FEATURE_NO_FPU_SEL 	(7*32+13) /* FPU CS/DS stored as zero */
>  #define X86_FEATURE_SMAP	(7*32+20) /* Supervisor Mode Access Prevention */
>  
> diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h
> new file mode 100644
> index 0000000..da925db
> --- /dev/null
> +++ b/xen/include/asm-x86/pqos.h
> @@ -0,0 +1,38 @@
> +/*
> + * pqos.h: Platform QoS related service for guest.
> + *
> + * Copyright (c) 2013, Intel Corporation
> + * Author: Jiongxi Li  <jiongxi.li@intel.com>
> + * Author: Dongxiao Xu <dongxiao.xu@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + */
> +#ifndef ASM_PQOS_H
> +#define ASM_PQOS_H
> +
> +#include <public/xen.h>
> +
> +/* QoS Resource Type Enumeration */
> +#define QOS_MONITOR_TYPE_L3            0x2
> +
> +/* QoS Monitoring Event ID */
> +#define QOS_MONITOR_EVTID_L3           0x1
> +
> +struct pqos_cqm {
> +    unsigned int min_rmid;
> +    unsigned int max_rmid;
> +    unsigned int upscaling_factor;
> +    domid_t *rmid_to_dom;
> +};
> +extern struct pqos_cqm *cqm;
> +
> +void init_platform_qos(void);
> +
> +#endif


* Re: [PATCH v6 1/7] x86: detect and initialize Cache QoS Monitoring feature
  2013-12-05 10:54   ` Andrew Cooper
@ 2013-12-06  8:29     ` Jan Beulich
  0 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2013-12-06  8:29 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: keir, Ian.Campbell, stefano.stabellini, dario.faggioli,
	Ian.Jackson, xen-devel, Dongxiao Xu, dgdegra

>>> On 05.12.13 at 11:54, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 05/12/13 09:38, Dongxiao Xu wrote:
>> +struct pqos_cqm *cqm;
> 
> This is not static, so does not live in the BSS.  It needs initialising
> to NULL, so system_supports_cqm() from the following patch doesn't
> erroneously report true before init_cqm() is called.

Since when is such data not in .bss, even more so with -fno-common
in place?

Jan


* Re: [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature
  2013-12-05  9:38 [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature Dongxiao Xu
                   ` (6 preceding siblings ...)
  2013-12-05  9:38 ` [PATCH v6 7/7] tools: enable Cache QoS Monitoring feature for libxl/libxc Dongxiao Xu
@ 2013-12-07  5:09 ` Xu, Dongxiao
  2013-12-09  9:34   ` Jan Beulich
  7 siblings, 1 reply; 33+ messages in thread
From: Xu, Dongxiao @ 2013-12-07  5:09 UTC (permalink / raw)
  To: keir@xen.org, Ian.Campbell@citrix.com,
	stefano.stabellini@eu.citrix.com, andrew.cooper3@citrix.com,
	dario.faggioli@citrix.com, Ian.Jackson@eu.citrix.com,
	JBeulich@suse.com, dgdegra@tycho.nsa.gov
  Cc: xen-devel@lists.xen.org

> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org
> [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Dongxiao Xu
> Sent: Thursday, December 05, 2013 5:39 PM
> To: xen-devel@lists.xen.org
> Cc: keir@xen.org; Ian.Campbell@citrix.com; stefano.stabellini@eu.citrix.com;
> andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> Ian.Jackson@eu.citrix.com; JBeulich@suse.com; dgdegra@tycho.nsa.gov
> Subject: [Xen-devel] [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature
> 

Any more comments about this version?

Thanks,
Dongxiao

> Changes from v5:
>  - Address comments from Dario Faggioli, including:
>    * Define a new libxl_cqminfo structure to avoid reference of xc
>      structure in libxl functions.
>    * Use LOGE() instead of the LIBXL__LOG() functions.
> 
> Changes from v4:
>  - When comparing xl cqm parameter, use strcmp instead of strncmp,
>    otherwise, "xl pqos-attach cqmabcd domid" will be considered as
>    a valid command line.
>  - Address comments from Andrew Cooper, including:
>    * Adjust the pqos parameter parsing function.
>    * Modify the pqos related documentation.
>    * Add a check for opt_cqm_max_rmid in initialization code.
>    * Do not IPI CPU that is in same socket with current CPU.
>  - Address comments from Dario Faggioli, including:
>    * Fix a typo in export symbols.
>    * Return correct libxl error code for qos related functions.
>    * Abstract the error printing logic into a function.
>  - Address comment from Daniel De Graaf, including:
>    * Add return value for pqos related checks.
>  - Address comments from Konrad Rzeszutek Wilk, including:
>    * Modify the GPLv2 related file header, remove the address.
> 
> Changes from v3:
>  - Use structure to better organize CQM related global variables.
>  - Address comments from Andrew Cooper, including:
>    * Remove the domain creation flag for CQM RMID allocation.
>    * Adjust the boot parameter format, use custom_param().
>    * Add documentation for the new added boot parameter.
>    * Change QoS type flag to be uint64_t.
>    * Initialize the per socket cpu bitmap in system boot time.
>    * Remove get_cqm_avail() function.
>    * Misc of format changes.
>  - Address comment from Daniel De Graaf, including:
>    * Use avc_current_has_perm() for XEN2__PQOS_OP that belongs to
> SECCLASS_XEN2.
> 
> Changes from v2:
>  - Address comments from Andrew Cooper, including:
>    * Merging tools stack changes into one patch.
>    * Reduce the IPI number to one per socket.
>    * Change structures for CQM data exchange between tools and Xen.
>    * Misc of format/variable/function name changes.
>  - Address comments from Konrad Rzeszutek Wilk, including:
>    * Simplify the error printing logic.
>    * Add xsm check for the new added hypercalls.
> 
> Changes from v1:
>  - Address comments from Andrew Cooper, including:
>    * Change function names, e.g., alloc_cqm_rmid(), system_supports_cqm(),
> etc.
>    * Change some structure element order to save packing cost.
>    * Correct some function's return value.
>    * Some programming styles change.
>    * ...
> 
> Future generations of Intel Xeon processors may offer a monitoring
> capability in each logical processor to measure a specific
> quality-of-service metric, for example Cache QoS Monitoring to get the
> L3 cache occupancy. For detailed information please refer to Intel SDM
> chapter 17.14.
> 
> Cache QoS Monitoring provides a layer of abstraction between applications and
> logical processors through the use of Resource Monitoring IDs (RMIDs).
> In the Xen design, each guest in the system can be assigned an RMID
> independently, while RMID=0 is reserved for domains that don't enable
> the CQM service.
> When any of the domain's vcpus is scheduled on a logical processor, the
> domain's RMID is activated by programming the value into one specific
> MSR, and when the vcpu is scheduled out, RMID=0 is programmed into that
> MSR. The Cache QoS hardware tracks cache utilization of memory accesses
> according to the RMIDs and reports the monitored data via a counter
> register. With this solution, we can learn how much L3 cache is used by
> a certain guest.
> 
> To attach the CQM service to a certain guest, two approaches are provided:
> 1) Create the guest with "pqos_cqm=1" set in the configuration file.
> 2) Use "xl pqos-attach cqm domid" for a running guest.
> 
> To detach the CQM service from a guest, users can:
> 1) Use "xl pqos-detach cqm domid" for a running guest.
> 2) Destroy the guest, which also detaches the CQM service.
> 
> To get the L3 cache usage, users can use the command of:
> $ xl pqos-list cqm (domid)
> 
> The data below is just an example showing how the CQM-related data is
> exposed to the end user.
> 
> [root@localhost]# xl pqos-list cqm
> Name                ID  SocketID  L3C_Usage  SocketID  L3C_Usage
> Domain-0             0         0   20127744         1   25231360
> ExampleHVMDomain     1         0    3211264         1   10551296
> 
> RMID count    56        RMID available    53
> 
> Dongxiao Xu (7):
>   x86: detect and initialize Cache QoS Monitoring feature
>   x86: dynamically attach/detach CQM service for a guest
>   x86: initialize per socket cpu map
>   x86: collect CQM information from all sockets
>   x86: enable CQM monitoring for each domain RMID
>   xsm: add platform QoS related xsm policies
>   tools: enable Cache QoS Monitoring feature for libxl/libxc
> 
>  docs/misc/xen-command-line.markdown          |    7 +
>  tools/flask/policy/policy/modules/xen/xen.if |    2 +-
>  tools/flask/policy/policy/modules/xen/xen.te |    5 +-
>  tools/libxc/xc_domain.c                      |   47 +++++
>  tools/libxc/xenctrl.h                        |   11 ++
>  tools/libxl/Makefile                         |    3 +-
>  tools/libxl/libxl.h                          |    6 +
>  tools/libxl/libxl_pqos.c                     |  163 +++++++++++++++++
>  tools/libxl/libxl_types.idl                  |   13 ++
>  tools/libxl/xl.h                             |    3 +
>  tools/libxl/xl_cmdimpl.c                     |  133 ++++++++++++++
>  tools/libxl/xl_cmdtable.c                    |   15 ++
>  xen/arch/x86/Makefile                        |    1 +
>  xen/arch/x86/cpu/intel.c                     |    6 +
>  xen/arch/x86/domain.c                        |    8 +
>  xen/arch/x86/domctl.c                        |   40 +++++
>  xen/arch/x86/pqos.c                          |  242 ++++++++++++++
>  xen/arch/x86/setup.c                         |    3 +
>  xen/arch/x86/smp.c                           |    7 +-
>  xen/arch/x86/smpboot.c                       |   19 +-
>  xen/arch/x86/sysctl.c                        |   65 +++++++
>  xen/include/asm-x86/cpufeature.h             |    1 +
>  xen/include/asm-x86/domain.h                 |    2 +
>  xen/include/asm-x86/msr-index.h              |    5 +
>  xen/include/asm-x86/pqos.h                   |   51 ++++++
>  xen/include/asm-x86/smp.h                    |    2 +
>  xen/include/public/domctl.h                  |   20 +++
>  xen/include/public/sysctl.h                  |   10 ++
>  xen/include/xen/cpumask.h                    |    1 +
>  xen/xsm/flask/hooks.c                        |    8 +
>  xen/xsm/flask/policy/access_vectors          |   17 +-
>  31 files changed, 907 insertions(+), 9 deletions(-)
>  create mode 100644 tools/libxl/libxl_pqos.c
>  create mode 100644 xen/arch/x86/pqos.c
>  create mode 100644 xen/include/asm-x86/pqos.h
> 
> --
> 1.7.9.5
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


* Re: [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature
  2013-12-07  5:09 ` [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature Xu, Dongxiao
@ 2013-12-09  9:34   ` Jan Beulich
  0 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2013-12-09  9:34 UTC (permalink / raw)
  To: Dongxiao Xu
  Cc: keir@xen.org, Ian.Campbell@citrix.com,
	stefano.stabellini@eu.citrix.com, andrew.cooper3@citrix.com,
	dario.faggioli@citrix.com, Ian.Jackson@eu.citrix.com,
	xen-devel@lists.xen.org, dgdegra@tycho.nsa.gov

>>> On 07.12.13 at 06:09, "Xu, Dongxiao" <dongxiao.xu@intel.com> wrote:
>>  -----Original Message-----
>> From: xen-devel-bounces@lists.xen.org 
>> [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Dongxiao Xu
>> Sent: Thursday, December 05, 2013 5:39 PM
>> To: xen-devel@lists.xen.org 
>> Cc: keir@xen.org; Ian.Campbell@citrix.com; stefano.stabellini@eu.citrix.com;
>> andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> Ian.Jackson@eu.citrix.com; JBeulich@suse.com; dgdegra@tycho.nsa.gov 
>> Subject: [Xen-devel] [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature
>> 
> 
> Any more comments about this version?

Please allow some more time - the series isn't intended for 4.4,
yet our focus ought to be on 4.4 for the moment.

Jan


* Re: [PATCH v6 1/7] x86: detect and initialize Cache QoS Monitoring feature
  2013-12-05  9:38 ` [PATCH v6 1/7] x86: detect and initialize Cache QoS Monitoring feature Dongxiao Xu
  2013-12-05 10:54   ` Andrew Cooper
@ 2014-01-20 13:00   ` Jan Beulich
  2014-01-28 14:01     ` Xu, Dongxiao
  1 sibling, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2014-01-20 13:00 UTC (permalink / raw)
  To: Dongxiao Xu
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, dgdegra

>>> On 05.12.13 at 10:38, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
> --- a/xen/arch/x86/cpu/intel.c
> +++ b/xen/arch/x86/cpu/intel.c
> @@ -230,6 +230,12 @@ static void __devinit init_intel(struct cpuinfo_x86 *c)
>  	     ( c->cpuid_level >= 0x00000006 ) &&
>  	     ( cpuid_eax(0x00000006) & (1u<<2) ) )
>  		set_bit(X86_FEATURE_ARAT, c->x86_capability);
> +
> +	/* Check platform QoS monitoring capability */
> +	if ((c->cpuid_level >= 0x00000007) &&
> +	    (cpuid_ebx(0x00000007) & (1u<<12)))
> +		set_bit(X86_FEATURE_QOSM, c->x86_capability);
> +

This is redundant with generic_identify() setting the respective
c->x86_capability[] element.

> +struct pqos_cqm *cqm;

__read_mostly?

> +
> +static void __init init_cqm(void)
> +{
> +    unsigned int rmid;
> +    unsigned int eax, edx;
> +
> +    if ( !opt_cqm_max_rmid )
> +        return;
> +
> +    cqm = xzalloc(struct pqos_cqm);
> +    if ( !cqm )
> +        return;
> +
> +    cpuid_count(0xf, 1, &eax, &cqm->upscaling_factor, &cqm->max_rmid, &edx);
> +    if ( !(edx & QOS_MONITOR_EVTID_L3) )
> +    {
> +        xfree(cqm);

"cqm" is a global variable and - afaict - the only way for other
entities to tell whether the functionality is enabled. Hence shouldn't
you clear the variable here (and similarly further down)? Otherwise
the variable should perhaps be static.

Jan


* Re: [PATCH v6 2/7] x86: dynamically attach/detach CQM service for a guest
  2013-12-05  9:38 ` [PATCH v6 2/7] x86: dynamically attach/detach CQM service for a guest Dongxiao Xu
@ 2014-01-20 13:13   ` Jan Beulich
  2014-01-20 13:17     ` Andrew Cooper
  2014-01-28 14:09     ` Xu, Dongxiao
  0 siblings, 2 replies; 33+ messages in thread
From: Jan Beulich @ 2014-01-20 13:13 UTC (permalink / raw)
  To: Dongxiao Xu
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, dgdegra

>>> On 05.12.13 at 10:38, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
> @@ -1223,6 +1224,45 @@ long arch_do_domctl(
>      }
>      break;
>  
> +    case XEN_DOMCTL_attach_pqos:
> +    {
> +        if ( domctl->u.qos_type.flags & XEN_DOMCTL_pqos_cqm )
> +        {
> +            if ( !system_supports_cqm() )
> +                ret = -ENODEV;
> +            else if ( d->arch.pqos_cqm_rmid > 0 )
> +                ret = -EEXIST;
> +            else
> +            {
> +                ret = alloc_cqm_rmid(d);
> +                if ( ret < 0 )
> +                    ret = -EUSERS;

Why don't you have the function return a sensible error code
(which presumably might also end up being other than -EUSERS,
e.g. -ENOMEM).

> +            }
> +        }
> +        else
> +            ret = -EINVAL;
> +    }
> +    break;
> +
> +    case XEN_DOMCTL_detach_pqos:
> +    {
> +        if ( domctl->u.qos_type.flags & XEN_DOMCTL_pqos_cqm )
> +        {
> +            if ( !system_supports_cqm() )
> +                ret = -ENODEV;
> +            else if ( d->arch.pqos_cqm_rmid > 0 )
> +            {
> +                free_cqm_rmid(d);
> +                ret = 0;
> +            }
> +            else
> +                ret = -ENOENT;
> +        }
> +        else
> +            ret = -EINVAL;
> +    }
> +    break;

For consistency, both of the above would better be changed to a
single series of if()/else if().../else.

> +bool_t system_supports_cqm(void)
> +{
> +    return !!cqm;

So here we go (wrt the remark on patch 1).

> +}
> +
> +int alloc_cqm_rmid(struct domain *d)
> +{
> +    int rc = 0;
> +    unsigned int rmid;
> +    unsigned long flags;
> +
> +    ASSERT(system_supports_cqm());
> +
> +    spin_lock_irqsave(&cqm_lock, flags);

Why not just spin_lock()? Briefly scanning over the following patches
doesn't point out anything that might require this to be an IRQ-safe
lock.

> +    for ( rmid = cqm->min_rmid; rmid <= cqm->max_rmid; rmid++ )
> +    {
> +        if ( cqm->rmid_to_dom[rmid] != DOMID_INVALID)
> +            continue;
> +
> +        cqm->rmid_to_dom[rmid] = d->domain_id;
> +        break;
> +    }
> +    spin_unlock_irqrestore(&cqm_lock, flags);
> +
> +    /* No CQM RMID available, assign RMID=0 by default */
> +    if ( rmid > cqm->max_rmid )
> +    {
> +        rmid = 0;
> +        rc = -1;
> +    }
> +
> +    d->arch.pqos_cqm_rmid = rmid;

Is it really safe to do this and the freeing below outside of the
lock?

Jan


* Re: [PATCH v6 2/7] x86: dynamically attach/detach CQM service for a guest
  2014-01-20 13:13   ` Jan Beulich
@ 2014-01-20 13:17     ` Andrew Cooper
  2014-01-20 14:03       ` Jan Beulich
  2014-01-28 14:09     ` Xu, Dongxiao
  1 sibling, 1 reply; 33+ messages in thread
From: Andrew Cooper @ 2014-01-20 13:17 UTC (permalink / raw)
  To: Jan Beulich
  Cc: keir, Ian.Campbell, stefano.stabellini, dario.faggioli,
	Ian.Jackson, xen-devel, Dongxiao Xu, dgdegra

On 20/01/14 13:13, Jan Beulich wrote:
>>>> On 05.12.13 at 10:38, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
>> @@ -1223,6 +1224,45 @@ long arch_do_domctl(
>>      }
>>      break;
>>  
>> +    case XEN_DOMCTL_attach_pqos:
>> +    {
>> +        if ( domctl->u.qos_type.flags & XEN_DOMCTL_pqos_cqm )
>> +        {
>> +            if ( !system_supports_cqm() )
>> +                ret = -ENODEV;
>> +            else if ( d->arch.pqos_cqm_rmid > 0 )
>> +                ret = -EEXIST;
>> +            else
>> +            {
>> +                ret = alloc_cqm_rmid(d);
>> +                if ( ret < 0 )
>> +                    ret = -EUSERS;
> Why don't you have the function return a sensible error code
> (which presumably might also end up being other than -EUSERS,
> e.g. -ENOMEM).

-EUSERS is correct here.  A failure like this means "all the
available system RMIDs are already being used by other domains".

~Andrew

>
>> +            }
>> +        }
>> +        else
>> +            ret = -EINVAL;
>> +    }
>> +    break;
>> +
>> +    case XEN_DOMCTL_detach_pqos:
>> +    {
>> +        if ( domctl->u.qos_type.flags & XEN_DOMCTL_pqos_cqm )
>> +        {
>> +            if ( !system_supports_cqm() )
>> +                ret = -ENODEV;
>> +            else if ( d->arch.pqos_cqm_rmid > 0 )
>> +            {
>> +                free_cqm_rmid(d);
>> +                ret = 0;
>> +            }
>> +            else
>> +                ret = -ENOENT;
>> +        }
>> +        else
>> +            ret = -EINVAL;
>> +    }
>> +    break;
> For consistency, both of the above would better be changed to a
> single series of if()/else if().../else.
>
>> +bool_t system_supports_cqm(void)
>> +{
>> +    return !!cqm;
> So here we go (wrt the remark on patch 1).
>
>> +}
>> +
>> +int alloc_cqm_rmid(struct domain *d)
>> +{
>> +    int rc = 0;
>> +    unsigned int rmid;
>> +    unsigned long flags;
>> +
>> +    ASSERT(system_supports_cqm());
>> +
>> +    spin_lock_irqsave(&cqm_lock, flags);
> Why not just spin_lock()? Briefly scanning over the following patches
> doesn't point out anything that might require this to be an IRQ-safe
> lock.
>
>> +    for ( rmid = cqm->min_rmid; rmid <= cqm->max_rmid; rmid++ )
>> +    {
>> +        if ( cqm->rmid_to_dom[rmid] != DOMID_INVALID)
>> +            continue;
>> +
>> +        cqm->rmid_to_dom[rmid] = d->domain_id;
>> +        break;
>> +    }
>> +    spin_unlock_irqrestore(&cqm_lock, flags);
>> +
>> +    /* No CQM RMID available, assign RMID=0 by default */
>> +    if ( rmid > cqm->max_rmid )
>> +    {
>> +        rmid = 0;
>> +        rc = -1;
>> +    }
>> +
>> +    d->arch.pqos_cqm_rmid = rmid;
> Is it really safe to do this and the freeing below outside of the
> lock?
>
> Jan
>


* Re: [PATCH v6 3/7] x86: initialize per socket cpu map
  2013-12-05  9:38 ` [PATCH v6 3/7] x86: initialize per socket cpu map Dongxiao Xu
@ 2014-01-20 13:26   ` Jan Beulich
  2014-01-28 14:12     ` Xu, Dongxiao
  0 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2014-01-20 13:26 UTC (permalink / raw)
  To: Dongxiao Xu
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, dgdegra

>>> On 05.12.13 at 10:38, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
> For each socket in the system, we create a separate bitmap to tag its
> related CPUs. This per socket bitmap will be initialized on system
> start up, and adjusted when CPU is dynamically online/offline.

There's no reasoning here at all why cpu_sibling_mask and
cpu_core_mask aren't sufficient.

> --- a/xen/arch/x86/smpboot.c
> +++ b/xen/arch/x86/smpboot.c
> @@ -59,6 +59,9 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
>  cpumask_t cpu_online_map __read_mostly;
>  EXPORT_SYMBOL(cpu_online_map);
>  
> +cpumask_t socket_cpu_map[MAX_NUM_SOCKETS] __read_mostly;
> +EXPORT_SYMBOL(socket_cpu_map);

And _if_ we really need it, then it should be done in a better way
than via a statically sized array, the size of which can't even be
overridden on the build and/or hypervisor command line.

And there shouldn't be EXPORT_SYMBOL() in new, not directly
cloned hypervisor code either.

Jan


* Re: [PATCH v6 4/7] x86: collect CQM information from all sockets
  2013-12-05  9:38 ` [PATCH v6 4/7] x86: collect CQM information from all sockets Dongxiao Xu
@ 2014-01-20 13:52   ` Jan Beulich
  2014-01-28 14:23     ` Xu, Dongxiao
  0 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2014-01-20 13:52 UTC (permalink / raw)
  To: Dongxiao Xu
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, dgdegra

>>> On 05.12.13 at 10:38, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
> @@ -126,6 +127,12 @@ bool_t system_supports_cqm(void)
>      return !!cqm;
>  }
>  
> +unsigned int get_cqm_count(void)
> +{
> +    ASSERT(system_supports_cqm());
> +    return cqm->max_rmid + 1;
> +}
> +
>  int alloc_cqm_rmid(struct domain *d)
>  {
>      int rc = 0;
> @@ -170,6 +177,48 @@ void free_cqm_rmid(struct domain *d)
>      d->arch.pqos_cqm_rmid = 0;
>  }
>  
> +static void read_cqm_data(void *arg)
> +{
> +    uint64_t cqm_data;
> +    unsigned int rmid;
> +    int socket = cpu_to_socket(smp_processor_id());
> +    struct xen_socket_cqmdata *data = arg;
> +    unsigned long flags, i;

Either i can be "unsigned int" ...

> +
> +    ASSERT(system_supports_cqm());
> +
> +    if ( socket < 0 )
> +        return;
> +
> +    spin_lock_irqsave(&cqm_lock, flags);
> +    for ( rmid = cqm->min_rmid; rmid <= cqm->max_rmid; rmid++ )
> +    {
> +        if ( cqm->rmid_to_dom[rmid] == DOMID_INVALID )
> +            continue;
> +
> +        wrmsr(MSR_IA32_QOSEVTSEL, QOS_MONITOR_EVTID_L3, rmid);
> +        rdmsrl(MSR_IA32_QMC, cqm_data);
> +
> +        i = socket * (cqm->max_rmid + 1) + rmid;

... or this calculation needs one of the two operands of * cast
to "unsigned long".

> +        data[i].valid = !(cqm_data & IA32_QM_CTR_ERROR_MASK);
> +        if ( data[i].valid )
> +        {
> +            data[i].l3c_occupancy = cqm_data * cqm->upscaling_factor;
> +            data[i].socket = socket;
> +            data[i].domid = cqm->rmid_to_dom[rmid];
> +        }
> +    }
> +    spin_unlock_irqrestore(&cqm_lock, flags);
> +}

Also, please clarify why the locking here is necessary: You don't
seem to be modifying global data, and the only possibly mutable
thing you read is cqm->rmid_to_dom[]. A race on that one with
an addition/deletion doesn't appear to be problematic though.

> +void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct xen_socket_cqmdata *data)

const cpumask_t *

> +    case XEN_SYSCTL_getcqminfo:
> +    {
> +        struct xen_socket_cqmdata *info;
> +        uint32_t num_sockets;
> +        uint32_t num_rmids;
> +        cpumask_t cpu_cqmdata_map;

Unless absolutely avoidable, not CPU masks on the stack please.

> +
> +        if ( !system_supports_cqm() )
> +        {
> +            ret = -ENODEV;
> +            break;
> +        }
> +
> +        select_socket_cpu(&cpu_cqmdata_map);
> +
> +        num_sockets = min((unsigned int)cpumask_weight(&cpu_cqmdata_map) + 1,
> +                          sysctl->u.getcqminfo.num_sockets);
> +        num_rmids = get_cqm_count();
> +        info = xzalloc_array(struct xen_socket_cqmdata,
> +                             num_rmids * num_sockets);

While unlikely right now, you ought to consider the case of this
multiplication overflowing.

Also - how does the caller know how big the buffer needs to be?
Only num_sockets can be restricted by it...

And what's worse - you allow the caller to limit num_sockets and
allocate info based on this limited value, but you don't restrict
cpu_cqmdata_map to just the socket covered, i.e. if the caller
specified a lower number, then you'll corrupt memory.

And finally, I think the total size of the buffer here can easily
exceed a page, i.e. this then ends up being a non-order-0
allocation, which may _never_ succeed (i.e. the operation is
then rendered useless). I guess it'd be better to e.g. vmap()
the MFNs underlying the guest buffer.
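Jan's overflow concern can be illustrated with a small, self-contained sketch; alloc_cqm_buffer() and the demo sizes are hypothetical stand-ins for the patch's xzalloc_array() call, not actual Xen code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the xzalloc_array() call in the patch:
 * refuse any num_rmids * num_sockets combination whose element count
 * would overflow before allocating. */
static void *alloc_cqm_buffer(uint32_t num_rmids, uint32_t num_sockets,
                              size_t elem_size)
{
    /* Widen before multiplying: a uint32_t product could already wrap. */
    uint64_t count = (uint64_t)num_rmids * num_sockets;

    if ( count == 0 || elem_size == 0 || count > SIZE_MAX / elem_size )
        return NULL;                     /* would overflow: reject */

    return calloc((size_t)count, elem_size);
}
```

The point is that the overflow check happens on the widened product, before any allocation size is computed in the native size_t.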

> +        if ( !info )
> +        {
> +            ret = -ENOMEM;
> +            break;
> +        }
> +
> +        get_cqm_info(&cpu_cqmdata_map, info);
> +
> +        if ( copy_to_guest_offset(sysctl->u.getcqminfo.buffer,
> +                                  0, info, num_rmids * num_sockets) )

If the offset is zero anyway, why do you use copy_to_guest_offset()
rather than copy_to_guest()?

> +        {
> +            ret = -EFAULT;
> +            xfree(info);
> +            break;
> +        }
> +
> +        sysctl->u.getcqminfo.num_rmids = num_rmids;
> +        sysctl->u.getcqminfo.num_sockets = num_sockets;
> +
> +        if ( copy_to_guest(u_sysctl, sysctl, 1) )

__copy_to_guest() is sufficient here.

Jan

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v6 5/7] x86: enable CQM monitoring for each domain RMID
  2013-12-05  9:38 ` [PATCH v6 5/7] x86: enable CQM monitoring for each domain RMID Dongxiao Xu
@ 2014-01-20 13:58   ` Jan Beulich
  2014-01-28 14:24     ` Xu, Dongxiao
  0 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2014-01-20 13:58 UTC (permalink / raw)
  To: Dongxiao Xu
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, dgdegra

>>> On 05.12.13 at 10:38, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1366,6 +1366,8 @@ static void __context_switch(void)
>      {
>          memcpy(&p->arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES);
>          vcpu_save_fpu(p);
> +        if ( system_supports_cqm() )
> +            cqm_assoc_rmid(0);
>          p->arch.ctxt_switch_from(p);
>      }
>  
> @@ -1390,6 +1392,9 @@ static void __context_switch(void)
>          }
>          vcpu_restore_fpu_eager(n);
>          n->arch.ctxt_switch_to(n);
> +
> +        if ( system_supports_cqm() && n->domain->arch.pqos_cqm_rmid > 0 )
> +            cqm_assoc_rmid(n->domain->arch.pqos_cqm_rmid);
>      }
>  
>      gdt = !is_pv_32on64_vcpu(n) ? per_cpu(gdt_table, cpu) :

The two uses here clearly call for system_supports_cqm() to
be an inline function (the more that the variable checked in that
function is already global anyway).

Further, cqm_assoc_rmid() being an RDMSR plus WRMSR, you
surely will want to optimize the case of p's and n's RMIDs being
identical. Or at the very least make sure you _never_ call that
function if all domains run with RMID 0.
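The optimization Jan asks for can be sketched as follows; all names are made up for illustration, with msr_writes standing in for the cost of the real RDMSR/WRMSR pair:

```c
#include <assert.h>

/* Illustration only: avoid touching MSR_IA32_PQR_ASSOC when the
 * outgoing and incoming domains already use the same RMID.
 * msr_writes models the number of (expensive) MSR accesses. */
static unsigned int msr_writes;
static unsigned int current_rmid;

static void cqm_assoc_rmid(unsigned int rmid)
{
    current_rmid = rmid;    /* real code: rdmsrl()/wrmsrl() pair */
    msr_writes++;
}

static void switch_rmid(unsigned int prev_rmid, unsigned int next_rmid)
{
    if ( prev_rmid == next_rmid )    /* common case: nothing to do */
        return;
    cqm_assoc_rmid(next_rmid);
}
```

In particular, when all domains run with RMID 0 the comparison always matches and no MSR access happens at all, which is the degenerate case Jan highlights.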

> @@ -60,6 +60,8 @@ static void __init parse_pqos_param(char *s)
>  
>  custom_param("pqos", parse_pqos_param);
>  
> +static uint64_t rmid_mask;

__read_mostly?

> +void cqm_assoc_rmid(unsigned int rmid)
> +{
> +    uint64_t val;
> +
> +    rdmsrl(MSR_IA32_PQR_ASSOC, val);
> +    wrmsrl(MSR_IA32_PQR_ASSOC, (val & ~(rmid_mask)) | (rmid & rmid_mask));

Stray parentheses around a simple variable.

Jan


* Re: [PATCH v6 2/7] x86: dynamically attach/detach CQM service for a guest
  2014-01-20 13:17     ` Andrew Cooper
@ 2014-01-20 14:03       ` Jan Beulich
  0 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2014-01-20 14:03 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: keir, Ian.Campbell, stefano.stabellini, dario.faggioli,
	Ian.Jackson, xen-devel, Dongxiao Xu, dgdegra

>>> On 20.01.14 at 14:17, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 20/01/14 13:13, Jan Beulich wrote:
>>>>> On 05.12.13 at 10:38, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
>>> @@ -1223,6 +1224,45 @@ long arch_do_domctl(
>>>      }
>>>      break;
>>>  
>>> +    case XEN_DOMCTL_attach_pqos:
>>> +    {
>>> +        if ( domctl->u.qos_type.flags & XEN_DOMCTL_pqos_cqm )
>>> +        {
>>> +            if ( !system_supports_cqm() )
>>> +                ret = -ENODEV;
>>> +            else if ( d->arch.pqos_cqm_rmid > 0 )
>>> +                ret = -EEXIST;
>>> +            else
>>> +            {
>>> +                ret = alloc_cqm_rmid(d);
>>> +                if ( ret < 0 )
>>> +                    ret = -EUSERS;
>> Why don't you have the function return a sensible error code
>> (which presumably might also end up being other than -EUSERS,
>> e.g. -ENOMEM).
> 
> -EUSERS is correct here.  A failure like this means "all the
> available system RMIDs are already being used by other domains".

I didn't mean to say anything to the contrary with _current_ code.
As with any alloc type function, a future change may involve other
than just a RMID allocation, and hence -ENOMEM may become
possible.

Jan


* Re: [PATCH v6 1/7] x86: detect and initialize Cache QoS Monitoring feature
  2014-01-20 13:00   ` Jan Beulich
@ 2014-01-28 14:01     ` Xu, Dongxiao
  2014-01-28 14:04       ` Jan Beulich
  0 siblings, 1 reply; 33+ messages in thread
From: Xu, Dongxiao @ 2014-01-28 14:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: keir@xen.org, Ian.Campbell@citrix.com,
	stefano.stabellini@eu.citrix.com, andrew.cooper3@citrix.com,
	dario.faggioli@citrix.com, Ian.Jackson@eu.citrix.com,
	xen-devel@lists.xen.org, dgdegra@tycho.nsa.gov

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Monday, January 20, 2014 9:01 PM
> To: Xu, Dongxiao
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> Ian.Campbell@citrix.com; Ian.Jackson@eu.citrix.com;
> stefano.stabellini@eu.citrix.com; xen-devel@lists.xen.org;
> konrad.wilk@oracle.com; dgdegra@tycho.nsa.gov; keir@xen.org
> Subject: Re: [PATCH v6 1/7] x86: detect and initialize Cache QoS Monitoring
> feature
> 
> >>> On 05.12.13 at 10:38, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
> > --- a/xen/arch/x86/cpu/intel.c
> > +++ b/xen/arch/x86/cpu/intel.c
> > @@ -230,6 +230,12 @@ static void __devinit init_intel(struct cpuinfo_x86 *c)
> >  	     ( c->cpuid_level >= 0x00000006 ) &&
> >  	     ( cpuid_eax(0x00000006) & (1u<<2) ) )
> >  		set_bit(X86_FEATURE_ARAT, c->x86_capability);
> > +
> > +	/* Check platform QoS monitoring capability */
> > +	if ((c->cpuid_level >= 0x00000007) &&
> > +	    (cpuid_ebx(0x00000007) & (1u<<12)))
> > +		set_bit(X86_FEATURE_QOSM, c->x86_capability);
> > +
> 
> This is redundant with generic_identify() setting the respective
> c->x86_capability[] element.
> 
> > +struct pqos_cqm *cqm;
> 
> __read_mostly?
cqm->rmid_to_dom will be updated from time to time.

> 
> > +
> > +static void __init init_cqm(void)
> > +{
> > +    unsigned int rmid;
> > +    unsigned int eax, edx;
> > +
> > +    if ( !opt_cqm_max_rmid )
> > +        return;
> > +
> > +    cqm = xzalloc(struct pqos_cqm);
> > +    if ( !cqm )
> > +        return;
> > +
> > +    cpuid_count(0xf, 1, &eax, &cqm->upscaling_factor, &cqm->max_rmid,
> &edx);
> > +    if ( !(edx & QOS_MONITOR_EVTID_L3) )
> > +    {
> > +        xfree(cqm);
> 
> "cqm" is a global variable and - afaict - the only way for other
> entities to tell whether the functionality is enabled. Hence shouldn't
> you clear the variable here (and similarly further down)? Otherwise
> the variable should perhaps be static.

Yes, will correct it in the following patches.

> 
> Jan


* Re: [PATCH v6 1/7] x86: detect and initialize Cache QoS Monitoring feature
  2014-01-28 14:01     ` Xu, Dongxiao
@ 2014-01-28 14:04       ` Jan Beulich
  0 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2014-01-28 14:04 UTC (permalink / raw)
  To: Dongxiao Xu
  Cc: keir@xen.org, Ian.Campbell@citrix.com,
	stefano.stabellini@eu.citrix.com, andrew.cooper3@citrix.com,
	dario.faggioli@citrix.com, Ian.Jackson@eu.citrix.com,
	xen-devel@lists.xen.org, dgdegra@tycho.nsa.gov

>>> On 28.01.14 at 15:01, "Xu, Dongxiao" <dongxiao.xu@intel.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Monday, January 20, 2014 9:01 PM
>> To: Xu, Dongxiao
>> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> Ian.Campbell@citrix.com; Ian.Jackson@eu.citrix.com;
>> stefano.stabellini@eu.citrix.com; xen-devel@lists.xen.org;
>> konrad.wilk@oracle.com; dgdegra@tycho.nsa.gov; keir@xen.org 
>> Subject: Re: [PATCH v6 1/7] x86: detect and initialize Cache QoS Monitoring
>> feature
>> 
>> >>> On 05.12.13 at 10:38, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
>> > --- a/xen/arch/x86/cpu/intel.c
>> > +++ b/xen/arch/x86/cpu/intel.c
>> > @@ -230,6 +230,12 @@ static void __devinit init_intel(struct cpuinfo_x86 *c)
>> >  	     ( c->cpuid_level >= 0x00000006 ) &&
>> >  	     ( cpuid_eax(0x00000006) & (1u<<2) ) )
>> >  		set_bit(X86_FEATURE_ARAT, c->x86_capability);
>> > +
>> > +	/* Check platform QoS monitoring capability */
>> > +	if ((c->cpuid_level >= 0x00000007) &&
>> > +	    (cpuid_ebx(0x00000007) & (1u<<12)))
>> > +		set_bit(X86_FEATURE_QOSM, c->x86_capability);
>> > +
>> 
>> This is redundant with generic_identify() setting the respective
>> c->x86_capability[] element.
>> 
>> > +struct pqos_cqm *cqm;
>> 
>> __read_mostly?
> cqm->rmid_to_dom will be updated from time to time.

But the attribute applies to the variable itself, not what it points to.

Jan


* Re: [PATCH v6 2/7] x86: dynamically attach/detach CQM service for a guest
  2014-01-20 13:13   ` Jan Beulich
  2014-01-20 13:17     ` Andrew Cooper
@ 2014-01-28 14:09     ` Xu, Dongxiao
  2014-01-28 14:37       ` Jan Beulich
  1 sibling, 1 reply; 33+ messages in thread
From: Xu, Dongxiao @ 2014-01-28 14:09 UTC (permalink / raw)
  To: Jan Beulich
  Cc: keir@xen.org, Ian.Campbell@citrix.com,
	stefano.stabellini@eu.citrix.com, andrew.cooper3@citrix.com,
	dario.faggioli@citrix.com, Ian.Jackson@eu.citrix.com,
	xen-devel@lists.xen.org, dgdegra@tycho.nsa.gov

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Monday, January 20, 2014 9:14 PM
> To: Xu, Dongxiao
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> Ian.Campbell@citrix.com; Ian.Jackson@eu.citrix.com;
> stefano.stabellini@eu.citrix.com; xen-devel@lists.xen.org;
> konrad.wilk@oracle.com; dgdegra@tycho.nsa.gov; keir@xen.org
> Subject: Re: [PATCH v6 2/7] x86: dynamically attach/detach CQM service for a
> guest
> 
> >>> On 05.12.13 at 10:38, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
> > @@ -1223,6 +1224,45 @@ long arch_do_domctl(
> >      }
> >      break;
> >
> > +    case XEN_DOMCTL_attach_pqos:
> > +    {
> > +        if ( domctl->u.qos_type.flags & XEN_DOMCTL_pqos_cqm )
> > +        {
> > +            if ( !system_supports_cqm() )
> > +                ret = -ENODEV;
> > +            else if ( d->arch.pqos_cqm_rmid > 0 )
> > +                ret = -EEXIST;
> > +            else
> > +            {
> > +                ret = alloc_cqm_rmid(d);
> > +                if ( ret < 0 )
> > +                    ret = -EUSERS;
> 
> Why don't you have the function return a sensible error code
> (which presumably might also end up being other than -EUSERS,
> e.g. -ENOMEM).

For the RMID assignment, I don't think an -ENOMEM error can occur, so I think -EUSERS will be better here?

> 
> > +            }
> > +        }
> > +        else
> > +            ret = -EINVAL;
> > +    }
> > +    break;
> > +
> > +    case XEN_DOMCTL_detach_pqos:
> > +    {
> > +        if ( domctl->u.qos_type.flags & XEN_DOMCTL_pqos_cqm )
> > +        {
> > +            if ( !system_supports_cqm() )
> > +                ret = -ENODEV;
> > +            else if ( d->arch.pqos_cqm_rmid > 0 )
> > +            {
> > +                free_cqm_rmid(d);
> > +                ret = 0;
> > +            }
> > +            else
> > +                ret = -ENOENT;
> > +        }
> > +        else
> > +            ret = -EINVAL;
> > +    }
> > +    break;
> 
> For consistency, both of the above would better be changed to a
> single series of if()/else if().../else.

Will re-format it in the following patch.

> 
> > +bool_t system_supports_cqm(void)
> > +{
> > +    return !!cqm;
> 
> So here we go (wrt the remark on patch 1).

Yes

> 
> > +}
> > +
> > +int alloc_cqm_rmid(struct domain *d)
> > +{
> > +    int rc = 0;
> > +    unsigned int rmid;
> > +    unsigned long flags;
> > +
> > +    ASSERT(system_supports_cqm());
> > +
> > +    spin_lock_irqsave(&cqm_lock, flags);
> 
> Why not just spin_lock()? Briefly scanning over the following patches
> doesn't point out anything that might require this to be an IRQ-safe
> lock.

Will change it to a simple spin_lock().

> 
> > +    for ( rmid = cqm->min_rmid; rmid <= cqm->max_rmid; rmid++ )
> > +    {
> > +        if ( cqm->rmid_to_dom[rmid] != DOMID_INVALID)
> > +            continue;
> > +
> > +        cqm->rmid_to_dom[rmid] = d->domain_id;
> > +        break;
> > +    }
> > +    spin_unlock_irqrestore(&cqm_lock, flags);
> > +
> > +    /* No CQM RMID available, assign RMID=0 by default */
> > +    if ( rmid > cqm->max_rmid )
> > +    {
> > +        rmid = 0;
> > +        rc = -1;
> > +    }
> > +
> > +    d->arch.pqos_cqm_rmid = rmid;
> 
> Is it really safe to do this and the freeing below outside of the
> lock?

Could you elaborate on the race condition here?

Per my understanding, there can't be any competition over who owns the entry. Besides, setting it
back to DOMID_INVALID won't race with the allocation loop.

Thanks,
Dongxiao

> 
> Jan


* Re: [PATCH v6 3/7] x86: initialize per socket cpu map
  2014-01-20 13:26   ` Jan Beulich
@ 2014-01-28 14:12     ` Xu, Dongxiao
  2014-01-28 14:41       ` Jan Beulich
  0 siblings, 1 reply; 33+ messages in thread
From: Xu, Dongxiao @ 2014-01-28 14:12 UTC (permalink / raw)
  To: Jan Beulich
  Cc: keir@xen.org, Ian.Campbell@citrix.com,
	stefano.stabellini@eu.citrix.com, andrew.cooper3@citrix.com,
	dario.faggioli@citrix.com, Ian.Jackson@eu.citrix.com,
	xen-devel@lists.xen.org, dgdegra@tycho.nsa.gov

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Monday, January 20, 2014 9:27 PM
> To: Xu, Dongxiao
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> Ian.Campbell@citrix.com; Ian.Jackson@eu.citrix.com;
> stefano.stabellini@eu.citrix.com; xen-devel@lists.xen.org;
> konrad.wilk@oracle.com; dgdegra@tycho.nsa.gov; keir@xen.org
> Subject: Re: [PATCH v6 3/7] x86: initialize per socket cpu map
> 
> >>> On 05.12.13 at 10:38, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
> > For each socket in the system, we create a separate bitmap to tag its
> > related CPUs. This per socket bitmap will be initialized on system
> > start up, and adjusted when CPU is dynamically online/offline.
> 
> There's no reasoning here at all why cpu_sibling_mask and
> cpu_core_mask aren't sufficient.

The new mask is to mark the CPUs of each socket, and it may be different from cpu_sibling_mask and cpu_core_mask...

> 
> > --- a/xen/arch/x86/smpboot.c
> > +++ b/xen/arch/x86/smpboot.c
> > @@ -59,6 +59,9 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t,
> cpu_core_mask);
> >  cpumask_t cpu_online_map __read_mostly;
> >  EXPORT_SYMBOL(cpu_online_map);
> >
> > +cpumask_t socket_cpu_map[MAX_NUM_SOCKETS] __read_mostly;
> > +EXPORT_SYMBOL(socket_cpu_map);
> 
> And _if_ we really need it, then it should be done in a better way
> than via a statically sized array, the size of which can't even be
> overridden on the build and/or hypervisor command line.

I saw current Xen code uses a lot of such static macros, e.g., NR_CPUS.
This reminds me of one thing: can we define MAX_NUM_SOCKETS as NR_CPUS, since the socket number cannot exceed the CPU number?

> 
> And there shouldn't be EXPORT_SYMBOL() in new, not directly
> cloned hypervisor code either.

Okay, will remove it in the following patch.

Thanks,
Dongxiao

> 
> Jan


* Re: [PATCH v6 4/7] x86: collect CQM information from all sockets
  2014-01-20 13:52   ` Jan Beulich
@ 2014-01-28 14:23     ` Xu, Dongxiao
  2014-01-28 14:34       ` Andrew Cooper
  2014-01-28 14:47       ` Jan Beulich
  0 siblings, 2 replies; 33+ messages in thread
From: Xu, Dongxiao @ 2014-01-28 14:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: keir@xen.org, Ian.Campbell@citrix.com,
	stefano.stabellini@eu.citrix.com, andrew.cooper3@citrix.com,
	dario.faggioli@citrix.com, Ian.Jackson@eu.citrix.com,
	xen-devel@lists.xen.org, dgdegra@tycho.nsa.gov

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Monday, January 20, 2014 9:52 PM
> To: Xu, Dongxiao
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> Ian.Campbell@citrix.com; Ian.Jackson@eu.citrix.com;
> stefano.stabellini@eu.citrix.com; xen-devel@lists.xen.org;
> konrad.wilk@oracle.com; dgdegra@tycho.nsa.gov; keir@xen.org
> Subject: Re: [PATCH v6 4/7] x86: collect CQM information from all sockets
> 
> >>> On 05.12.13 at 10:38, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
> > @@ -126,6 +127,12 @@ bool_t system_supports_cqm(void)
> >      return !!cqm;
> >  }
> >
> > +unsigned int get_cqm_count(void)
> > +{
> > +    ASSERT(system_supports_cqm());
> > +    return cqm->max_rmid + 1;
> > +}
> > +
> >  int alloc_cqm_rmid(struct domain *d)
> >  {
> >      int rc = 0;
> > @@ -170,6 +177,48 @@ void free_cqm_rmid(struct domain *d)
> >      d->arch.pqos_cqm_rmid = 0;
> >  }
> >
> > +static void read_cqm_data(void *arg)
> > +{
> > +    uint64_t cqm_data;
> > +    unsigned int rmid;
> > +    int socket = cpu_to_socket(smp_processor_id());
> > +    struct xen_socket_cqmdata *data = arg;
> > +    unsigned long flags, i;
> 
> Either i can be "unsigned int" ...
> 
> > +
> > +    ASSERT(system_supports_cqm());
> > +
> > +    if ( socket < 0 )
> > +        return;
> > +
> > +    spin_lock_irqsave(&cqm_lock, flags);
> > +    for ( rmid = cqm->min_rmid; rmid <= cqm->max_rmid; rmid++ )
> > +    {
> > +        if ( cqm->rmid_to_dom[rmid] == DOMID_INVALID )
> > +            continue;
> > +
> > +        wrmsr(MSR_IA32_QOSEVTSEL, QOS_MONITOR_EVTID_L3, rmid);
> > +        rdmsrl(MSR_IA32_QMC, cqm_data);
> > +
> > +        i = socket * (cqm->max_rmid + 1) + rmid;
> 
> ... or this calculation needs one of the two operands of * cast
> to "unsigned long".

Will adopt this approach in the following patch.

> 
> > +        data[i].valid = !(cqm_data & IA32_QM_CTR_ERROR_MASK);
> > +        if ( data[i].valid )
> > +        {
> > +            data[i].l3c_occupancy = cqm_data * cqm->upscaling_factor;
> > +            data[i].socket = socket;
> > +            data[i].domid = cqm->rmid_to_dom[rmid];
> > +        }
> > +    }
> > +    spin_unlock_irqrestore(&cqm_lock, flags);
> > +}
> 
> Also, please clarify why the locking here is necessary: You don't
> seem to be modifying global data, and the only possibly mutable
> thing you read is cqm->rmid_to_dom[]. A race on that one with
> an addition/deletion doesn't appear to be problematic though.

Will use the spin_lock() in following patch.

> 
> > +void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct
> xen_socket_cqmdata *data)
> 
> const cpumask_t *
> 
> > +    case XEN_SYSCTL_getcqminfo:
> > +    {
> > +        struct xen_socket_cqmdata *info;
> > +        uint32_t num_sockets;
> > +        uint32_t num_rmids;
> > +        cpumask_t cpu_cqmdata_map;
> 
> Unless absolutely unavoidable, no CPU masks on the stack please.

Okay, will allocate it with the "xzalloc" function.

> 
> > +
> > +        if ( !system_supports_cqm() )
> > +        {
> > +            ret = -ENODEV;
> > +            break;
> > +        }
> > +
> > +        select_socket_cpu(&cpu_cqmdata_map);
> > +
> > +        num_sockets = min((unsigned
> int)cpumask_weight(&cpu_cqmdata_map) + 1,
> > +                          sysctl->u.getcqminfo.num_sockets);
> > +        num_rmids = get_cqm_count();
> > +        info = xzalloc_array(struct xen_socket_cqmdata,
> > +                             num_rmids * num_sockets);
> 
> While unlikely right now, you ought to consider the case of this
> multiplication overflowing.
> 
> Also - how does the caller know how big the buffer needs to be?
> Only num_sockets can be restricted by it...
> 
> And what's worse - you allow the caller to limit num_sockets and
> allocate info based on this limited value, but you don't restrict
> cpu_cqmdata_map to just the socket covered, i.e. if the caller
> specified a lower number, then you'll corrupt memory.

Currently the caller (libxc) sets num_rmid and num_sockets to the maximum values that could be available inside the hypervisor.
If you think this approach is not enough to ensure security, what about having the caller (libxc) issue a hypercall to get the two values from the hypervisor, and then allocate a buffer sized for exactly those num_rmid and num_sockets?

> 
> And finally, I think the total size of the buffer here can easily
> exceed a page, i.e. this then ends up being a non-order-0
> allocation, which may _never_ succeed (i.e. the operation is
> then rendered useless). I guess it'd be better to e.g. vmap()
> the MFNs underlying the guest buffer.

Do you mean we check the total size, allocate the MFNs one by one, and then vmap them?

> 
> > +        if ( !info )
> > +        {
> > +            ret = -ENOMEM;
> > +            break;
> > +        }
> > +
> > +        get_cqm_info(&cpu_cqmdata_map, info);
> > +
> > +        if ( copy_to_guest_offset(sysctl->u.getcqminfo.buffer,
> > +                                  0, info, num_rmids * num_sockets) )
> 
> If the offset is zero anyway, why do you use copy_to_guest_offset()
> rather than copy_to_guest()?

Okay.

> 
> > +        {
> > +            ret = -EFAULT;
> > +            xfree(info);
> > +            break;
> > +        }
> > +
> > +        sysctl->u.getcqminfo.num_rmids = num_rmids;
> > +        sysctl->u.getcqminfo.num_sockets = num_sockets;
> > +
> > +        if ( copy_to_guest(u_sysctl, sysctl, 1) )
> 
> __copy_to_guest() is sufficient here.

Okay.

Thanks,
Dongxiao

> 
> Jan


* Re: [PATCH v6 5/7] x86: enable CQM monitoring for each domain RMID
  2014-01-20 13:58   ` Jan Beulich
@ 2014-01-28 14:24     ` Xu, Dongxiao
  0 siblings, 0 replies; 33+ messages in thread
From: Xu, Dongxiao @ 2014-01-28 14:24 UTC (permalink / raw)
  To: Jan Beulich
  Cc: keir@xen.org, Ian.Campbell@citrix.com,
	stefano.stabellini@eu.citrix.com, andrew.cooper3@citrix.com,
	dario.faggioli@citrix.com, Ian.Jackson@eu.citrix.com,
	xen-devel@lists.xen.org, dgdegra@tycho.nsa.gov

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Monday, January 20, 2014 9:58 PM
> To: Xu, Dongxiao
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> Ian.Campbell@citrix.com; Ian.Jackson@eu.citrix.com;
> stefano.stabellini@eu.citrix.com; xen-devel@lists.xen.org;
> konrad.wilk@oracle.com; dgdegra@tycho.nsa.gov; keir@xen.org
> Subject: Re: [PATCH v6 5/7] x86: enable CQM monitoring for each domain RMID
> 
> >>> On 05.12.13 at 10:38, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
> > --- a/xen/arch/x86/domain.c
> > +++ b/xen/arch/x86/domain.c
> > @@ -1366,6 +1366,8 @@ static void __context_switch(void)
> >      {
> >          memcpy(&p->arch.user_regs, stack_regs,
> CTXT_SWITCH_STACK_BYTES);
> >          vcpu_save_fpu(p);
> > +        if ( system_supports_cqm() )
> > +            cqm_assoc_rmid(0);
> >          p->arch.ctxt_switch_from(p);
> >      }
> >
> > @@ -1390,6 +1392,9 @@ static void __context_switch(void)
> >          }
> >          vcpu_restore_fpu_eager(n);
> >          n->arch.ctxt_switch_to(n);
> > +
> > +        if ( system_supports_cqm() && n->domain->arch.pqos_cqm_rmid >
> 0 )
> > +            cqm_assoc_rmid(n->domain->arch.pqos_cqm_rmid);
> >      }
> >
> >      gdt = !is_pv_32on64_vcpu(n) ? per_cpu(gdt_table, cpu) :
> 
> The two uses here clearly call for system_supports_cqm() to
> be an inline function (the more that the variable checked in that
> function is already global anyway).

Okay.

> 
> Further, cqm_assoc_rmid() being an RDMSR plus WRMSR, you
> surely will want to optimize the case of p's and n's RMIDs being
> identical. Or at the very least make sure you _never_ call that
> function if all domains run with RMID 0.

Okay.

> 
> > @@ -60,6 +60,8 @@ static void __init parse_pqos_param(char *s)
> >
> >  custom_param("pqos", parse_pqos_param);
> >
> > +static uint64_t rmid_mask;
> 
> __read_mostly?

Okay.

> 
> > +void cqm_assoc_rmid(unsigned int rmid)
> > +{
> > +    uint64_t val;
> > +
> > +    rdmsrl(MSR_IA32_PQR_ASSOC, val);
> > +    wrmsrl(MSR_IA32_PQR_ASSOC, (val & ~(rmid_mask)) | (rmid &
> rmid_mask));
> 
> Stray parentheses around a simple variable.

Okay.

Will reflect your suggestions in the next version of the patch.

Thanks,
Dongxiao

> 
> Jan


* Re: [PATCH v6 4/7] x86: collect CQM information from all sockets
  2014-01-28 14:23     ` Xu, Dongxiao
@ 2014-01-28 14:34       ` Andrew Cooper
  2014-01-28 15:03         ` Jan Beulich
  2014-01-28 14:47       ` Jan Beulich
  1 sibling, 1 reply; 33+ messages in thread
From: Andrew Cooper @ 2014-01-28 14:34 UTC (permalink / raw)
  To: Xu, Dongxiao
  Cc: keir@xen.org, Ian.Campbell@citrix.com,
	stefano.stabellini@eu.citrix.com, dario.faggioli@citrix.com,
	Ian.Jackson@eu.citrix.com, xen-devel@lists.xen.org, Jan Beulich,
	dgdegra@tycho.nsa.gov

On 28/01/14 14:23, Xu, Dongxiao wrote:
>
>>> +    case XEN_SYSCTL_getcqminfo:
>>> +    {
>>> +        struct xen_socket_cqmdata *info;
>>> +        uint32_t num_sockets;
>>> +        uint32_t num_rmids;
>>> +        cpumask_t cpu_cqmdata_map;
>> Unless absolutely unavoidable, no CPU masks on the stack please.
> Okay, will allocate it with the "xzalloc" function.

cpumask_var_t mask;
{z}alloc_cpumask_var(&mask);
...
free_cpumask_var(mask);

This will switch between a long on the stack and an allocated structure
depending on whether Xen is compiled with fewer or more than 64 pcpu.
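A minimal, runnable model of the pattern Andrew describes; the *_demo names and sizes are made up here (the real Xen helpers are zalloc_cpumask_var()/free_cpumask_var()), and note that, unlike the short snippet above, a real caller must check the allocation for failure:

```c
#include <assert.h>
#include <stdlib.h>

/* Demo-only model of a heap-allocated cpumask: for a large CPU count
 * the mask is allocated rather than placed on the stack. */
#define NR_CPUS_DEMO 4096
#define BITS_PER_LONG_DEMO (8 * (int)sizeof(unsigned long))
#define MASK_LONGS_DEMO \
    ((NR_CPUS_DEMO + BITS_PER_LONG_DEMO - 1) / BITS_PER_LONG_DEMO)

typedef unsigned long *cpumask_var_demo;

static int zalloc_cpumask_var_demo(cpumask_var_demo *mask)
{
    /* Zeroed allocation; returns 0 on failure, which callers check. */
    *mask = calloc(MASK_LONGS_DEMO, sizeof(unsigned long));
    return *mask != NULL;
}

static void free_cpumask_var_demo(cpumask_var_demo mask)
{
    free(mask);
}
```

With fewer than 64 CPUs the real implementation avoids the allocation entirely by embedding the bitmap, which is exactly why the same source compiles to either form.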

>
>>> +
>>> +        if ( !system_supports_cqm() )
>>> +        {
>>> +            ret = -ENODEV;
>>> +            break;
>>> +        }
>>> +
>>> +        select_socket_cpu(&cpu_cqmdata_map);
>>> +
>>> +        num_sockets = min((unsigned
>> int)cpumask_weight(&cpu_cqmdata_map) + 1,
>>> +                          sysctl->u.getcqminfo.num_sockets);
>>> +        num_rmids = get_cqm_count();
>>> +        info = xzalloc_array(struct xen_socket_cqmdata,
>>> +                             num_rmids * num_sockets);
>> While unlikely right now, you ought to consider the case of this
>> multiplication overflowing.
>>
>> Also - how does the caller know how big the buffer needs to be?
>> Only num_sockets can be restricted by it...
>>
>> And what's worse - you allow the caller to limit num_sockets and
>> allocate info based on this limited value, but you don't restrict
>> cpu_cqmdata_map to just the socket covered, i.e. if the caller
>> specified a lower number, then you'll corrupt memory.
> Currently the caller (libxc) sets num_rmid and num_sockets to the maximum values that could be available inside the hypervisor.
> If you think this approach is not enough to ensure security, what about having the caller (libxc) issue a hypercall to get the two values from the hypervisor, and then allocate a buffer sized for exactly those num_rmid and num_sockets?
>
>> And finally, I think the total size of the buffer here can easily
>> exceed a page, i.e. this then ends up being a non-order-0
>> allocation, which may _never_ succeed (i.e. the operation is
>> then rendered useless). I guest it'd be better to e.g. vmap()
>> the MFNs underlying the guest buffer.
> Do you mean we check the total size, allocate the MFNs one by one, and then vmap them?

I still think this is barking mad as a method of getting this quantity
of data from Xen to the toolstack in a repeated fashion.

Xen should allocate a per-socket buffer at the start of day (or perhaps
on first use of CQM), and the CQM monitoring tool gets to map those
per-socket buffers read-only.

This way, all processing of the CQM data happens in dom0 userspace, not
in Xen in hypercall context; All Xen has to do is periodically dump the
MSRs into the pages.

~Andrew


* Re: [PATCH v6 2/7] x86: dynamically attach/detach CQM service for a guest
  2014-01-28 14:09     ` Xu, Dongxiao
@ 2014-01-28 14:37       ` Jan Beulich
  0 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2014-01-28 14:37 UTC (permalink / raw)
  To: Dongxiao Xu
  Cc: keir@xen.org, Ian.Campbell@citrix.com,
	stefano.stabellini@eu.citrix.com, andrew.cooper3@citrix.com,
	dario.faggioli@citrix.com, Ian.Jackson@eu.citrix.com,
	xen-devel@lists.xen.org, dgdegra@tycho.nsa.gov

>>> On 28.01.14 at 15:09, "Xu, Dongxiao" <dongxiao.xu@intel.com> wrote:
>> >>> On 05.12.13 at 10:38, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
>> > @@ -1223,6 +1224,45 @@ long arch_do_domctl(
>> >      }
>> >      break;
>> >
>> > +    case XEN_DOMCTL_attach_pqos:
>> > +    {
>> > +        if ( domctl->u.qos_type.flags & XEN_DOMCTL_pqos_cqm )
>> > +        {
>> > +            if ( !system_supports_cqm() )
>> > +                ret = -ENODEV;
>> > +            else if ( d->arch.pqos_cqm_rmid > 0 )
>> > +                ret = -EEXIST;
>> > +            else
>> > +            {
>> > +                ret = alloc_cqm_rmid(d);
>> > +                if ( ret < 0 )
>> > +                    ret = -EUSERS;
>> 
>> Why don't you have the function return a sensible error code
>> (which presumably might also end up being other than -EUSERS,
>> e.g. -ENOMEM).
> 
> For the RMID assignment, I don't think an -ENOMEM error can occur, so
> I think -EUSERS will be better here?

-EUSERS is certainly fine here, but that wasn't my point. My point was
that alloc_cqm_rmid() should return a proper error code (right now
only -EUSERS, but _potentially_ there could be others in the future,
_for example_ -ENOMEM), and that error code should simply get
passed up here.
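What Jan describes can be sketched as a simplified, hypothetical alloc_cqm_rmid() that itself returns either the allocated RMID or a proper negative errno value, which the domctl handler can then pass up unchanged. The table size and names below are illustrative only:

```c
#include <assert.h>
#include <errno.h>

#define DOMID_INVALID_DEMO 0x7FF4u
#define MIN_RMID_DEMO 1
#define MAX_RMID_DEMO 3

static unsigned int rmid_to_dom_demo[MAX_RMID_DEMO + 1];

static void init_rmids_demo(void)
{
    unsigned int r;

    for ( r = 0; r <= MAX_RMID_DEMO; r++ )
        rmid_to_dom_demo[r] = DOMID_INVALID_DEMO;
}

/* Returns the allocated RMID (> 0) on success, or a negative errno
 * value the caller can propagate directly; today that is only
 * -EUSERS, but a future version might also return e.g. -ENOMEM. */
static int alloc_cqm_rmid_demo(unsigned int domid)
{
    unsigned int rmid;

    for ( rmid = MIN_RMID_DEMO; rmid <= MAX_RMID_DEMO; rmid++ )
    {
        if ( rmid_to_dom_demo[rmid] != DOMID_INVALID_DEMO )
            continue;
        rmid_to_dom_demo[rmid] = domid;
        return rmid;
    }
    return -EUSERS;        /* all RMIDs in use */
}
```

The caller then just does `ret = alloc_cqm_rmid(d);` without remapping the error, which is the shape Jan is asking for.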

>> > +    for ( rmid = cqm->min_rmid; rmid <= cqm->max_rmid; rmid++ )
>> > +    {
>> > +        if ( cqm->rmid_to_dom[rmid] != DOMID_INVALID)
>> > +            continue;
>> > +
>> > +        cqm->rmid_to_dom[rmid] = d->domain_id;
>> > +        break;
>> > +    }
>> > +    spin_unlock_irqrestore(&cqm_lock, flags);
>> > +
>> > +    /* No CQM RMID available, assign RMID=0 by default */
>> > +    if ( rmid > cqm->max_rmid )
>> > +    {
>> > +        rmid = 0;
>> > +        rc = -1;
>> > +    }
>> > +
>> > +    d->arch.pqos_cqm_rmid = rmid;
>> 
>> Is it really safe to do this and the freeing below outside of the
>> lock?
> 
> Could you elaborate on the race condition here?

I wasn't saying there is one. I was asking whether you thought
about whether there might be one. After all, from simply looking
at it I get the impression that two racing calls to this function
might end up leaking one of the two RMIDs (second instance
blindly overwriting what the first instance stored).

Jan
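The leak Jan describes disappears if the scan and the publication of the result stay inside one critical section. A single-threaded sketch, with a plain flag standing in for the real spin_lock()/spin_unlock() pair and all names hypothetical:

```c
#include <assert.h>

/* Demo lock: a flag standing in for spin_lock()/spin_unlock(). */
static int cqm_lock_held;
static void demo_lock(void)   { cqm_lock_held = 1; }
static void demo_unlock(void) { cqm_lock_held = 0; }

#define MAX_RMID_SKETCH 3
#define DOMID_INVALID_SKETCH 0x7FF4u

static unsigned int rmid_to_dom_sketch[MAX_RMID_SKETCH + 1];
static unsigned int dom_rmid;       /* models d->arch.pqos_cqm_rmid */

static void init_sketch(void)
{
    unsigned int r;

    for ( r = 0; r <= MAX_RMID_SKETCH; r++ )
        rmid_to_dom_sketch[r] = DOMID_INVALID_SKETCH;
}

static int alloc_rmid_locked(unsigned int domid)
{
    unsigned int rmid;
    int rc = 0;

    demo_lock();
    for ( rmid = 1; rmid <= MAX_RMID_SKETCH; rmid++ )
        if ( rmid_to_dom_sketch[rmid] == DOMID_INVALID_SKETCH )
            break;
    if ( rmid > MAX_RMID_SKETCH )
    {
        rmid = 0;                   /* no RMID available */
        rc = -1;
    }
    else
        rmid_to_dom_sketch[rmid] = domid;
    dom_rmid = rmid;                /* published under the same lock */
    demo_unlock();

    return rc;
}
```

Because the store to the per-domain field happens before the lock is dropped, two racing allocations can no longer each grab an RMID and then blindly overwrite the other's stored value.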


* Re: [PATCH v6 3/7] x86: initialize per socket cpu map
  2014-01-28 14:12     ` Xu, Dongxiao
@ 2014-01-28 14:41       ` Jan Beulich
  0 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2014-01-28 14:41 UTC (permalink / raw)
  To: Dongxiao Xu
  Cc: keir@xen.org, Ian.Campbell@citrix.com,
	stefano.stabellini@eu.citrix.com, andrew.cooper3@citrix.com,
	dario.faggioli@citrix.com, Ian.Jackson@eu.citrix.com,
	xen-devel@lists.xen.org, dgdegra@tycho.nsa.gov

>>> On 28.01.14 at 15:12, "Xu, Dongxiao" <dongxiao.xu@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >>> On 05.12.13 at 10:38, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
>> > For each socket in the system, we create a separate bitmap to tag its
>> > related CPUs. This per socket bitmap will be initialized on system
>> > start up, and adjusted when CPU is dynamically online/offline.
>> 
>> There's no reasoning here at all why cpu_sibling_mask and
>> cpu_core_mask aren't sufficient.
> 
> The new mask is to mark the CPUs of a socket, and it may differ from
> cpu_sibling_mask and cpu_core_mask...

Sorry, I don't follow: cpu_core_mask represents all cores sitting
on the same socket as the "owning" CPU. How's that different
from "marking socket CPUs"?

>> > --- a/xen/arch/x86/smpboot.c
>> > +++ b/xen/arch/x86/smpboot.c
>> > @@ -59,6 +59,9 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
>> >  cpumask_t cpu_online_map __read_mostly;
>> >  EXPORT_SYMBOL(cpu_online_map);
>> >
>> > +cpumask_t socket_cpu_map[MAX_NUM_SOCKETS] __read_mostly;
>> > +EXPORT_SYMBOL(socket_cpu_map);
>> 
>> And _if_ we really need it, then it should be done in a better way
>> than via a statically sized array, the size of which can't even be
>> overridden on the build and/or hypervisor command line.
> 
> I see that the current Xen code uses a lot of such static macros, e.g., NR_CPUS.

For one, the number of these has been decreasing over time.

And then NR_CPUS _can_ be controlled from the make command
line.

> This reminds me of one thing: can we define MAX_NUM_SOCKETS as NR_CPUS,
> since the number of sockets cannot exceed the number of CPUs?

That might be an option, but only if this construct is really needed
in the first place.

Jan


* Re: [PATCH v6 4/7] x86: collect CQM information from all sockets
  2014-01-28 14:23     ` Xu, Dongxiao
  2014-01-28 14:34       ` Andrew Cooper
@ 2014-01-28 14:47       ` Jan Beulich
  1 sibling, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2014-01-28 14:47 UTC (permalink / raw)
  To: Dongxiao Xu
  Cc: keir@xen.org, Ian.Campbell@citrix.com,
	stefano.stabellini@eu.citrix.com, andrew.cooper3@citrix.com,
	dario.faggioli@citrix.com, Ian.Jackson@eu.citrix.com,
	xen-devel@lists.xen.org, dgdegra@tycho.nsa.gov

>>> On 28.01.14 at 15:23, "Xu, Dongxiao" <dongxiao.xu@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >>> On 05.12.13 at 10:38, Dongxiao Xu <dongxiao.xu@intel.com> wrote:
>> > +    case XEN_SYSCTL_getcqminfo:
>> > +    {
>> > +        struct xen_socket_cqmdata *info;
>> > +        uint32_t num_sockets;
>> > +        uint32_t num_rmids;
>> > +        cpumask_t cpu_cqmdata_map;
>> 
>> Unless absolutely avoidable, not CPU masks on the stack please.
> 
> Okay, will allocate it by "xzalloc" function.

Hopefully this was just a thinko - zalloc_cpumask_var() is what you
want to use.

>> > +        num_sockets = min((unsigned int)cpumask_weight(&cpu_cqmdata_map) + 1,
>> > +                          sysctl->u.getcqminfo.num_sockets);
>> > +        num_rmids = get_cqm_count();
>> > +        info = xzalloc_array(struct xen_socket_cqmdata,
>> > +                             num_rmids * num_sockets);
>> 
>> While unlikely right now, you ought to consider the case of this
>> multiplication overflowing.
>> 
>> Also - how does the caller know how big the buffer needs to be?
>> Only num_sockets can be restricted by it...
>> 
>> And what's worse - you allow the caller to limit num_sockets and
>> allocate info based on this limited value, but you don't restrict
>> cpu_cqmdata_map to just the socket covered, i.e. if the caller
>> specified a lower number, then you'll corrupt memory.
> 
> Currently the caller (libxc) sets num_rmid and num_sockets to large values,
> the maximums that could be available inside the hypervisor.
> If you think this approach is not enough to ensure safety, what about having
> the caller (libxc) issue a hypercall to get the two values from the
> hypervisor, and then allocate a buffer sized by exactly those num_rmid and
> num_sockets?

Yes, that's the first half of it. And then _both_ values, as they're
determining the array dimensions, need to be passed back into
the "actual" call, and the hypervisor will need to take care not to
exceed these array dimensions.

>> And finally, I think the total size of the buffer here can easily
>> exceed a page, i.e. this then ends up being a non-order-0
>> allocation, which may _never_ succeed (i.e. the operation is
>> then rendered useless). I guess it'd be better to e.g. vmap()
>> the MFNs underlying the guest buffer.
> 
> Do you mean we check the total size, allocate the MFNs one by one, and
> then vmap them?

That would also be a possibility, but isn't what I wrote above (as
being less resource efficient, but easier to implement).

Jan


* Re: [PATCH v6 4/7] x86: collect CQM information from all sockets
  2014-01-28 14:34       ` Andrew Cooper
@ 2014-01-28 15:03         ` Jan Beulich
  2014-01-28 15:15           ` Xu, Dongxiao
  0 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2014-01-28 15:03 UTC (permalink / raw)
  To: Andrew Cooper, Dongxiao Xu
  Cc: keir@xen.org, Ian.Campbell@citrix.com, dario.faggioli@citrix.com,
	stefano.stabellini@eu.citrix.com, Ian.Jackson@eu.citrix.com,
	xen-devel@lists.xen.org, dgdegra@tycho.nsa.gov

>>> On 28.01.14 at 15:34, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 28/01/14 14:23, Xu, Dongxiao wrote:
>>> And finally, I think the total size of the buffer here can easily
>>> exceed a page, i.e. this then ends up being a non-order-0
>>> allocation, which may _never_ succeed (i.e. the operation is
>>> then rendered useless). I guess it'd be better to e.g. vmap()
>>> the MFNs underlying the guest buffer.
>> Do you mean we check the total size, allocate the MFNs one by one, and
>> then vmap them?
> 
> I still think this is barking mad as a method of getting this quantity
> of data from Xen to the toolstack in a repeated fashion.
> 
> Xen should allocate a per-socket buffer at the start of day (or perhaps
> on first use of CQM), and the CQM monitoring tool gets to map those
> per-socket buffers read-only.
> 
> This way, all processing of the CQM data happens in dom0 userspace, not
> in Xen in hypercall context; All Xen has to do is periodically dump the
> MSRs into the pages.

Indeed - if the nature of the data is such that it can be exposed
read-only to suitably privileged entities, then this would be the
much better interface.

Jan


* Re: [PATCH v6 4/7] x86: collect CQM information from all sockets
  2014-01-28 15:03         ` Jan Beulich
@ 2014-01-28 15:15           ` Xu, Dongxiao
  2014-01-28 15:21             ` Andrew Cooper
  2014-01-28 15:34             ` Jan Beulich
  0 siblings, 2 replies; 33+ messages in thread
From: Xu, Dongxiao @ 2014-01-28 15:15 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper
  Cc: keir@xen.org, Ian.Campbell@citrix.com, dario.faggioli@citrix.com,
	stefano.stabellini@eu.citrix.com, Ian.Jackson@eu.citrix.com,
	xen-devel@lists.xen.org, dgdegra@tycho.nsa.gov

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Tuesday, January 28, 2014 11:04 PM
> To: Andrew Cooper; Xu, Dongxiao
> Cc: dario.faggioli@citrix.com; Ian.Campbell@citrix.com;
> Ian.Jackson@eu.citrix.com; stefano.stabellini@eu.citrix.com;
> xen-devel@lists.xen.org; konrad.wilk@oracle.com; dgdegra@tycho.nsa.gov;
> keir@xen.org
> Subject: Re: [PATCH v6 4/7] x86: collect CQM information from all sockets
> 
> >>> On 28.01.14 at 15:34, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> > On 28/01/14 14:23, Xu, Dongxiao wrote:
> >>> And finally, I think the total size of the buffer here can easily
> >>> exceed a page, i.e. this then ends up being a non-order-0
> >>> allocation, which may _never_ succeed (i.e. the operation is
> >>> then rendered useless). I guess it'd be better to e.g. vmap()
> >>> the MFNs underlying the guest buffer.
> >> Do you mean we check the total size, allocate the MFNs one by one, and
> >> then vmap them?
> >
> > I still think this is barking mad as a method of getting this quantity
> > of data from Xen to the toolstack in a repeated fashion.
> >
> > Xen should allocate a per-socket buffer at the start of day (or perhaps
> > on first use of CQM), and the CQM monitoring tool gets to map those
> > per-socket buffers read-only.
> >
> > This way, all processing of the CQM data happens in dom0 userspace, not
> > in Xen in hypercall context; All Xen has to do is periodically dump the
> > MSRs into the pages.
> 
> Indeed - if the nature of the data is such that it can be exposed
> read-only to suitably privileged entities, then this would be the
> much better interface.

If the data fetching is not hypercall driven, do you have a recommendation on how frequently Xen should dump the MSRs into the shared page?

Thanks,
Dongxiao 

> 
> Jan


* Re: [PATCH v6 4/7] x86: collect CQM information from all sockets
  2014-01-28 15:15           ` Xu, Dongxiao
@ 2014-01-28 15:21             ` Andrew Cooper
  2014-01-28 15:34             ` Jan Beulich
  1 sibling, 0 replies; 33+ messages in thread
From: Andrew Cooper @ 2014-01-28 15:21 UTC (permalink / raw)
  To: Xu, Dongxiao
  Cc: keir@xen.org, Ian.Campbell@citrix.com,
	stefano.stabellini@eu.citrix.com, dario.faggioli@citrix.com,
	Ian.Jackson@eu.citrix.com, xen-devel@lists.xen.org, Jan Beulich,
	dgdegra@tycho.nsa.gov

On 28/01/14 15:15, Xu, Dongxiao wrote:
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Tuesday, January 28, 2014 11:04 PM
>> To: Andrew Cooper; Xu, Dongxiao
>> Cc: dario.faggioli@citrix.com; Ian.Campbell@citrix.com;
>> Ian.Jackson@eu.citrix.com; stefano.stabellini@eu.citrix.com;
>> xen-devel@lists.xen.org; konrad.wilk@oracle.com; dgdegra@tycho.nsa.gov;
>> keir@xen.org
>> Subject: Re: [PATCH v6 4/7] x86: collect CQM information from all sockets
>>
>>>>> On 28.01.14 at 15:34, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> On 28/01/14 14:23, Xu, Dongxiao wrote:
>>>>> And finally, I think the total size of the buffer here can easily
>>>>> exceed a page, i.e. this then ends up being a non-order-0
>>>>> allocation, which may _never_ succeed (i.e. the operation is
>>>>> then rendered useless). I guess it'd be better to e.g. vmap()
>>>>> the MFNs underlying the guest buffer.
>>>> Do you mean we check the total size, allocate the MFNs one by one, and
>>>> then vmap them?
>>>
>>> I still think this is barking mad as a method of getting this quantity
>>> of data from Xen to the toolstack in a repeated fashion.
>>>
>>> Xen should allocate a per-socket buffer at the start of day (or perhaps
>>> on first use of CQM), and the CQM monitoring tool gets to map those
>>> per-socket buffers read-only.
>>>
>>> This way, all processing of the CQM data happens in dom0 userspace, not
>>> in Xen in hypercall context; All Xen has to do is periodically dump the
>>> MSRs into the pages.
>> Indeed - if the nature of the data is such that it can be exposed
>> read-only to suitably privileged entities, then this would be the
>> much better interface.
> If the data fetching is not hypercall driven, do you have a recommendation on how frequently Xen should dump the MSRs into the shared page?
>
> Thanks,
> Dongxiao 

There is nothing preventing a hypercall which synchronously prompts Xen
to dump the data right now, but that is substantially less overhead than
having the hypercall also rotate a matrix of data so it can be consumed
in a form convenient for userspace.

Other solutions involve having a single read/write control page where
the toolstack could set a bit indicating "please dump the msr when next
convenient" and the rmid context switching code could do a
test_and_clear() bit on it, which even solves the problem of "which core
on some other socket do I decide to interrupt".

~Andrew


* Re: [PATCH v6 4/7] x86: collect CQM information from all sockets
  2014-01-28 15:15           ` Xu, Dongxiao
  2014-01-28 15:21             ` Andrew Cooper
@ 2014-01-28 15:34             ` Jan Beulich
  1 sibling, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2014-01-28 15:34 UTC (permalink / raw)
  To: Andrew Cooper, Dongxiao Xu
  Cc: keir@xen.org, Ian.Campbell@citrix.com, dario.faggioli@citrix.com,
	stefano.stabellini@eu.citrix.com, Ian.Jackson@eu.citrix.com,
	xen-devel@lists.xen.org, dgdegra@tycho.nsa.gov

>>> On 28.01.14 at 16:15, "Xu, Dongxiao" <dongxiao.xu@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >>> On 28.01.14 at 15:34, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> > On 28/01/14 14:23, Xu, Dongxiao wrote:
>> >>> And finally, I think the total size of the buffer here can easily
>> >>> exceed a page, i.e. this then ends up being a non-order-0
>> >>> allocation, which may _never_ succeed (i.e. the operation is
>> >>> then rendered useless). I guess it'd be better to e.g. vmap()
>> >>> the MFNs underlying the guest buffer.
>> >> Do you mean we check the total size, allocate the MFNs one by one, and
>> >> then vmap them?
>> >
>> > I still think this is barking mad as a method of getting this quantity
>> > of data from Xen to the toolstack in a repeated fashion.
>> >
>> > Xen should allocate a per-socket buffer at the start of day (or perhaps
>> > on first use of CQM), and the CQM monitoring tool gets to map those
>> > per-socket buffers read-only.
>> >
>> > This way, all processing of the CQM data happens in dom0 userspace, not
>> > in Xen in hypercall context; All Xen has to do is periodically dump the
>> > MSRs into the pages.
>> 
>> Indeed - if the nature of the data is such that it can be exposed
>> read-only to suitably privileged entities, then this would be the
>> much better interface.
> 
> If the data fetching is not hypercall driven, do you have a recommendation
> on how frequently Xen should dump the MSRs into the shared page?

Perhaps an "update the shared pages" hypercall should be made
then prior to reading the shared space?

Jan


end of thread, other threads:[~2014-01-28 15:34 UTC | newest]

Thread overview: 33+ messages
2013-12-05  9:38 [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature Dongxiao Xu
2013-12-05  9:38 ` [PATCH v6 1/7] x86: detect and initialize Cache QoS Monitoring feature Dongxiao Xu
2013-12-05 10:54   ` Andrew Cooper
2013-12-06  8:29     ` Jan Beulich
2014-01-20 13:00   ` Jan Beulich
2014-01-28 14:01     ` Xu, Dongxiao
2014-01-28 14:04       ` Jan Beulich
2013-12-05  9:38 ` [PATCH v6 2/7] x86: dynamically attach/detach CQM service for a guest Dongxiao Xu
2014-01-20 13:13   ` Jan Beulich
2014-01-20 13:17     ` Andrew Cooper
2014-01-20 14:03       ` Jan Beulich
2014-01-28 14:09     ` Xu, Dongxiao
2014-01-28 14:37       ` Jan Beulich
2013-12-05  9:38 ` [PATCH v6 3/7] x86: initialize per socket cpu map Dongxiao Xu
2014-01-20 13:26   ` Jan Beulich
2014-01-28 14:12     ` Xu, Dongxiao
2014-01-28 14:41       ` Jan Beulich
2013-12-05  9:38 ` [PATCH v6 4/7] x86: collect CQM information from all sockets Dongxiao Xu
2014-01-20 13:52   ` Jan Beulich
2014-01-28 14:23     ` Xu, Dongxiao
2014-01-28 14:34       ` Andrew Cooper
2014-01-28 15:03         ` Jan Beulich
2014-01-28 15:15           ` Xu, Dongxiao
2014-01-28 15:21             ` Andrew Cooper
2014-01-28 15:34             ` Jan Beulich
2014-01-28 14:47       ` Jan Beulich
2013-12-05  9:38 ` [PATCH v6 5/7] x86: enable CQM monitoring for each domain RMID Dongxiao Xu
2014-01-20 13:58   ` Jan Beulich
2014-01-28 14:24     ` Xu, Dongxiao
2013-12-05  9:38 ` [PATCH v6 6/7] xsm: add platform QoS related xsm policies Dongxiao Xu
2013-12-05  9:38 ` [PATCH v6 7/7] tools: enable Cache QoS Monitoring feature for libxl/libxc Dongxiao Xu
2013-12-07  5:09 ` [PATCH v6 0/7] enable Cache QoS Monitoring (CQM) feature Xu, Dongxiao
2013-12-09  9:34   ` Jan Beulich
