* [PATCH v8 00/13] enable Cache Allocation Technology (CAT) for VMs
@ 2015-05-21  8:41 Chao Peng
  2015-05-21  8:41 ` [PATCH v8 01/13] x86: add socket_cpumask Chao Peng
                   ` (12 more replies)
  0 siblings, 13 replies; 38+ messages in thread
From: Chao Peng @ 2015-05-21  8:41 UTC
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, will.auld, JBeulich, wei.liu2,
	dgdegra
Changes in v8:
Address comments from Jan, mainly:
* Remove total_cpus and retrofit the algorithm for calculating nr_sockets.
* Allocate per-socket cpumasks on demand.
* Remove cat_socket_init_bitmap and rename cat_socket_enable_bitmap.
* Ensure opt_cos_max is not too small.
* Use the right notification for memory allocation/freeing.
Changes in v7:
Address comments from Jan/Ian, mainly:
* Introduce total_cpus to calculate nr_sockets.
* Clear the init/enable flag when a socket goes offline.
* Reorder the statements in init_psr_cat.
* Copyback psr_cat_op only for XEN_SYSCTL_PSR_CAT_get_l3_info.
* Broadcast LIBXL_HAVE_SOCKET_BITMAP_ALLOC.
* Add a PSR head1-level section to the xl man page and make CMT/CAT its subsections.
Changes in v6:
Address comments from Andrew/Dario/Ian, mainly:
* Introduce cat_socket_init(_enable)_bitmap.
* Merge xl psr-cmt/cat-hwinfo => xl psr-hwinfo.
* Add function header to explain the 'target' parameter.
* Use bitmap instead of TARGETS_ALL.
* Documentation fix.
Changes in v5:
* Address comments from Andrew and Ian (details in patch).
* Add socket_to_cpumask.
* Add xl psr-cmt/cat-hwinfo.
* Add some libxl CMT enhancement.
Changes in v4:
* Address comments from Andrew and Ian (details in patch).
* Split COS/CBM management patch into 4 small patches.
* Add documentation xl-psr.markdown.
Changes in v3:
* Address comments from Jan and Ian (details in patch).
* Add xl sample output in cover letter.
Changes in v2:
* Address comments from Konrad and Jan (details in patch):
* Move all CAT-unrelated changes into the preparation patches.
This patch series enables the new Cache Allocation Technology (CAT) feature
found in Intel Broadwell and later server platforms. In Xen's implementation,
CAT is used to control cache allocation on a per-VM basis.
The detailed hardware spec can be found in section 17.15 of the Intel SDM [1].
The design for Xen can be found at [2].
patch1:     preparation.
patch2-8:   real work for CAT.
patch9-10:  enhancement for CMT.
patch11:    libxl preparation.
patch12:    tools side work for CAT.
patch13:    xl documentation for CMT/MBM/CAT.
[1] Intel SDM (http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf)
[2] CAT design for Xen (http://lists.xen.org/archives/html/xen-devel/2014-12/msg01382.html)
Chao Peng (13):
  x86: add socket_cpumask
  x86: detect and initialize Intel CAT feature
  x86: maintain COS to CBM mapping for each socket
  x86: add COS information for each domain
  x86: expose CBM length and COS number information
  x86: dynamically get/set CBM for a domain
  x86: add scheduling support for Intel CAT
  xsm: add CAT related xsm policies
  tools/libxl: minor name changes for CMT commands
  tools/libxl: add command to show PSR hardware info
  tools/libxl: introduce some socket helpers
  tools: add tools support for Intel CAT
  docs: add xl-psr.markdown
 docs/man/xl.pod.1                            |  76 ++++-
 docs/misc/xen-command-line.markdown          |  15 +-
 docs/misc/xl-psr.markdown                    | 133 +++++++++
 tools/flask/policy/policy/modules/xen/xen.if |   2 +-
 tools/flask/policy/policy/modules/xen/xen.te |   4 +-
 tools/libxc/include/xenctrl.h                |  15 +
 tools/libxc/xc_psr.c                         |  76 +++++
 tools/libxl/libxl.h                          |  42 +++
 tools/libxl/libxl_internal.h                 |   2 +
 tools/libxl/libxl_psr.c                      | 143 +++++++++-
 tools/libxl/libxl_types.idl                  |  10 +
 tools/libxl/libxl_utils.c                    |  46 +++
 tools/libxl/libxl_utils.h                    |   2 +
 tools/libxl/xl.h                             |   5 +
 tools/libxl/xl_cmdimpl.c                     | 262 ++++++++++++++++-
 tools/libxl/xl_cmdtable.c                    |  27 +-
 xen/arch/x86/domain.c                        |   6 +-
 xen/arch/x86/domctl.c                        |  20 ++
 xen/arch/x86/mpparse.c                       |  12 +
 xen/arch/x86/psr.c                           | 402 ++++++++++++++++++++++++++-
 xen/arch/x86/setup.c                         |   2 +
 xen/arch/x86/smpboot.c                       |  27 +-
 xen/arch/x86/sysctl.c                        |  18 ++
 xen/include/asm-x86/cpufeature.h             |   1 +
 xen/include/asm-x86/domain.h                 |   5 +-
 xen/include/asm-x86/msr-index.h              |   1 +
 xen/include/asm-x86/psr.h                    |  11 +
 xen/include/asm-x86/setup.h                  |   1 +
 xen/include/asm-x86/smp.h                    |   9 +
 xen/include/public/domctl.h                  |  12 +
 xen/include/public/sysctl.h                  |  16 ++
 xen/xsm/flask/hooks.c                        |   6 +
 xen/xsm/flask/policy/access_vectors          |   4 +
 33 files changed, 1380 insertions(+), 33 deletions(-)
 create mode 100644 docs/misc/xl-psr.markdown
-- 
1.9.1
* [PATCH v8 01/13] x86: add socket_cpumask
  2015-05-21  8:41 [PATCH v8 00/13] enable Cache Allocation Technology (CAT) for VMs Chao Peng
@ 2015-05-21  8:41 ` Chao Peng
  2015-05-28 12:38   ` Jan Beulich
  2015-05-21  8:41 ` [PATCH v8 02/13] x86: detect and initialize Intel CAT feature Chao Peng
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Chao Peng @ 2015-05-21  8:41 UTC
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, will.auld, JBeulich, wei.liu2,
	dgdegra
Maintain socket_cpumask, which contains all the HT and core siblings
in the same socket.
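The bookkeeping described above can be modelled outside the hypervisor. The following is only a sketch under stated assumptions: the fixed 8-CPUs-per-socket topology, the 4-socket bound, the plain uint64_t masks and all function names are illustrative and are not what the patch uses (the patch relies on cpu_to_socket() and cpumask_var_t).

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SOCKETS     4u   /* illustrative bound, not taken from the patch */
#define CPUS_PER_SOCKET 8u   /* illustrative topology assumption */

/* One bit per CPU; stands in for the per-socket cpumask_var_t entries. */
static uint64_t socket_mask[MAX_SOCKETS];

/* Hypothetical stand-in for cpu_to_socket(). */
static unsigned int cpu_to_socket_model(unsigned int cpu)
{
    return cpu / CPUS_PER_SOCKET;
}

/* Mirrors the cpumask_set_cpu() call added to set_cpu_sibling_map(). */
static void cpu_up(unsigned int cpu)
{
    assert(cpu < MAX_SOCKETS * CPUS_PER_SOCKET);
    socket_mask[cpu_to_socket_model(cpu)] |= UINT64_C(1) << cpu;
}

/* Mirrors the cpumask_clear_cpu() call added to remove_siblinginfo(). */
static void cpu_down(unsigned int cpu)
{
    assert(cpu < MAX_SOCKETS * CPUS_PER_SOCKET);
    socket_mask[cpu_to_socket_model(cpu)] &= ~(UINT64_C(1) << cpu);
}

/* A socket with no CPUs left is the point where the patch frees its mask. */
static int socket_is_empty(unsigned int socket)
{
    assert(socket < MAX_SOCKETS);
    return socket_mask[socket] == 0;
}
```

In this model, bringing up CPUs 0 and 9 populates sockets 0 and 1; taking CPU 9 down again leaves socket 1 empty, which corresponds to the case where cpu_smpboot_free() releases the per-socket mask.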
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
---
Changes in v8:
* Remove total_cpus and retrofit the algorithm for calculating nr_sockets.
* Allocate per-socket cpumasks on demand.
* socket_to_cpumask => socket_cpumask.
Changes in v7:
* Introduce total_cpus to calculate nr_sockets.
* Minor code sequence improvement in set_cpu_sibling_map.
* Improve comments for nr_sockets.
---
 xen/arch/x86/mpparse.c      | 12 ++++++++++++
 xen/arch/x86/setup.c        |  2 ++
 xen/arch/x86/smpboot.c      | 27 ++++++++++++++++++++++++++-
 xen/include/asm-x86/setup.h |  1 +
 xen/include/asm-x86/smp.h   |  9 +++++++++
 5 files changed, 50 insertions(+), 1 deletion(-)
diff --git a/xen/arch/x86/mpparse.c b/xen/arch/x86/mpparse.c
index 003c56e..fb34492 100644
--- a/xen/arch/x86/mpparse.c
+++ b/xen/arch/x86/mpparse.c
@@ -87,6 +87,18 @@ void __init set_nr_cpu_ids(unsigned int max_cpus)
 #endif
 }
 
+void __init set_nr_sockets(void)
+{
+    unsigned int cpus = bitmap_weight(phys_cpu_present_map.mask,
+                                      boot_cpu_data.x86_max_cores *
+                                      boot_cpu_data.x86_num_siblings);
+
+    if ( cpus == 0 )
+        cpus = 1;
+
+    nr_sockets = DIV_ROUND_UP(num_processors + disabled_cpus, cpus);
+}
+
 /*
  * Intel MP BIOS table parsing routines:
  */
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 2b9787a..a864ca8 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1280,6 +1280,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
     identify_cpu(&boot_cpu_data);
 
+    set_nr_sockets();
+
     if ( cpu_has_fxsr )
         set_in_cr4(X86_CR4_OSFXSR);
     if ( cpu_has_xmm )
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 116c8f8..38431eb 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -59,6 +59,9 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
 cpumask_t cpu_online_map __read_mostly;
 EXPORT_SYMBOL(cpu_online_map);
 
+unsigned int __read_mostly nr_sockets;
+cpumask_var_t *__read_mostly socket_cpumask;
+
 struct cpuinfo_x86 cpu_data[NR_CPUS];
 
 u32 x86_cpu_to_apicid[NR_CPUS] __read_mostly =
@@ -244,6 +247,8 @@ static void set_cpu_sibling_map(int cpu)
 
     cpumask_set_cpu(cpu, &cpu_sibling_setup_map);
 
+    cpumask_set_cpu(cpu, socket_cpumask[cpu_to_socket(cpu)]);
+
     if ( c[cpu].x86_num_siblings > 1 )
     {
         for_each_cpu ( i, &cpu_sibling_setup_map )
@@ -611,7 +616,13 @@ void cpu_exit_clear(unsigned int cpu)
 
 static void cpu_smpboot_free(unsigned int cpu)
 {
-    unsigned int order;
+    unsigned int order, socket = cpu_to_socket(cpu);
+
+    if ( cpumask_empty(socket_cpumask[socket]) )
+    {
+        free_cpumask_var(socket_cpumask[socket]);
+        socket_cpumask[socket] = NULL;
+    }
 
     free_cpumask_var(per_cpu(cpu_sibling_mask, cpu));
     free_cpumask_var(per_cpu(cpu_core_mask, cpu));
@@ -638,6 +649,8 @@ static int cpu_smpboot_alloc(unsigned int cpu)
     unsigned int order, memflags = 0;
     nodeid_t node = cpu_to_node(cpu);
     struct desc_struct *gdt;
+    unsigned int socket = cpu_to_socket(cpu);
+
 
     if ( node != NUMA_NO_NODE )
         memflags = MEMF_node(node);
@@ -667,6 +680,10 @@ static int cpu_smpboot_alloc(unsigned int cpu)
         goto oom;
     memcpy(idt_tables[cpu], idt_table, IDT_ENTRIES * sizeof(idt_entry_t));
 
+    if ( !socket_cpumask[socket] &&
+         !zalloc_cpumask_var(socket_cpumask + socket) )
+        goto oom;
+
     if ( zalloc_cpumask_var(&per_cpu(cpu_sibling_mask, cpu)) &&
          zalloc_cpumask_var(&per_cpu(cpu_core_mask, cpu)) )
         return 0;
@@ -717,6 +734,12 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 
     stack_base[0] = stack_start;
 
+    socket_cpumask = xzalloc_array(cpumask_var_t, nr_sockets);
+    if ( !socket_cpumask )
+        panic("No memory for socket CPU siblings map");
+    if ( !zalloc_cpumask_var(socket_cpumask) )
+        panic("No memory for socket CPU siblings cpumask");
+
     if ( !zalloc_cpumask_var(&per_cpu(cpu_sibling_mask, 0)) ||
          !zalloc_cpumask_var(&per_cpu(cpu_core_mask, 0)) )
         panic("No memory for boot CPU sibling/core maps");
@@ -782,6 +805,8 @@ remove_siblinginfo(int cpu)
     int sibling;
     struct cpuinfo_x86 *c = cpu_data;
 
+    cpumask_clear_cpu(cpu, socket_cpumask[cpu_to_socket(cpu)]);
+
     for_each_cpu ( sibling, per_cpu(cpu_core_mask, cpu) )
     {
         cpumask_clear_cpu(cpu, per_cpu(cpu_core_mask, sibling));
diff --git a/xen/include/asm-x86/setup.h b/xen/include/asm-x86/setup.h
index 08bc23a..597cedf 100644
--- a/xen/include/asm-x86/setup.h
+++ b/xen/include/asm-x86/setup.h
@@ -17,6 +17,7 @@ int centaur_init_cpu(void);
 int transmeta_init_cpu(void);
 
 void set_nr_cpu_ids(unsigned int max_cpus);
+void set_nr_sockets(void);
 
 void numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn);
 void arch_init_memory(void);
diff --git a/xen/include/asm-x86/smp.h b/xen/include/asm-x86/smp.h
index 67518cf..2087a77 100644
--- a/xen/include/asm-x86/smp.h
+++ b/xen/include/asm-x86/smp.h
@@ -58,6 +58,15 @@ int hard_smp_processor_id(void);
 
 void __stop_this_cpu(void);
 
+/*
+ * The value may be greater than the actual socket number in the system and
+ * is considered not to change from the initial startup.
+ */
+extern unsigned int nr_sockets;
+
+/* Representing HT and core siblings in each socket */
+extern cpumask_var_t *socket_cpumask;
+
 #endif /* !__ASSEMBLY__ */
 
 #endif
-- 
1.9.1
* [PATCH v8 02/13] x86: detect and initialize Intel CAT feature
  2015-05-21  8:41 [PATCH v8 00/13] enable Cache Allocation Technology (CAT) for VMs Chao Peng
  2015-05-21  8:41 ` [PATCH v8 01/13] x86: add socket_cpumask Chao Peng
@ 2015-05-21  8:41 ` Chao Peng
  2015-05-28 12:54   ` Jan Beulich
  2015-05-21  8:41 ` [PATCH v8 03/13] x86: maintain COS to CBM mapping for each socket Chao Peng
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Chao Peng @ 2015-05-21  8:41 UTC
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, will.auld, JBeulich, wei.liu2,
	dgdegra
Detect the Intel Cache Allocation Technology (CAT) feature and store the
cpuid information for later use. Currently only L3 cache allocation is
supported. The L3 CAT features may vary among sockets, so per-socket
feature information is stored. The initialization can happen either at
boot time or when CPUs are hot-plugged after boot.
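The register decoding done in cat_cpu_init() can be shown in isolation. This is a minimal sketch, assuming the caller has already read CPUID leaf 0x10, subleaf 1; the struct and function names are hypothetical, while the masks and the clamping against the cos_max option follow the diff below.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two values the patch stores per socket. */
struct cat_l3_info {
    unsigned int cbm_len;   /* number of bits in a capacity bitmask */
    unsigned int cos_max;   /* highest usable COS ID */
};

/*
 * Decode EAX/EDX of CPUID leaf 0x10, subleaf 1.
 * cos_limit plays the role of the cos_max boot option.
 */
static struct cat_l3_info decode_cat_l3(uint32_t eax, uint32_t edx,
                                        unsigned int cos_limit)
{
    struct cat_l3_info info;
    unsigned int hw_cos_max = edx & 0xffff;

    /* The patch refuses to enable CAT when the option is below 1. */
    assert(cos_limit >= 1);

    info.cbm_len = (eax & 0x1f) + 1;   /* EAX[4:0] holds length minus one */
    info.cos_max = hw_cos_max < cos_limit ? hw_cos_max : cos_limit;
    return info;
}
```

With eax = 0x13 and edx = 0xf, this yields a 20-bit CBM and a cos_max of 15 under the default limit of 255; a smaller cos_max option simply lowers the second value.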
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes in v8:
* Remove cat_socket_init_bitmap and rename cat_socket_enable_bitmap.
* Ensure opt_cos_max is not too small.
* Use CPU_DEAD instead of CPU_DYING.
* indentation fix.
Changes in v7:
* Clear the init/enable flag when a socket goes offline.
* Reorder the statements in init_psr_cat.
Changes in v6:
* Introduce cat_socket_init(_enable)_bitmap.
Changes in v5:
* Add cos_max boot option.
Changes in v4:
* check X86_FEATURE_CAT available before doing initialization.
Changes in v3:
* Remove the num_sockets boot option; calculate it at boot time instead.
* Name hardcoded CAT cpuid leaf as PSR_CPUID_LEVEL_CAT.
Changes in v2:
* socket_num => num_sockets and fix several documentation issues.
* refactor boot line parameters parsing into a standalone patch.
* set opt_num_sockets = NR_CPUS when opt_num_sockets > NR_CPUS.
* replace CPU_ONLINE with CPU_STARTING and integrate that into scheduling
  improvement patch.
* reimplement get_max_socket() with cpu_to_socket().
* cbm is still uint64 as there is a path forward for supporting long masks.
---
 docs/misc/xen-command-line.markdown | 15 ++++++-
 xen/arch/x86/psr.c                  | 90 ++++++++++++++++++++++++++++++++++++-
 xen/include/asm-x86/cpufeature.h    |  1 +
 xen/include/asm-x86/psr.h           |  3 ++
 4 files changed, 106 insertions(+), 3 deletions(-)
diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 4889e27..28a09a8 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1137,9 +1137,9 @@ This option can be specified more than once (up to 8 times at present).
 > `= <integer>`
 
 ### psr (Intel)
-> `= List of ( cmt:<boolean> | rmid_max:<integer> )`
+> `= List of ( cmt:<boolean> | rmid_max:<integer> | cat:<boolean> | cos_max:<integer> )`
 
-> Default: `psr=cmt:0,rmid_max:255`
+> Default: `psr=cmt:0,rmid_max:255,cat:0,cos_max:255`
 
 Platform Shared Resource(PSR) Services.  Intel Haswell and later server
 platforms offer information about the sharing of resources.
@@ -1149,6 +1149,12 @@ Monitoring ID(RMID) is used to bind the domain to corresponding shared
 resource.  RMID is a hardware-provided layer of abstraction between software
 and logical processors.
 
+To use the PSR cache allocation service for a certain domain, a capacity
+bitmask (CBM) is used to bind the domain to the corresponding shared resource.
+CBM represents cache capacity and indicates the degree of overlap and isolation
+between domains. In the hypervisor, a Class of Service (COS) ID is allocated
+for each unique CBM.
+
 The following resources are available:
 
 * Cache Monitoring Technology (Haswell and later).  Information regarding the
@@ -1159,6 +1165,11 @@ The following resources are available:
   total/local memory bandwidth. Follow the same options with Cache Monitoring
   Technology.
 
+* Cache Allocation Technology (Broadwell and later).  Information regarding
+  the cache allocation.
+  * `cat` instructs Xen to enable/disable Cache Allocation Technology.
+  * `cos_max` indicates the max value for COS ID.
+
 ### reboot
 > `= t[riple] | k[bd] | a[cpi] | p[ci] | P[ower] | e[fi] | n[o] [, [w]arm | [c]old]`
 
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 2490d22..471296e 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -19,14 +19,25 @@
 #include <asm/psr.h>
 
 #define PSR_CMT        (1<<0)
+#define PSR_CAT        (1<<1)
+
+struct psr_cat_socket_info {
+    unsigned int cbm_len;
+    unsigned int cos_max;
+};
 
 struct psr_assoc {
     uint64_t val;
 };
 
 struct psr_cmt *__read_mostly psr_cmt;
+
+static unsigned long *__read_mostly cat_socket_enable;
+static struct psr_cat_socket_info *__read_mostly cat_socket_info;
+
 static unsigned int __initdata opt_psr;
 static unsigned int __initdata opt_rmid_max = 255;
+static unsigned int opt_cos_max = 255;
 static uint64_t rmid_mask;
 static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
 
@@ -63,10 +74,14 @@ static void __init parse_psr_param(char *s)
             *val_str++ = '\0';
 
         parse_psr_bool(s, val_str, "cmt", PSR_CMT);
+        parse_psr_bool(s, val_str, "cat", PSR_CAT);
 
         if ( val_str && !strcmp(s, "rmid_max") )
             opt_rmid_max = simple_strtoul(val_str, NULL, 0);
 
+        if ( val_str && !strcmp(s, "cos_max") )
+            opt_cos_max = simple_strtoul(val_str, NULL, 0);
+
         s = ss + 1;
     } while ( ss );
 }
@@ -194,16 +209,86 @@ void psr_ctxt_switch_to(struct domain *d)
     }
 }
 
+static void cat_cpu_init(void)
+{
+    unsigned int eax, ebx, ecx, edx;
+    struct psr_cat_socket_info *info;
+    unsigned int socket;
+    unsigned int cpu = smp_processor_id();
+    const struct cpuinfo_x86 *c = cpu_data + cpu;
+
+    if ( !cpu_has(c, X86_FEATURE_CAT) )
+        return;
+
+    socket = cpu_to_socket(cpu);
+    if ( test_bit(socket, cat_socket_enable) )
+        return;
+
+    cpuid_count(PSR_CPUID_LEVEL_CAT, 0, &eax, &ebx, &ecx, &edx);
+    if ( ebx & PSR_RESOURCE_TYPE_L3 )
+    {
+        cpuid_count(PSR_CPUID_LEVEL_CAT, 1, &eax, &ebx, &ecx, &edx);
+        info = cat_socket_info + socket;
+        info->cbm_len = (eax & 0x1f) + 1;
+        info->cos_max = min(opt_cos_max, edx & 0xffff);
+
+        set_bit(socket, cat_socket_enable);
+        printk(XENLOG_INFO "CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
+               socket, info->cos_max, info->cbm_len);
+    }
+}
+
+static void cat_cpu_fini(unsigned int cpu)
+{
+    unsigned int socket = cpu_to_socket(cpu);
+
+    if ( !socket_cpumask[socket] || cpumask_empty(socket_cpumask[socket]) )
+        clear_bit(socket, cat_socket_enable);
+}
+
+static void __init init_psr_cat(void)
+{
+    if ( opt_cos_max < 1 )
+    {
+        printk(XENLOG_INFO "CAT: disabled, cos_max is too small\n");
+        return;
+    }
+
+    cat_socket_enable = xzalloc_array(unsigned long, BITS_TO_LONGS(nr_sockets));
+    cat_socket_info = xzalloc_array(struct psr_cat_socket_info, nr_sockets);
+
+    if ( !cat_socket_enable || !cat_socket_info )
+    {
+        xfree(cat_socket_enable);
+        cat_socket_enable = NULL;
+        xfree(cat_socket_info);
+        cat_socket_info = NULL;
+    }
+}
+
 static void psr_cpu_init(void)
 {
+    if ( cat_socket_info )
+        cat_cpu_init();
+
     psr_assoc_init();
 }
 
+static void psr_cpu_fini(unsigned int cpu)
+{
+    if ( cat_socket_info )
+        cat_cpu_fini(cpu);
+}
+
 static int cpu_callback(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
+    unsigned int cpu = (unsigned long)hcpu;
+
     if ( action == CPU_STARTING )
         psr_cpu_init();
+    else if ( action == CPU_DEAD )
+        psr_cpu_fini(cpu);
 
     return NOTIFY_DONE;
 }
@@ -217,8 +302,11 @@ static int __init psr_presmp_init(void)
     if ( (opt_psr & PSR_CMT) && opt_rmid_max )
         init_psr_cmt(opt_rmid_max);
 
+    if ( opt_psr & PSR_CAT )
+        init_psr_cat();
+
     psr_cpu_init();
-    if ( psr_cmt_enabled() )
+    if ( psr_cmt_enabled() || cat_socket_info )
         register_cpu_notifier(&cpu_nfb);
 
     return 0;
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 7963a3a..8c0f0a6 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -149,6 +149,7 @@
 #define X86_FEATURE_CMT 	(7*32+12) /* Cache Monitoring Technology */
 #define X86_FEATURE_NO_FPU_SEL 	(7*32+13) /* FPU CS/DS stored as zero */
 #define X86_FEATURE_MPX		(7*32+14) /* Memory Protection Extensions */
+#define X86_FEATURE_CAT 	(7*32+15) /* Cache Allocation Technology */
 #define X86_FEATURE_RDSEED	(7*32+18) /* RDSEED instruction */
 #define X86_FEATURE_ADX		(7*32+19) /* ADCX, ADOX instructions */
 #define X86_FEATURE_SMAP	(7*32+20) /* Supervisor Mode Access Prevention */
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index 12d593b..bdda111 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -18,6 +18,9 @@
 
 #include <xen/types.h>
 
+/* CAT cpuid level */
+#define PSR_CPUID_LEVEL_CAT   0x10
+
 /* Resource Type Enumeration */
 #define PSR_RESOURCE_TYPE_L3            0x2
 
-- 
1.9.1
* [PATCH v8 03/13] x86: maintain COS to CBM mapping for each socket
  2015-05-21  8:41 [PATCH v8 00/13] enable Cache Allocation Technology (CAT) for VMs Chao Peng
  2015-05-21  8:41 ` [PATCH v8 01/13] x86: add socket_cpumask Chao Peng
  2015-05-21  8:41 ` [PATCH v8 02/13] x86: detect and initialize Intel CAT feature Chao Peng
@ 2015-05-21  8:41 ` Chao Peng
  2015-05-28 13:17   ` Jan Beulich
  2015-05-21  8:41 ` [PATCH v8 04/13] x86: add COS information for each domain Chao Peng
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Chao Peng @ 2015-05-21  8:41 UTC
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, will.auld, JBeulich, wei.liu2,
	dgdegra
For each socket, a COS to CBM mapping structure is maintained for each
COS. The mapping is indexed by COS and the value is the corresponding
CBM. Different VMs may use the same CBM, so a reference count is used to
indicate whether the CBM is available.
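How such a reference-counted table might be used can be sketched as follows. This patch only introduces the structure; the lookup and release helpers here (cos_get, cos_put, table_init) and the fixed COS_MAX are hypothetical and only illustrate the idea of sharing one COS between users of the same CBM.

```c
#include <assert.h>
#include <stdint.h>

#define COS_MAX 3u   /* illustrative; the real value comes from CPUID */

struct cat_cbm {
    uint64_t cbm;
    unsigned int ref;    /* number of users; 0 means the slot is free */
};

static struct cat_cbm cos_to_cbm[COS_MAX + 1];

/* Reset the table; COS 0 is reserved for the default all-ones CBM. */
static void table_init(unsigned int cbm_len)
{
    unsigned int cos;

    assert(cbm_len >= 1 && cbm_len < 64);
    for (cos = 0; cos <= COS_MAX; cos++) {
        cos_to_cbm[cos].cbm = 0;
        cos_to_cbm[cos].ref = 0;
    }
    cos_to_cbm[0].cbm = (UINT64_C(1) << cbm_len) - 1;
}

/* Return a COS for cbm, sharing an in-use entry if one matches; -1 if full. */
static int cos_get(uint64_t cbm)
{
    unsigned int cos;
    int free_cos = -1;

    if (cbm == cos_to_cbm[0].cbm)
        return 0;                        /* default COS is not refcounted */

    for (cos = 1; cos <= COS_MAX; cos++) {
        if (cos_to_cbm[cos].ref != 0 && cos_to_cbm[cos].cbm == cbm) {
            cos_to_cbm[cos].ref++;
            return (int)cos;
        }
        if (cos_to_cbm[cos].ref == 0 && free_cos < 0)
            free_cos = (int)cos;
    }

    if (free_cos < 0)
        return -1;

    cos_to_cbm[free_cos].cbm = cbm;
    cos_to_cbm[free_cos].ref = 1;
    return free_cos;
}

/* Drop one reference; the slot becomes reusable once ref reaches 0. */
static void cos_put(unsigned int cos)
{
    if (cos == 0)
        return;
    assert(cos <= COS_MAX && cos_to_cbm[cos].ref > 0);
    cos_to_cbm[cos].ref--;
}
```

Two users requesting the same CBM end up on the same COS with ref == 2, and the slot is handed out again only after both have released it.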
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes in v8:
* Move the memory allocation and CAT initialization code to CPU_UP_PREPARE.
* Add memory freeing code in CPU_DEAD path.
Changes in v5:
* rename cos_cbm_map to cos_to_cbm.
---
 xen/arch/x86/psr.c | 98 +++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 78 insertions(+), 20 deletions(-)
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 471296e..8c844cb 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -21,9 +21,15 @@
 #define PSR_CMT        (1<<0)
 #define PSR_CAT        (1<<1)
 
+struct psr_cat_cbm {
+    uint64_t cbm;
+    unsigned int ref;
+};
+
 struct psr_cat_socket_info {
     unsigned int cbm_len;
     unsigned int cos_max;
+    struct psr_cat_cbm *cos_to_cbm;
 };
 
 struct psr_assoc {
@@ -208,34 +214,62 @@ void psr_ctxt_switch_to(struct domain *d)
         psra->val = reg;
     }
 }
-
-static void cat_cpu_init(void)
+static void do_cat_cpu_init(void *arg)
 {
     unsigned int eax, ebx, ecx, edx;
-    struct psr_cat_socket_info *info;
-    unsigned int socket;
-    unsigned int cpu = smp_processor_id();
-    const struct cpuinfo_x86 *c = cpu_data + cpu;
-
-    if ( !cpu_has(c, X86_FEATURE_CAT) )
-        return;
-
-    socket = cpu_to_socket(cpu);
-    if ( test_bit(socket, cat_socket_enable) )
-        return;
+    int *rc = arg;
 
     cpuid_count(PSR_CPUID_LEVEL_CAT, 0, &eax, &ebx, &ecx, &edx);
     if ( ebx & PSR_RESOURCE_TYPE_L3 )
     {
+        struct psr_cat_socket_info *info;
+        unsigned int socket = cpu_to_socket(smp_processor_id());
+
         cpuid_count(PSR_CPUID_LEVEL_CAT, 1, &eax, &ebx, &ecx, &edx);
         info = cat_socket_info + socket;
         info->cbm_len = (eax & 0x1f) + 1;
         info->cos_max = min(opt_cos_max, edx & 0xffff);
 
+        info->cos_to_cbm = xzalloc_array(struct psr_cat_cbm,
+                                         info->cos_max + 1UL);
+        if ( !info->cos_to_cbm )
+        {
+            *rc = -ENOMEM;
+            return;
+        }
+
+        /* cos=0 is reserved as default cbm(all ones). */
+        info->cos_to_cbm[0].cbm = (1ull << info->cbm_len) - 1;
+
         set_bit(socket, cat_socket_enable);
         printk(XENLOG_INFO "CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
                socket, info->cos_max, info->cbm_len);
     }
+
+    *rc = 0;
+
+    return;
+
+}
+
+
+static int cat_cpu_init(unsigned int cpu)
+{
+    int rc;
+    const struct cpuinfo_x86 *c = cpu_data + cpu;
+
+    if ( !cpu_has(c, X86_FEATURE_CAT) )
+        return 0;
+
+    if ( test_bit(cpu_to_socket(cpu), cat_socket_enable) )
+        return 0;
+
+    if ( cpu == smp_processor_id() )
+        do_cat_cpu_init(&rc);
+    else
+        on_selected_cpus(cpumask_of(cpu), do_cat_cpu_init, &rc, 1);
+
+    return rc;
 }
 
 static void cat_cpu_fini(unsigned int cpu)
@@ -243,7 +277,16 @@ static void cat_cpu_fini(unsigned int cpu)
     unsigned int socket = cpu_to_socket(cpu);
 
     if ( !socket_cpumask[socket] || cpumask_empty(socket_cpumask[socket]) )
+    {
+        struct psr_cat_socket_info *info = cat_socket_info + socket;
+
+        if ( info->cos_to_cbm )
+        {
+            xfree(info->cos_to_cbm);
+            info->cos_to_cbm = NULL;
+        }
         clear_bit(socket, cat_socket_enable);
+    }
 }
 
 static void __init init_psr_cat(void)
@@ -266,11 +309,16 @@ static void __init init_psr_cat(void)
     }
 }
 
-static void psr_cpu_init(void)
+static int psr_cpu_prepare(unsigned int cpu)
 {
     if ( cat_socket_info )
-        cat_cpu_init();
+        return cat_cpu_init(cpu);
 
+    return 0;
+}
+
+static void psr_cpu_starting(void)
+{
     psr_assoc_init();
 }
 
@@ -283,14 +331,24 @@ static void psr_cpu_fini(unsigned int cpu)
 static int cpu_callback(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
+    int rc = 0;
     unsigned int cpu = (unsigned long)hcpu;
 
-    if ( action == CPU_STARTING )
-        psr_cpu_init();
-    else if ( action == CPU_DEAD )
+    switch ( action )
+    {
+    case CPU_UP_PREPARE:
+        rc = psr_cpu_prepare(cpu);
+        break;
+    case CPU_STARTING:
+        psr_cpu_starting();
+        break;
+    case CPU_UP_CANCELED:
+    case CPU_DEAD:
         psr_cpu_fini(cpu);
+        break;
+    }
 
-    return NOTIFY_DONE;
+    return !rc ? NOTIFY_DONE : notifier_from_errno(rc);
 }
 
 static struct notifier_block cpu_nfb = {
@@ -305,7 +363,7 @@ static int __init psr_presmp_init(void)
     if ( opt_psr & PSR_CAT )
         init_psr_cat();
 
-    psr_cpu_init();
+    psr_cpu_prepare(0);
     if ( psr_cmt_enabled() || cat_socket_info )
         register_cpu_notifier(&cpu_nfb);
 
-- 
1.9.1
* [PATCH v8 04/13] x86: add COS information for each domain
  2015-05-21  8:41 [PATCH v8 00/13] enable Cache Allocation Technology (CAT) for VMs Chao Peng
                   ` (2 preceding siblings ...)
  2015-05-21  8:41 ` [PATCH v8 03/13] x86: maintain COS to CBM mapping for each socket Chao Peng
@ 2015-05-21  8:41 ` Chao Peng
  2015-05-21  8:41 ` [PATCH v8 05/13] x86: expose CBM length and COS number information Chao Peng
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 38+ messages in thread
From: Chao Peng @ 2015-05-21  8:41 UTC
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, will.auld, JBeulich, wei.liu2,
	dgdegra
In Xen's implementation, the CAT enforcement granularity is per domain.
Because the length of the CBM and the number of COS may differ between
sockets, each domain has a COS ID for each socket. A domain gets COS=0 by
default; at runtime its COS is allocated dynamically when the user
specifies a CBM for the domain.
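The per-domain state amounts to one COS ID per socket, zero-initialised so that every socket starts at the default COS. A minimal user-space sketch, assuming calloc() in place of xzalloc_array() and with hypothetical struct and function names:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the psr_cos_ids field added to arch_domain. */
struct domain_model {
    unsigned int *cos_ids;      /* one COS ID per socket */
    unsigned int nr_sockets;
};

/* Allocate a zeroed array, so every socket starts with COS 0. */
static int domain_init(struct domain_model *d, unsigned int nr_sockets)
{
    assert(nr_sockets > 0);
    d->cos_ids = calloc(nr_sockets, sizeof(*d->cos_ids));
    if (!d->cos_ids)
        return -1;              /* the patch returns -ENOMEM here */
    d->nr_sockets = nr_sockets;
    return 0;
}

/* Release the array; the patch also drops COS references at this point. */
static void domain_free(struct domain_model *d)
{
    free(d->cos_ids);
    d->cos_ids = NULL;
    d->nr_sockets = 0;
}
```

Assigning a non-default COS on one socket leaves the other sockets at COS 0, which is why the IDs are kept per socket rather than per domain.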
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes in v6:
* Add spinlock for cos_to_cbm.
---
 xen/arch/x86/domain.c        |  6 +++++-
 xen/arch/x86/psr.c           | 50 ++++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/domain.h |  5 ++++-
 xen/include/asm-x86/psr.h    |  3 +++
 4 files changed, 62 insertions(+), 2 deletions(-)
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 1f1550e..68ea35d 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -617,6 +617,9 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
         /* 64-bit PV guest by default. */
         d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0;
 
+    if ( (rc = psr_domain_init(d)) != 0 )
+        goto fail;
+
     /* initialize default tsc behavior in case tools don't */
     tsc_set_info(d, TSC_MODE_DEFAULT, 0UL, 0, 0);
     spin_lock_init(&d->arch.vtsc_lock);
@@ -635,6 +638,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
     free_perdomain_mappings(d);
     if ( is_pv_domain(d) )
         free_xenheap_page(d->arch.pv_domain.gdt_ldt_l1tab);
+    psr_domain_free(d);
     return rc;
 }
 
@@ -658,7 +662,7 @@ void arch_domain_destroy(struct domain *d)
     free_xenheap_page(d->shared_info);
     cleanup_domain_irq_mapping(d);
 
-    psr_free_rmid(d);
+    psr_domain_free(d);
 }
 
 void arch_domain_shutdown(struct domain *d)
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 8c844cb..e6d127b 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -30,6 +30,7 @@ struct psr_cat_socket_info {
     unsigned int cbm_len;
     unsigned int cos_max;
     struct psr_cat_cbm *cos_to_cbm;
+    spinlock_t cbm_lock;
 };
 
 struct psr_assoc {
@@ -214,6 +215,53 @@ void psr_ctxt_switch_to(struct domain *d)
         psra->val = reg;
     }
 }
+
+/* Called with domain lock held, no extra lock needed for 'psr_cos_ids' */
+static void psr_free_cos(struct domain *d)
+{
+    unsigned int socket;
+    unsigned int cos;
+    struct psr_cat_socket_info *info;
+
+    if( !d->arch.psr_cos_ids )
+        return;
+
+    for ( socket = 0; socket < nr_sockets; socket++ )
+    {
+        if ( !test_bit(socket, cat_socket_enable) )
+            continue;
+
+        if ( (cos = d->arch.psr_cos_ids[socket]) == 0 )
+            continue;
+
+        info = cat_socket_info + socket;
+        spin_lock(&info->cbm_lock);
+        info->cos_to_cbm[cos].ref--;
+        spin_unlock(&info->cbm_lock);
+    }
+
+    xfree(d->arch.psr_cos_ids);
+    d->arch.psr_cos_ids = NULL;
+}
+
+int psr_domain_init(struct domain *d)
+{
+    if ( cat_socket_info )
+    {
+        d->arch.psr_cos_ids = xzalloc_array(unsigned int, nr_sockets);
+        if ( !d->arch.psr_cos_ids )
+            return -ENOMEM;
+    }
+
+    return 0;
+}
+
+void psr_domain_free(struct domain *d)
+{
+    psr_free_rmid(d);
+    psr_free_cos(d);
+}
+
 static void do_cat_cpu_init(void *arg)
 {
     unsigned int eax, ebx, ecx, edx;
@@ -241,6 +289,8 @@ static void do_cat_cpu_init(void *arg)
         /* cos=0 is reserved as default cbm(all ones). */
         info->cos_to_cbm[0].cbm = (1ull << info->cbm_len) - 1;
 
+        spin_lock_init(&info->cbm_lock);
+
         set_bit(socket, cat_socket_enable);
         printk(XENLOG_INFO "CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
                socket, info->cos_max, info->cbm_len);
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 45b5283..fee50a1 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -333,7 +333,10 @@ struct arch_domain
     struct e820entry *e820;
     unsigned int nr_e820;
 
-    unsigned int psr_rmid; /* RMID assigned to the domain for CMT */
+    /* RMID assigned to the domain for CMT */
+    unsigned int psr_rmid;
+    /* COS assigned to the domain for each socket */
+    unsigned int *psr_cos_ids;
 
     /* Shared page for notifying that explicit PIRQ EOI is required. */
     unsigned long *pirq_eoi_map;
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index bdda111..1023d5f 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -51,6 +51,9 @@ int psr_alloc_rmid(struct domain *d);
 void psr_free_rmid(struct domain *d);
 void psr_ctxt_switch_to(struct domain *d);
 
+int psr_domain_init(struct domain *d);
+void psr_domain_free(struct domain *d);
+
 #endif /* __ASM_PSR_H__ */
 
 /*
-- 
1.9.1
* [PATCH v8 05/13] x86: expose CBM length and COS number information
  2015-05-21  8:41 [PATCH v8 00/13] enable Cache Allocation Technology (CAT) for VMs Chao Peng
                   ` (3 preceding siblings ...)
  2015-05-21  8:41 ` [PATCH v8 04/13] x86: add COS information for each domain Chao Peng
@ 2015-05-21  8:41 ` Chao Peng
  2015-05-28 13:26   ` Jan Beulich
  2015-05-21  8:41 ` [PATCH v8 06/13] x86: dynamically get/set CBM for a domain Chao Peng
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Chao Peng @ 2015-05-21  8:41 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, will.auld, JBeulich, wei.liu2,
	dgdegra
General CAT information, such as the maximum COS and the CBM length, is
exposed to user space through a SYSCTL hypercall to help user space
construct the CBM.
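As an illustration of how a caller might use cbm_len, the sketch below builds and checks a CBM in plain C. It is not part of this patch: make_cbm() and cbm_is_valid() are hypothetical helper names, and the rules they apply (set bits within [0, cbm_len) and contiguous) are the ones the hypervisor checks in psr_check_cbm() later in this series. Rejecting an empty mask is an extra assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): build a CBM with `nbits`
 * contiguous bits starting at bit `start`. Returns 0 if the request does
 * not fit in a mask of `cbm_len` bits.
 */
static uint64_t make_cbm(unsigned int cbm_len, unsigned int start,
                         unsigned int nbits)
{
    uint64_t ones;

    if (cbm_len > 64 || nbits == 0 || start >= cbm_len ||
        nbits > cbm_len - start)
        return 0;

    ones = (nbits == 64) ? ~0ull : (1ull << nbits) - 1;
    return ones << start;
}

/*
 * Hypothetical helper: true if `cbm` is non-empty, has no bit at or above
 * `cbm_len`, and its set bits form one contiguous run.
 */
static bool cbm_is_valid(unsigned int cbm_len, uint64_t cbm)
{
    uint64_t low;

    if (cbm == 0 || cbm_len == 0 || cbm_len > 64)
        return false;
    if (cbm_len < 64 && (cbm >> cbm_len) != 0)
        return false;

    /* Adding the lowest set bit clears a contiguous run completely. */
    low = cbm & (~cbm + 1);
    return ((cbm + low) & cbm) == 0;
}
```

With cbm_len as reported by XEN_SYSCTL_PSR_CAT_get_l3_info, a caller could first build a mask with make_cbm() and then confirm it with cbm_is_valid() before passing it on.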
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes in v7:
* Copyback psr_cat_op only for XEN_SYSCTL_PSR_CAT_get_l3_info.
---
 xen/arch/x86/psr.c          | 32 ++++++++++++++++++++++++++++++++
 xen/arch/x86/sysctl.c       | 18 ++++++++++++++++++
 xen/include/asm-x86/psr.h   |  3 +++
 xen/include/public/sysctl.h | 16 ++++++++++++++++
 4 files changed, 69 insertions(+)
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index e6d127b..8404da0 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -216,6 +216,38 @@ void psr_ctxt_switch_to(struct domain *d)
     }
 }
 
+static int get_cat_socket_info(unsigned int socket,
+                               struct psr_cat_socket_info **info)
+{
+    if ( !cat_socket_info )
+        return -ENODEV;
+
+    if ( socket >= nr_sockets )
+        return -EBADSLT;
+
+    if ( !test_bit(socket, cat_socket_enable) )
+        return -ENOENT;
+
+    *info = cat_socket_info + socket;
+
+    return 0;
+}
+
+int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
+                        uint32_t *cos_max)
+{
+    struct psr_cat_socket_info *info;
+    int ret = get_cat_socket_info(socket, &info);
+
+    if ( ret )
+        return ret;
+
+    *cbm_len = info->cbm_len;
+    *cos_max = info->cos_max;
+
+    return 0;
+}
+
 /* Called with domain lock held, no extra lock needed for 'psr_cos_ids' */
 static void psr_free_cos(struct domain *d)
 {
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index 611a291..f36b52f 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -171,6 +171,24 @@ long arch_do_sysctl(
 
         break;
 
+    case XEN_SYSCTL_psr_cat_op:
+        switch ( sysctl->u.psr_cat_op.cmd )
+        {
+        case XEN_SYSCTL_PSR_CAT_get_l3_info:
+            ret = psr_get_cat_l3_info(sysctl->u.psr_cat_op.target,
+                                      &sysctl->u.psr_cat_op.u.l3_info.cbm_len,
+                                      &sysctl->u.psr_cat_op.u.l3_info.cos_max);
+
+            if ( !ret && __copy_field_to_guest(u_sysctl, sysctl, u.psr_cat_op) )
+                ret = -EFAULT;
+
+            break;
+        default:
+            ret = -EOPNOTSUPP;
+            break;
+        }
+        break;
+
     default:
         ret = -ENOSYS;
         break;
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index 1023d5f..d364e8c 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -51,6 +51,9 @@ int psr_alloc_rmid(struct domain *d);
 void psr_free_rmid(struct domain *d);
 void psr_ctxt_switch_to(struct domain *d);
 
+int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
+                        uint32_t *cos_max);
+
 int psr_domain_init(struct domain *d);
 void psr_domain_free(struct domain *d);
 
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 0cf9277..e7fdc44 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -694,6 +694,20 @@ struct xen_sysctl_pcitopoinfo {
 typedef struct xen_sysctl_pcitopoinfo xen_sysctl_pcitopoinfo_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopoinfo_t);
 
+#define XEN_SYSCTL_PSR_CAT_get_l3_info               0
+struct xen_sysctl_psr_cat_op {
+    uint32_t cmd;       /* IN: XEN_SYSCTL_PSR_CAT_* */
+    uint32_t target;    /* IN: socket to be operated on */
+    union {
+        struct {
+            uint32_t cbm_len;   /* OUT: CBM length */
+            uint32_t cos_max;   /* OUT: Maximum COS */
+        } l3_info;
+    } u;
+};
+typedef struct xen_sysctl_psr_cat_op xen_sysctl_psr_cat_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cat_op_t);
+
 struct xen_sysctl {
     uint32_t cmd;
 #define XEN_SYSCTL_readconsole                    1
@@ -717,6 +731,7 @@ struct xen_sysctl {
 #define XEN_SYSCTL_coverage_op                   20
 #define XEN_SYSCTL_psr_cmt_op                    21
 #define XEN_SYSCTL_pcitopoinfo                   22
+#define XEN_SYSCTL_psr_cat_op                    23
     uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
     union {
         struct xen_sysctl_readconsole       readconsole;
@@ -740,6 +755,7 @@ struct xen_sysctl {
         struct xen_sysctl_scheduler_op      scheduler_op;
         struct xen_sysctl_coverage_op       coverage_op;
         struct xen_sysctl_psr_cmt_op        psr_cmt_op;
+        struct xen_sysctl_psr_cat_op        psr_cat_op;
         uint8_t                             pad[128];
     } u;
 };
-- 
1.9.1
* [PATCH v8 06/13] x86: dynamically get/set CBM for a domain
  2015-05-21  8:41 [PATCH v8 00/13] enable Cache Allocation Technology (CAT) for VMs Chao Peng
                   ` (4 preceding siblings ...)
  2015-05-21  8:41 ` [PATCH v8 05/13] x86: expose CBM length and COS number information Chao Peng
@ 2015-05-21  8:41 ` Chao Peng
  2015-05-21  8:41 ` [PATCH v8 07/13] x86: add scheduling support for Intel CAT Chao Peng
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 38+ messages in thread
From: Chao Peng @ 2015-05-21  8:41 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, will.auld, JBeulich, wei.liu2,
	dgdegra
For CAT, COS is maintained in the hypervisor only, while CBM is exposed to
user space directly to allow getting/setting a domain's cache capacity.
For each specified CBM, the hypervisor either reuses an existing COS that
has the same CBM or allocates a new one if no COS with that CBM is found.
If the allocation fails because not enough COS are available, an error is
returned. Get and set always operate on a specified socket. On
multi-socket systems, the interface may be called several times.
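The selection policy described above can be modelled outside the hypervisor. The following is a simplified sketch, not the code in this patch: struct cos_entry and assign_cos() are hypothetical names, locking and the MSR write are left out, and COS 0 is treated as a default that is never reference counted, which is an assumption of the sketch.

```c
#include <stdint.h>

struct cos_entry {
    uint64_t cbm;      /* capacity bitmask programmed for this COS */
    unsigned int ref;  /* number of domains currently using this COS */
};

/*
 * Simplified model: pick a COS for `cbm` and move one reference from
 * `old_cos` to it. Returns the COS index, or -1 if none is available.
 */
static int assign_cos(struct cos_entry *map, unsigned int cos_max,
                      unsigned int old_cos, uint64_t cbm)
{
    int chosen = -1;
    unsigned int cos;

    /* 1. Prefer a COS (or the default COS 0) that already has this CBM. */
    for (cos = 0; cos <= cos_max; cos++)
        if ((cos == 0 || map[cos].ref > 0) && map[cos].cbm == cbm) {
            chosen = (int)cos;
            break;
        }

    /* 2. Otherwise take the first unused COS, never COS 0. */
    for (cos = 1; chosen < 0 && cos <= cos_max; cos++)
        if (map[cos].ref == 0)
            chosen = (int)cos;

    /* 3. Otherwise reprogram the old COS if this domain is its only user. */
    if (chosen < 0 && old_cos != 0 && map[old_cos].ref == 1)
        chosen = (int)old_cos;

    if (chosen < 0)
        return -1;

    map[chosen].cbm = cbm;
    if ((unsigned int)chosen == old_cos)
        return chosen;

    if (chosen != 0)
        map[chosen].ref++;
    if (old_cos != 0)
        map[old_cos].ref--;

    return chosen;
}
```

Two domains that ask for the same CBM end up sharing one COS in this model, which is what keeps the limited number of COS entries from being exhausted quickly.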
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v8:
* Add likely for 'socket < nr_sockets' in get_socket_cpu.
Changes in v7:
* find => found in psr_set_l3_cbm().
Changes in v6:
* Correct spin_lock scope.
Changes in v5:
* Add spin_lock to protect cbm_map.
---
 xen/arch/x86/domctl.c           |  20 ++++++
 xen/arch/x86/psr.c              | 139 ++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/msr-index.h |   1 +
 xen/include/asm-x86/psr.h       |   2 +
 xen/include/public/domctl.h     |  12 ++++
 5 files changed, 174 insertions(+)
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 20cdccb..0f39eab 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1286,6 +1286,26 @@ long arch_do_domctl(
         }
         break;
 
+    case XEN_DOMCTL_psr_cat_op:
+        switch ( domctl->u.psr_cat_op.cmd )
+        {
+        case XEN_DOMCTL_PSR_CAT_OP_SET_L3_CBM:
+            ret = psr_set_l3_cbm(d, domctl->u.psr_cat_op.target,
+                                 domctl->u.psr_cat_op.data);
+            break;
+
+        case XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM:
+            ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target,
+                                 &domctl->u.psr_cat_op.data);
+            copyback = 1;
+            break;
+
+        default:
+            ret = -EOPNOTSUPP;
+            break;
+        }
+        break;
+
     default:
         ret = iommu_do_domctl(domctl, d, u_domctl);
         break;
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 8404da0..7af84b1 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -48,6 +48,14 @@ static unsigned int opt_cos_max = 255;
 static uint64_t rmid_mask;
 static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
 
+static unsigned int get_socket_cpu(unsigned int socket)
+{
+    if ( likely(socket < nr_sockets) )
+        return cpumask_any(socket_cpumask[socket]);
+
+    return nr_cpu_ids;
+}
+
 static void __init parse_psr_bool(char *s, char *value, char *feature,
                                   unsigned int mask)
 {
@@ -248,6 +256,137 @@ int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
     return 0;
 }
 
+int psr_get_l3_cbm(struct domain *d, unsigned int socket, uint64_t *cbm)
+{
+    unsigned int cos;
+    struct psr_cat_socket_info *info;
+    int ret = get_cat_socket_info(socket, &info);
+
+    if ( ret )
+        return ret;
+
+    cos = d->arch.psr_cos_ids[socket];
+    *cbm = info->cos_to_cbm[cos].cbm;
+
+    return 0;
+}
+
+static bool_t psr_check_cbm(unsigned int cbm_len, uint64_t cbm)
+{
+    unsigned int first_bit, zero_bit;
+
+    /* Set bits should only be in the range [0, cbm_len). */
+    if ( cbm & (~0ull << cbm_len) )
+        return 0;
+
+    first_bit = find_first_bit(&cbm, cbm_len);
+    zero_bit = find_next_zero_bit(&cbm, cbm_len, first_bit);
+
+    /* Set bits should be contiguous. */
+    if ( zero_bit < cbm_len &&
+         find_next_bit(&cbm, cbm_len, zero_bit) < cbm_len )
+        return 0;
+
+    return 1;
+}
+
+struct cos_cbm_info
+{
+    unsigned int cos;
+    uint64_t cbm;
+};
+
+static void do_write_l3_cbm(void *data)
+{
+    struct cos_cbm_info *info = data;
+
+    wrmsrl(MSR_IA32_PSR_L3_MASK(info->cos), info->cbm);
+}
+
+static int write_l3_cbm(unsigned int socket, unsigned int cos, uint64_t cbm)
+{
+    struct cos_cbm_info info = { .cos = cos, .cbm = cbm };
+
+    if ( socket == cpu_to_socket(smp_processor_id()) )
+        do_write_l3_cbm(&info);
+    else
+    {
+        unsigned int cpu = get_socket_cpu(socket);
+
+        if ( cpu >= nr_cpu_ids )
+            return -EBADSLT;
+        on_selected_cpus(cpumask_of(cpu), do_write_l3_cbm, &info, 1);
+    }
+
+    return 0;
+}
+
+int psr_set_l3_cbm(struct domain *d, unsigned int socket, uint64_t cbm)
+{
+    unsigned int old_cos, cos;
+    struct psr_cat_cbm *map, *found = NULL;
+    struct psr_cat_socket_info *info;
+    int ret = get_cat_socket_info(socket, &info);
+
+    if ( ret )
+        return ret;
+
+    if ( !psr_check_cbm(info->cbm_len, cbm) )
+        return -EINVAL;
+
+    old_cos = d->arch.psr_cos_ids[socket];
+    map = info->cos_to_cbm;
+
+    spin_lock(&info->cbm_lock);
+
+    for ( cos = 0; cos <= info->cos_max; cos++ )
+    {
+        /* If still not found, then keep unused one. */
+        if ( !found && cos != 0 && map[cos].ref == 0 )
+            found = map + cos;
+        else if ( map[cos].cbm == cbm )
+        {
+            if ( unlikely(cos == old_cos) )
+            {
+                spin_unlock(&info->cbm_lock);
+                return 0;
+            }
+            found = map + cos;
+            break;
+        }
+    }
+
+    /* If old cos is referred only by the domain, then use it. */
+    if ( !found && map[old_cos].ref == 1 )
+        found = map + old_cos;
+
+    if ( !found )
+    {
+        spin_unlock(&info->cbm_lock);
+        return -EUSERS;
+    }
+
+    cos = found - map;
+    if ( found->cbm != cbm )
+    {
+        ret = write_l3_cbm(socket, cos, cbm);
+        if ( ret )
+        {
+            spin_unlock(&info->cbm_lock);
+            return ret;
+        }
+        found->cbm = cbm;
+    }
+
+    found->ref++;
+    map[old_cos].ref--;
+    spin_unlock(&info->cbm_lock);
+
+    d->arch.psr_cos_ids[socket] = cos;
+
+    return 0;
+}
+
 /* Called with domain lock held, no extra lock needed for 'psr_cos_ids' */
 static void psr_free_cos(struct domain *d)
 {
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 83f2f70..5425f77 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -327,6 +327,7 @@
 #define MSR_IA32_CMT_EVTSEL		0x00000c8d
 #define MSR_IA32_CMT_CTR		0x00000c8e
 #define MSR_IA32_PSR_ASSOC		0x00000c8f
+#define MSR_IA32_PSR_L3_MASK(n)	(0x00000c90 + (n))
 
 /* Intel Model 6 */
 #define MSR_P6_PERFCTR(n)		(0x000000c1 + (n))
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index d364e8c..081750f 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -53,6 +53,8 @@ void psr_ctxt_switch_to(struct domain *d);
 
 int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
                         uint32_t *cos_max);
+int psr_get_l3_cbm(struct domain *d, unsigned int socket, uint64_t *cbm);
+int psr_set_l3_cbm(struct domain *d, unsigned int socket, uint64_t cbm);
 
 int psr_domain_init(struct domain *d);
 void psr_domain_free(struct domain *d);
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 0c0ea4a..8fc6ccf 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -1065,6 +1065,16 @@ struct xen_domctl_monitor_op {
 typedef struct xen_domctl__op xen_domctl_monitor_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_monitor_op_t);
 
+struct xen_domctl_psr_cat_op {
+#define XEN_DOMCTL_PSR_CAT_OP_SET_L3_CBM     0
+#define XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM     1
+    uint32_t cmd;       /* IN: XEN_DOMCTL_PSR_CAT_OP_* */
+    uint32_t target;    /* IN: socket to be operated on */
+    uint64_t data;      /* IN/OUT */
+};
+typedef struct xen_domctl_psr_cat_op xen_domctl_psr_cat_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cat_op_t);
+
 struct xen_domctl {
     uint32_t cmd;
 #define XEN_DOMCTL_createdomain                   1
@@ -1140,6 +1150,7 @@ struct xen_domctl {
 #define XEN_DOMCTL_setvnumainfo                  74
 #define XEN_DOMCTL_psr_cmt_op                    75
 #define XEN_DOMCTL_monitor_op                    77
+#define XEN_DOMCTL_psr_cat_op                    78
 #define XEN_DOMCTL_gdbsx_guestmemio            1000
 #define XEN_DOMCTL_gdbsx_pausevcpu             1001
 #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
@@ -1203,6 +1214,7 @@ struct xen_domctl {
         struct xen_domctl_vnuma             vnuma;
         struct xen_domctl_psr_cmt_op        psr_cmt_op;
         struct xen_domctl_monitor_op        monitor_op;
+        struct xen_domctl_psr_cat_op        psr_cat_op;
         uint8_t                             pad[128];
     } u;
 };
-- 
1.9.1
* [PATCH v8 07/13] x86: add scheduling support for Intel CAT
  2015-05-21  8:41 [PATCH v8 00/13] enable Cache Allocation Technology (CAT) for VMs Chao Peng
                   ` (5 preceding siblings ...)
  2015-05-21  8:41 ` [PATCH v8 06/13] x86: dynamically get/set CBM for a domain Chao Peng
@ 2015-05-21  8:41 ` Chao Peng
  2015-05-21  8:41 ` [PATCH v8 08/13] xsm: add CAT related xsm policies Chao Peng
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 38+ messages in thread
From: Chao Peng @ 2015-05-21  8:41 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, will.auld, JBeulich, wei.liu2,
	dgdegra
On context switch, write the domain's Class of Service (COS) to MSR
IA32_PQR_ASSOC to notify hardware to use the new COS.
For performance reasons, the COS mask for the current CPU is also cached
in a local per-CPU variable.
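The update can be sketched as a small user-space model. This is not the hypervisor code: struct assoc_cache and switch_to() are hypothetical names, and the MSR write is only simulated by updating the cached value. The COS field is assumed to start at bit 32 of the register, as in the diff below.

```c
#include <stdint.h>

/* Replace the COS field (bits 32 and up, limited by cos_mask) in `reg`. */
static uint64_t assoc_set_cos(uint64_t reg, unsigned int cos,
                              uint64_t cos_mask)
{
    return (reg & ~cos_mask) | (((uint64_t)cos << 32) & cos_mask);
}

/* Hypothetical stand-in for the per-CPU cache of the register. */
struct assoc_cache {
    uint64_t val;       /* last value written to the register */
    uint64_t cos_mask;  /* which bits of the register hold the COS */
};

/*
 * Returns 1 if the register would have to be written, 0 if the cached
 * value already matches and the write can be skipped.
 */
static int switch_to(struct assoc_cache *c, unsigned int cos)
{
    uint64_t reg = assoc_set_cos(c->val, cos, c->cos_mask);

    if (reg == c->val)
        return 0;

    c->val = reg; /* real code would write the MSR here */
    return 1;
}
```

Caching the mask and the last written value means that switching between two domains with the same COS on a CPU costs only a comparison, not an MSR write.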
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v5:
* Remove the need to cache socket.
Changes in v2:
* merge common scheduling changes into scheduling improvement patch.
* use readable expr for psra->cos_mask.
---
 xen/arch/x86/psr.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 7af84b1..0e75c77 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -35,6 +35,7 @@ struct psr_cat_socket_info {
 
 struct psr_assoc {
     uint64_t val;
+    uint64_t cos_mask;
 };
 
 struct psr_cmt *__read_mostly psr_cmt;
@@ -200,7 +201,16 @@ static inline void psr_assoc_init(void)
 {
     struct psr_assoc *psra = &this_cpu(psr_assoc);
 
-    if ( psr_cmt_enabled() )
+    if ( cat_socket_info )
+    {
+        unsigned int socket = cpu_to_socket(smp_processor_id());
+
+        if ( test_bit(socket, cat_socket_enable) )
+            psra->cos_mask = ((1ull << get_count_order(
+                             cat_socket_info[socket].cos_max)) - 1) << 32;
+    }
+
+    if ( psr_cmt_enabled() || psra->cos_mask )
         rdmsrl(MSR_IA32_PSR_ASSOC, psra->val);
 }
 
@@ -209,6 +219,12 @@ static inline void psr_assoc_rmid(uint64_t *reg, unsigned int rmid)
     *reg = (*reg & ~rmid_mask) | (rmid & rmid_mask);
 }
 
+static inline void psr_assoc_cos(uint64_t *reg, unsigned int cos,
+                                 uint64_t cos_mask)
+{
+    *reg = (*reg & ~cos_mask) | (((uint64_t)cos << 32) & cos_mask);
+}
+
 void psr_ctxt_switch_to(struct domain *d)
 {
     struct psr_assoc *psra = &this_cpu(psr_assoc);
@@ -217,6 +233,11 @@ void psr_ctxt_switch_to(struct domain *d)
     if ( psr_cmt_enabled() )
         psr_assoc_rmid(&reg, d->arch.psr_rmid);
 
+    if ( psra->cos_mask )
+        psr_assoc_cos(&reg, d->arch.psr_cos_ids ?
+                      d->arch.psr_cos_ids[cpu_to_socket(smp_processor_id())] :
+                      0, psra->cos_mask);
+
     if ( reg != psra->val )
     {
         wrmsrl(MSR_IA32_PSR_ASSOC, reg);
-- 
1.9.1
* [PATCH v8 08/13] xsm: add CAT related xsm policies
  2015-05-21  8:41 [PATCH v8 00/13] enable Cache Allocation Technology (CAT) for VMs Chao Peng
                   ` (6 preceding siblings ...)
  2015-05-21  8:41 ` [PATCH v8 07/13] x86: add scheduling support for Intel CAT Chao Peng
@ 2015-05-21  8:41 ` Chao Peng
  2015-05-21  8:41 ` [PATCH v8 09/13] tools/libxl: minor name changes for CMT commands Chao Peng
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 38+ messages in thread
From: Chao Peng @ 2015-05-21  8:41 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, will.auld, JBeulich, wei.liu2,
	dgdegra
Add xsm policies for Cache Allocation Technology (CAT) related hypercalls
to restrict the functions' visibility to the control domain only.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Acked-by:  Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
 tools/flask/policy/policy/modules/xen/xen.if | 2 +-
 tools/flask/policy/policy/modules/xen/xen.te | 4 +++-
 xen/xsm/flask/hooks.c                        | 6 ++++++
 xen/xsm/flask/policy/access_vectors          | 4 ++++
 4 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/tools/flask/policy/policy/modules/xen/xen.if b/tools/flask/policy/policy/modules/xen/xen.if
index 620d151..aa5eb72 100644
--- a/tools/flask/policy/policy/modules/xen/xen.if
+++ b/tools/flask/policy/policy/modules/xen/xen.if
@@ -51,7 +51,7 @@ define(`create_domain_common', `
 			getaffinity setaffinity setvcpuextstate };
 	allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim
 			set_max_evtchn set_vnumainfo get_vnumainfo cacheflush
-			psr_cmt_op };
+			psr_cmt_op psr_cat_op };
 	allow $1 $2:security check_context;
 	allow $1 $2:shadow enable;
 	allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op updatemp };
diff --git a/tools/flask/policy/policy/modules/xen/xen.te b/tools/flask/policy/policy/modules/xen/xen.te
index e555d11..6dcf953 100644
--- a/tools/flask/policy/policy/modules/xen/xen.te
+++ b/tools/flask/policy/policy/modules/xen/xen.te
@@ -67,6 +67,7 @@ allow dom0_t xen_t:xen {
 allow dom0_t xen_t:xen2 {
     resource_op
     psr_cmt_op
+    psr_cat_op
 };
 allow dom0_t xen_t:mmu memorymap;
 
@@ -80,7 +81,8 @@ allow dom0_t dom0_t:domain {
 	getpodtarget setpodtarget set_misc_info set_virq_handler
 };
 allow dom0_t dom0_t:domain2 {
-	set_cpuid gettsc settsc setscheduler set_max_evtchn set_vnumainfo get_vnumainfo psr_cmt_op
+	set_cpuid gettsc settsc setscheduler set_max_evtchn set_vnumainfo
+	get_vnumainfo psr_cmt_op psr_cat_op
 };
 allow dom0_t dom0_t:resource { add remove };
 
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 11b7453..c08d502 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -737,6 +737,9 @@ static int flask_domctl(struct domain *d, int cmd)
     case XEN_DOMCTL_psr_cmt_op:
         return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__PSR_CMT_OP);
 
+    case XEN_DOMCTL_psr_cat_op:
+        return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__PSR_CAT_OP);
+
     default:
         printk("flask_domctl: Unknown op %d\n", cmd);
         return -EPERM;
@@ -796,6 +799,9 @@ static int flask_sysctl(int cmd)
     case XEN_SYSCTL_psr_cmt_op:
         return avc_current_has_perm(SECINITSID_XEN, SECCLASS_XEN2,
                                     XEN2__PSR_CMT_OP, NULL);
+    case XEN_SYSCTL_psr_cat_op:
+        return avc_current_has_perm(SECINITSID_XEN, SECCLASS_XEN2,
+                                    XEN2__PSR_CAT_OP, NULL);
 
     default:
         printk("flask_sysctl: Unknown op %d\n", cmd);
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index ea556df..939bb1a 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -85,6 +85,8 @@ class xen2
     resource_op
 # XEN_SYSCTL_psr_cmt_op
     psr_cmt_op
+# XEN_SYSCTL_psr_cat_op
+    psr_cat_op
 }
 
 # Classes domain and domain2 consist of operations that a domain performs on
@@ -230,6 +232,8 @@ class domain2
     mem_paging
 # XENMEM_sharing_op
     mem_sharing
+# XEN_DOMCTL_psr_cat_op
+    psr_cat_op
 }
 
 # Similar to class domain, but primarily contains domctls related to HVM domains
-- 
1.9.1
* [PATCH v8 09/13] tools/libxl: minor name changes for CMT commands
  2015-05-21  8:41 [PATCH v8 00/13] enable Cache Allocation Technology (CAT) for VMs Chao Peng
                   ` (7 preceding siblings ...)
  2015-05-21  8:41 ` [PATCH v8 08/13] xsm: add CAT related xsm policies Chao Peng
@ 2015-05-21  8:41 ` Chao Peng
  2015-05-21  8:41 ` [PATCH v8 10/13] tools/libxl: add command to show PSR hardware info Chao Peng
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 38+ messages in thread
From: Chao Peng @ 2015-05-21  8:41 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, will.auld, JBeulich, wei.liu2,
	dgdegra
Use "-" instead of "_" for monitor types.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/xl_cmdimpl.c  | 6 +++---
 tools/libxl/xl_cmdtable.c | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 373aa37..b58a242 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -8261,11 +8261,11 @@ int main_psr_cmt_show(int argc, char **argv)
         /* No options */
     }
 
-    if (!strcmp(argv[optind], "cache_occupancy"))
+    if (!strcmp(argv[optind], "cache-occupancy"))
         type = LIBXL_PSR_CMT_TYPE_CACHE_OCCUPANCY;
-    else if (!strcmp(argv[optind], "total_mem_bandwidth"))
+    else if (!strcmp(argv[optind], "total-mem-bandwidth"))
         type = LIBXL_PSR_CMT_TYPE_TOTAL_MEM_COUNT;
-    else if (!strcmp(argv[optind], "local_mem_bandwidth"))
+    else if (!strcmp(argv[optind], "local-mem-bandwidth"))
         type = LIBXL_PSR_CMT_TYPE_LOCAL_MEM_COUNT;
     else {
         help("psr-cmt-show");
diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c
index 7f4759b..12899d1 100644
--- a/tools/libxl/xl_cmdtable.c
+++ b/tools/libxl/xl_cmdtable.c
@@ -540,9 +540,9 @@ struct cmd_spec cmd_table[] = {
       "Show Cache Monitoring Technology information",
       "<PSR-CMT-Type> <Domain>",
       "Available monitor types:\n"
-      "\"cache_occupancy\":         Show L3 cache occupancy(KB)\n"
-      "\"total_mem_bandwidth\":     Show total memory bandwidth(KB/s)\n"
-      "\"local_mem_bandwidth\":     Show local memory bandwidth(KB/s)\n",
+      "\"cache-occupancy\":         Show L3 cache occupancy(KB)\n"
+      "\"total-mem-bandwidth\":     Show total memory bandwidth(KB/s)\n"
+      "\"local-mem-bandwidth\":     Show local memory bandwidth(KB/s)\n",
     },
 #endif
 };
-- 
1.9.1
* [PATCH v8 10/13] tools/libxl: add command to show PSR hardware info
  2015-05-21  8:41 [PATCH v8 00/13] enable Cache Allocation Technology (CAT) for VMs Chao Peng
                   ` (8 preceding siblings ...)
  2015-05-21  8:41 ` [PATCH v8 09/13] tools/libxl: minor name changes for CMT commands Chao Peng
@ 2015-05-21  8:41 ` Chao Peng
  2015-05-21  8:41 ` [PATCH v8 11/13] tools/libxl: introduce some socket helpers Chao Peng
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 38+ messages in thread
From: Chao Peng @ 2015-05-21  8:41 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, will.auld, JBeulich, wei.liu2,
	dgdegra
Add a dedicated command to show hardware information.
[root@vmm-psr]xl psr-hwinfo
Cache Monitoring Technology (CMT):
Enabled         : 1
Total RMID      : 63
Supported monitor types:
cache-occupancy
total-mem-bandwidth
local-mem-bandwidth
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v6:
* Add SWITCH_FOREACH_OPT to make '-h' work.
---
 docs/man/xl.pod.1         |  4 ++++
 tools/libxl/xl.h          |  1 +
 tools/libxl/xl_cmdimpl.c  | 41 +++++++++++++++++++++++++++++++++++++++++
 tools/libxl/xl_cmdtable.c |  5 +++++
 4 files changed, 51 insertions(+)
diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index 02bf531..b221882 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -1502,6 +1502,10 @@ for any of these monitoring types.
 
 =over 4
 
+=item B<psr-hwinfo>
+
+Show CMT hardware information.
+
 =item B<psr-cmt-attach> [I<domain-id>]
 
 attach: Attach the platform shared resource monitoring service to a domain.
diff --git a/tools/libxl/xl.h b/tools/libxl/xl.h
index 5bc138c..7b56449 100644
--- a/tools/libxl/xl.h
+++ b/tools/libxl/xl.h
@@ -113,6 +113,7 @@ int main_remus(int argc, char **argv);
 #endif
 int main_devd(int argc, char **argv);
 #ifdef LIBXL_HAVE_PSR_CMT
+int main_psr_hwinfo(int argc, char **argv);
 int main_psr_cmt_attach(int argc, char **argv);
 int main_psr_cmt_detach(int argc, char **argv);
 int main_psr_cmt_show(int argc, char **argv);
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index b58a242..b3c4ec0 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -8055,6 +8055,36 @@ out:
 }
 
 #ifdef LIBXL_HAVE_PSR_CMT
+static int psr_cmt_hwinfo(void)
+{
+    int rc;
+    int enabled;
+    uint32_t total_rmid;
+
+    printf("Cache Monitoring Technology (CMT):\n");
+
+    enabled = libxl_psr_cmt_enabled(ctx);
+    printf("%-16s: %s\n", "Enabled", enabled ? "1" : "0");
+    if (!enabled)
+        return 0;
+
+    rc = libxl_psr_cmt_get_total_rmid(ctx, &total_rmid);
+    if (rc) {
+        fprintf(stderr, "Failed to get max RMID value\n");
+        return rc;
+    }
+    printf("%-16s: %u\n", "Total RMID", total_rmid);
+
+    printf("Supported monitor types:\n");
+    if (libxl_psr_cmt_type_supported(ctx, LIBXL_PSR_CMT_TYPE_CACHE_OCCUPANCY))
+        printf("cache-occupancy\n");
+    if (libxl_psr_cmt_type_supported(ctx, LIBXL_PSR_CMT_TYPE_TOTAL_MEM_COUNT))
+        printf("total-mem-bandwidth\n");
+    if (libxl_psr_cmt_type_supported(ctx, LIBXL_PSR_CMT_TYPE_LOCAL_MEM_COUNT))
+        printf("local-mem-bandwidth\n");
+
+    return rc;
+}
 
 #define MBM_SAMPLE_RETRY_MAX 4
 static int psr_cmt_get_mem_bandwidth(uint32_t domid,
@@ -8221,6 +8251,17 @@ static int psr_cmt_show(libxl_psr_cmt_type type, uint32_t domid)
     return 0;
 }
 
+int main_psr_hwinfo(int argc, char **argv)
+{
+    int opt;
+
+    SWITCH_FOREACH_OPT(opt, "", NULL, "psr-hwinfo", 0) {
+        /* No options */
+    }
+
+    return psr_cmt_hwinfo();
+}
+
 int main_psr_cmt_attach(int argc, char **argv)
 {
     uint32_t domid;
diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c
index 12899d1..77a37c5 100644
--- a/tools/libxl/xl_cmdtable.c
+++ b/tools/libxl/xl_cmdtable.c
@@ -525,6 +525,11 @@ struct cmd_spec cmd_table[] = {
       "-F                      Run in the foreground",
     },
 #ifdef LIBXL_HAVE_PSR_CMT
+    { "psr-hwinfo",
+      &main_psr_hwinfo, 0, 1,
+      "Show hardware information for Platform Shared Resource",
+      "",
+    },
     { "psr-cmt-attach",
       &main_psr_cmt_attach, 0, 1,
       "Attach Cache Monitoring Technology service to a domain",
-- 
1.9.1
* [PATCH v8 11/13] tools/libxl: introduce some socket helpers
  2015-05-21  8:41 [PATCH v8 00/13] enable Cache Allocation Technology (CAT) for VMs Chao Peng
                   ` (9 preceding siblings ...)
  2015-05-21  8:41 ` [PATCH v8 10/13] tools/libxl: add command to show PSR hardware info Chao Peng
@ 2015-05-21  8:41 ` Chao Peng
  2015-05-21  8:41 ` [PATCH v8 12/13] tools: add tools support for Intel CAT Chao Peng
  2015-05-21  8:41 ` [PATCH v8 13/13] docs: add xl-psr.markdown Chao Peng
  12 siblings, 0 replies; 38+ messages in thread
From: Chao Peng @ 2015-05-21  8:41 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, will.auld, JBeulich, wei.liu2,
	dgdegra
Add libxl_socket_bitmap_alloc() to allow allocating a socket-specific
libxl_bitmap (as is already done for the cpu/node bitmaps).
The internal function libxl__count_physical_sockets() is introduced
alongside it to get the socket count when the size of the bitmap is not
specified.
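The socket count is derived from the physinfo topology fields by plain integer division. A minimal stand-alone sketch of that arithmetic follows; count_sockets() is a hypothetical name, and the zero-divisor guard is an addition of the sketch that the patch itself does not contain.

```c
/*
 * Hypothetical stand-alone version of the arithmetic: returns 0 and
 * stores the socket count, or -1 if a divisor is zero.
 */
static int count_sockets(unsigned int nr_cpus,
                         unsigned int threads_per_core,
                         unsigned int cores_per_socket,
                         unsigned int *sockets)
{
    if (threads_per_core == 0 || cores_per_socket == 0)
        return -1;

    *sockets = nr_cpus / threads_per_core / cores_per_socket;
    return 0;
}
```

For example, 32 logical CPUs with 2 threads per core and 8 cores per socket give 2 sockets. The result is only as reliable as the reported topology, so it is best treated as an estimate on systems with asymmetric or partially offlined sockets.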
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v7:
* Broadcast LIBXL_HAVE_SOCKET_BITMAP_ALLOC
---
 tools/libxl/libxl.h          |  7 +++++++
 tools/libxl/libxl_internal.h |  2 ++
 tools/libxl/libxl_utils.c    | 46 ++++++++++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_utils.h    |  2 ++
 4 files changed, 57 insertions(+)
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 2ed7194..6ce93f5 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -768,6 +768,13 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src);
  */
 #define LIBXL_HAVE_PCITOPOLOGY 1
 
+/*
+ * LIBXL_HAVE_SOCKET_BITMAP_ALLOC
+ *
+ * If this is defined, then libxl_socket_bitmap_alloc exists.
+ */
+#define LIBXL_HAVE_SOCKET_BITMAP_ALLOC 1
+
 typedef char **libxl_string_list;
 void libxl_string_list_dispose(libxl_string_list *sl);
 int libxl_string_list_length(const libxl_string_list *sl);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 8aaa1ad..89ca694 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3697,6 +3697,8 @@ static inline void libxl__update_config_vtpm(libxl__gc *gc,
  */
 void libxl__bitmap_copy_best_effort(libxl__gc *gc, libxl_bitmap *dptr,
                                     const libxl_bitmap *sptr);
+
+int libxl__count_physical_sockets(libxl__gc *gc, int *sockets);
 #endif
 
 /*
diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
index f6be2d7..bfc9699 100644
--- a/tools/libxl/libxl_utils.c
+++ b/tools/libxl/libxl_utils.c
@@ -840,6 +840,52 @@ int libxl_node_bitmap_alloc(libxl_ctx *ctx, libxl_bitmap *nodemap,
     return rc;
 }
 
+int libxl__count_physical_sockets(libxl__gc *gc, int *sockets)
+{
+    int rc;
+    libxl_physinfo info;
+
+    libxl_physinfo_init(&info);
+
+    rc = libxl_get_physinfo(CTX, &info);
+    if (rc)
+        return rc;
+
+    *sockets = info.nr_cpus / info.threads_per_core
+                            / info.cores_per_socket;
+
+    libxl_physinfo_dispose(&info);
+    return 0;
+}
+
+int libxl_socket_bitmap_alloc(libxl_ctx *ctx, libxl_bitmap *socketmap,
+                              int max_sockets)
+{
+    GC_INIT(ctx);
+    int rc = 0;
+
+    if (max_sockets < 0) {
+        rc = ERROR_INVAL;
+        LOG(ERROR, "invalid number of sockets provided");
+        goto out;
+    }
+
+    if (max_sockets == 0) {
+        rc = libxl__count_physical_sockets(gc, &max_sockets);
+        if (rc) {
+            LOGE(ERROR, "failed to get system socket count");
+            goto out;
+        }
+    }
+    /* This can't fail: no need to check and log */
+    libxl_bitmap_alloc(ctx, socketmap, max_sockets);
+
+ out:
+    GC_FREE;
+    return rc;
+
+}
+
 int libxl_nodemap_to_cpumap(libxl_ctx *ctx,
                             const libxl_bitmap *nodemap,
                             libxl_bitmap *cpumap)
diff --git a/tools/libxl/libxl_utils.h b/tools/libxl/libxl_utils.h
index 1c1761d..82340ec 100644
--- a/tools/libxl/libxl_utils.h
+++ b/tools/libxl/libxl_utils.h
@@ -141,6 +141,8 @@ static inline int libxl_bitmap_equal(const libxl_bitmap *ba,
 int libxl_cpu_bitmap_alloc(libxl_ctx *ctx, libxl_bitmap *cpumap, int max_cpus);
 int libxl_node_bitmap_alloc(libxl_ctx *ctx, libxl_bitmap *nodemap,
                             int max_nodes);
+int libxl_socket_bitmap_alloc(libxl_ctx *ctx, libxl_bitmap *socketmap,
+                              int max_sockets);
 
 /* Populate cpumap with the cpus spanned by the nodes in nodemap */
 int libxl_nodemap_to_cpumap(libxl_ctx *ctx,
-- 
1.9.1
^ permalink raw reply related	[flat|nested] 38+ messages in thread
* [PATCH v8 12/13] tools: add tools support for Intel CAT
  2015-05-21  8:41 [PATCH v8 00/13] enable Cache Allocation Technology (CAT) for VMs Chao Peng
                   ` (10 preceding siblings ...)
  2015-05-21  8:41 ` [PATCH v8 11/13] tools/libxl: introduce some socket helpers Chao Peng
@ 2015-05-21  8:41 ` Chao Peng
  2015-05-21  8:41 ` [PATCH v8 13/13] docs: add xl-psr.markdown Chao Peng
  12 siblings, 0 replies; 38+ messages in thread
From: Chao Peng @ 2015-05-21  8:41 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, will.auld, JBeulich, wei.liu2,
	dgdegra
These are the xc/xl changes to support Intel Cache Allocation
Technology (CAT).
'xl psr-hwinfo' is updated to show CAT info, and two new commands
for CAT are introduced:
- xl psr-cat-cbm-set [-s socket] <domain> <cbm>
  Set the cache capacity bitmask (CBM) for a domain.
- xl psr-cat-show [domain]
  Show CAT domain information.
Examples:
[root@vmm-psr vmm]# xl psr-hwinfo --cat
Cache Allocation Technology (CAT):
Socket ID       : 0
L3 Cache        : 12288KB
Maximum COS     : 15
CBM length      : 12
Default CBM     : 0xfff
[root@vmm-psr vmm]# xl psr-cat-cbm-set 0 0xff
[root@vmm-psr vmm]# xl psr-cat-show
Socket ID       : 0
L3 Cache        : 12288KB
Default CBM     : 0xfff
   ID                     NAME             CBM
    0                 Domain-0            0xff
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v7:
* Add PSR head1 level section and change CMT/CAT as its subsections for xl man page.
* Other minor document changes.
Changes in v6:
* Merge xl psr-cmt/cat-hwinfo => xl psr-hwinfo.
* Add function header to explain the 'target' parameter.
* Use bitmap instead of TARGETS_ALL.
* Remove the need to store the return value from libxc.
* Minor document/commit msg adjustment.
Changes in v5:
* Add psr-cat-hwinfo.
* Add libxl_psr_cat_info_list_free.
* malloc => libxl__malloc
* Other comments from Ian/Wei.
Changes in v4:
* Add example output in commit message.
* Make libxl__count_physical_sockets private to libxl_psr.c.
* Set errno in several error cases.
* Change libxl_psr_cat_get_l3_info to return all sockets information.
* Remove unused libxl_domain_info call.
Changes in v3:
* Add manpage.
* libxl_psr_cat_set/get_domain_data => libxl_psr_cat_set/get_cbm.
* Move libxl_count_physical_sockets into separate patch.
* Support LIBXL_PSR_TARGET_ALL for libxl_psr_cat_set_cbm.
* Clean up the print codes.
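
Note for reviewers (illustration only, not part of the patch): the -s
option of psr-cat-cbm-set takes a comma-separated socket list in which each
entry is a single socket or an inclusive range. The sketch below shows one
way such a list can be expanded into a bitmask. `parse_socket_list()` is a
hypothetical helper written for this note; it is neither xl's parse_range()
nor a libxl_bitmap, and the 64-socket limit is an assumption of the sketch.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of xl): expand a socket list such as
 * "0,2-3" into a bitmask with one bit per socket. Ranges are inclusive.
 * Returns 0 on success, -1 on malformed input or a socket number >= 64.
 * An empty string yields an empty mask.
 */
static int parse_socket_list(const char *s, uint64_t *mask)
{
    *mask = 0;
    while (*s) {
        char *end;
        unsigned long lo, hi;

        lo = hi = strtoul(s, &end, 10);
        if (end == s)
            return -1;              /* no digits where a number was expected */
        if (*end == '-') {
            s = end + 1;
            hi = strtoul(s, &end, 10);
            if (end == s)
                return -1;          /* range with no upper bound */
        }
        if (lo > hi || hi >= 64)
            return -1;
        for (unsigned long i = lo; i <= hi; i++)   /* inclusive upper bound */
            *mask |= UINT64_C(1) << i;
        if (*end == ',')
            end++;                  /* move on to the next list entry */
        else if (*end != '\0')
            return -1;              /* unexpected trailing character */
        s = end;
    }
    return 0;
}
```

For "0,2-3" the resulting mask has bits 0, 2 and 3 set. A single entry
such as "1" must set exactly one bit, which is why the upper bound of each
range has to be treated as inclusive.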
---
 docs/man/xl.pod.1             |  75 +++++++++++--
 tools/libxc/include/xenctrl.h |  15 +++
 tools/libxc/xc_psr.c          |  76 ++++++++++++++
 tools/libxl/libxl.h           |  35 +++++++
 tools/libxl/libxl_psr.c       | 143 +++++++++++++++++++++++--
 tools/libxl/libxl_types.idl   |  10 ++
 tools/libxl/xl.h              |   4 +
 tools/libxl/xl_cmdimpl.c      | 237 ++++++++++++++++++++++++++++++++++++++++--
 tools/libxl/xl_cmdtable.c     |  18 +++-
 9 files changed, 584 insertions(+), 29 deletions(-)
diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index b221882..4dd5388 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -1484,28 +1484,52 @@ policy. Loading new security policy will reset runtime changes to device labels.
 
 =back
 
-=head1 CACHE MONITORING TECHNOLOGY
+=head1 PLATFORM SHARED RESOURCE MONITORING/CONTROL
+
+Intel Haswell and later server platforms offer shared resource monitoring
+and control technologies. The availability of these technologies and the
+hardware capabilities can be shown with B<psr-hwinfo>.
+
+=over 4
+
+=item B<psr-hwinfo> [I<OPTIONS>]
+
+Show Platform Shared Resource (PSR) hardware information.
+
+B<OPTIONS>
+
+=over 4
+
+=item B<-m>, B<--cmt>
+
+Show Cache Monitoring Technology (CMT) hardware information.
+
+=item B<-a>, B<--cat>
+
+Show Cache Allocation Technology (CAT) hardware information.
+
+=back
+
+=back
+
+=head2 CACHE MONITORING TECHNOLOGY
 
 Intel Haswell and later server platforms offer monitoring capability in each
 logical processor to measure specific platform shared resource metric, for
-example, L3 cache occupancy. In Xen implementation, the monitoring granularity
-is domain level. To monitor a specific domain, just attach the domain id with
-the monitoring service. When the domain doesn't need to be monitored any more,
-detach the domain id from the monitoring service.
+example, L3 cache occupancy. In the Xen implementation, the monitoring
+granularity is domain level. To monitor a specific domain, just attach the
+domain id with the monitoring service. When the domain doesn't need to be
+monitored any more, detach the domain id from the monitoring service.
 
 Intel Broadwell and later server platforms also offer total/local memory
 bandwidth monitoring. Xen supports per-domain monitoring for these two
 additional monitoring types. Both memory bandwidth monitoring and L3 cache
 occupancy monitoring share the same set of underlying monitoring service. Once
-a domain is attached to the monitoring service, monitoring data can be showed
+a domain is attached to the monitoring service, monitoring data can be shown
 for any of these monitoring types.
 
 =over 4
 
-=item B<psr-hwinfo>
-
-Show CMT hardware information.
-
 =item B<psr-cmt-attach> [I<domain-id>]
 
 attach: Attach the platform shared resource monitoring service to a domain.
@@ -1524,6 +1548,37 @@ monitor types are:
 
 =back
 
+=head2 CACHE ALLOCATION TECHNOLOGY
+
+Intel Broadwell and later server platforms offer capabilities to configure and
+make use of the Cache Allocation Technology (CAT) mechanisms, which enable more
+cache resources (i.e. L3 cache) to be made available for high priority
+applications. In the Xen implementation, CAT is used to control cache allocation
+on VM basis. To enforce cache on a specific domain, just set capacity bitmasks
+(CBM) for the domain.
+
+=over 4
+
+=item B<psr-cat-cbm-set> [I<OPTIONS>] I<domain-id> I<cbm>
+
+Set cache capacity bitmasks(CBM) for a domain.
+
+B<OPTIONS>
+
+=over 4
+
+=item B<-s SOCKET>, B<--socket=SOCKET>
+
+Specify the socket to process, otherwise all sockets are processed.
+
+=back
+
+=item B<psr-cat-show> [I<domain-id>]
+
+Show CAT settings for a certain domain or all domains.
+
+=back
+
 =head1 TO BE DOCUMENTED
 
 We need better documentation for:
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 09a7450..8d9c005 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2772,6 +2772,12 @@ enum xc_psr_cmt_type {
     XC_PSR_CMT_LOCAL_MEM_COUNT,
 };
 typedef enum xc_psr_cmt_type xc_psr_cmt_type;
+
+enum xc_psr_cat_type {
+    XC_PSR_CAT_L3_CBM = 1,
+};
+typedef enum xc_psr_cat_type xc_psr_cat_type;
+
 int xc_psr_cmt_attach(xc_interface *xch, uint32_t domid);
 int xc_psr_cmt_detach(xc_interface *xch, uint32_t domid);
 int xc_psr_cmt_get_domain_rmid(xc_interface *xch, uint32_t domid,
@@ -2786,6 +2792,15 @@ int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid, uint32_t cpu,
                         uint32_t psr_cmt_type, uint64_t *monitor_data,
                         uint64_t *tsc);
 int xc_psr_cmt_enabled(xc_interface *xch);
+
+int xc_psr_cat_set_domain_data(xc_interface *xch, uint32_t domid,
+                               xc_psr_cat_type type, uint32_t target,
+                               uint64_t data);
+int xc_psr_cat_get_domain_data(xc_interface *xch, uint32_t domid,
+                               xc_psr_cat_type type, uint32_t target,
+                               uint64_t *data);
+int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t socket,
+                           uint32_t *cos_max, uint32_t *cbm_len);
 #endif
 
 #endif /* XENCTRL_H */
diff --git a/tools/libxc/xc_psr.c b/tools/libxc/xc_psr.c
index e367a80..d8b3a51 100644
--- a/tools/libxc/xc_psr.c
+++ b/tools/libxc/xc_psr.c
@@ -248,6 +248,82 @@ int xc_psr_cmt_enabled(xc_interface *xch)
 
     return 0;
 }
+int xc_psr_cat_set_domain_data(xc_interface *xch, uint32_t domid,
+                               xc_psr_cat_type type, uint32_t target,
+                               uint64_t data)
+{
+    DECLARE_DOMCTL;
+    uint32_t cmd;
+
+    switch ( type )
+    {
+    case XC_PSR_CAT_L3_CBM:
+        cmd = XEN_DOMCTL_PSR_CAT_OP_SET_L3_CBM;
+        break;
+    default:
+        errno = EINVAL;
+        return -1;
+    }
+
+    domctl.cmd = XEN_DOMCTL_psr_cat_op;
+    domctl.domain = (domid_t)domid;
+    domctl.u.psr_cat_op.cmd = cmd;
+    domctl.u.psr_cat_op.target = target;
+    domctl.u.psr_cat_op.data = data;
+
+    return do_domctl(xch, &domctl);
+}
+
+int xc_psr_cat_get_domain_data(xc_interface *xch, uint32_t domid,
+                               xc_psr_cat_type type, uint32_t target,
+                               uint64_t *data)
+{
+    int rc;
+    DECLARE_DOMCTL;
+    uint32_t cmd;
+
+    switch ( type )
+    {
+    case XC_PSR_CAT_L3_CBM:
+        cmd = XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM;
+        break;
+    default:
+        errno = EINVAL;
+        return -1;
+    }
+
+    domctl.cmd = XEN_DOMCTL_psr_cat_op;
+    domctl.domain = (domid_t)domid;
+    domctl.u.psr_cat_op.cmd = cmd;
+    domctl.u.psr_cat_op.target = target;
+
+    rc = do_domctl(xch, &domctl);
+
+    if ( !rc )
+        *data = domctl.u.psr_cat_op.data;
+
+    return rc;
+}
+
+int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t socket,
+                           uint32_t *cos_max, uint32_t *cbm_len)
+{
+    int rc;
+    DECLARE_SYSCTL;
+
+    sysctl.cmd = XEN_SYSCTL_psr_cat_op;
+    sysctl.u.psr_cat_op.cmd = XEN_SYSCTL_PSR_CAT_get_l3_info;
+    sysctl.u.psr_cat_op.target = socket;
+
+    rc = xc_sysctl(xch, &sysctl);
+    if ( !rc )
+    {
+        *cos_max = sysctl.u.psr_cat_op.u.l3_info.cos_max;
+        *cbm_len = sysctl.u.psr_cat_op.u.l3_info.cbm_len;
+    }
+
+    return rc;
+}
 
 /*
  * Local variables:
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 6ce93f5..4d97300 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -758,6 +758,13 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src);
  * If this is defined, the Memory Bandwidth Monitoring feature is supported.
  */
 #define LIBXL_HAVE_PSR_MBM 1
+
+/*
+ * LIBXL_HAVE_PSR_CAT
+ *
+ * If this is defined, the Cache Allocation Technology feature is supported.
+ */
+#define LIBXL_HAVE_PSR_CAT 1
 #endif
 
 /*
@@ -1578,6 +1585,34 @@ int libxl_psr_cmt_get_sample(libxl_ctx *ctx,
                              uint64_t *tsc_r);
 #endif
 
+#ifdef LIBXL_HAVE_PSR_CAT
+/*
+ * Function to set a domain's cbm. It operates on a single or multiple
+ * target(s) defined in 'target_map'. The definition of 'target_map' is
+ * related to 'type':
+ * 'L3_CBM': 'target_map' specifies all the sockets to be operated on.
+ */
+int libxl_psr_cat_set_cbm(libxl_ctx *ctx, uint32_t domid,
+                          libxl_psr_cbm_type type, libxl_bitmap *target_map,
+                          uint64_t cbm);
+/*
+ * Function to get a domain's cbm. It operates on a single 'target'.
+ * The definition of 'target' is related to 'type':
+ * 'L3_CBM': 'target' specifies which socket to be operated on.
+ */
+int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t domid,
+                          libxl_psr_cbm_type type, uint32_t target,
+                          uint64_t *cbm_r);
+
+/*
+ * On success, the function returns an array of elements in 'info',
+ * and the length in 'nr'.
+ */
+int libxl_psr_cat_get_l3_info(libxl_ctx *ctx, libxl_psr_cat_info **info,
+                              int *nr);
+void libxl_psr_cat_info_list_free(libxl_psr_cat_info *list, int nr);
+#endif
+
 /* misc */
 
 /* Each of these sets or clears the flag according to whether the
diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
index 3e1c792..313ec86 100644
--- a/tools/libxl/libxl_psr.c
+++ b/tools/libxl/libxl_psr.c
@@ -19,14 +19,37 @@
 
 #define IA32_QM_CTR_ERROR_MASK         (0x3ul << 62)
 
-static void libxl__psr_cmt_log_err_msg(libxl__gc *gc, int err)
+static void libxl__psr_log_err_msg(libxl__gc *gc, int err)
 {
     char *msg;
 
     switch (err) {
     case ENOSYS:
+    case EOPNOTSUPP:
         msg = "unsupported operation";
         break;
+    case ESRCH:
+        msg = "invalid domain ID";
+        break;
+    case EBADSLT:
+        msg = "socket is not supported";
+        break;
+    case EFAULT:
+        msg = "failed to exchange data with Xen";
+        break;
+    default:
+        msg = "unknown error";
+        break;
+    }
+
+    LOGE(ERROR, "%s", msg);
+}
+
+static void libxl__psr_cmt_log_err_msg(libxl__gc *gc, int err)
+{
+    char *msg;
+
+    switch (err) {
     case ENODEV:
         msg = "CMT is not supported in this system";
         break;
@@ -39,15 +62,35 @@ static void libxl__psr_cmt_log_err_msg(libxl__gc *gc, int err)
     case EUSERS:
         msg = "no free RMID available";
         break;
-    case ESRCH:
-        msg = "invalid domain ID";
+    default:
+        libxl__psr_log_err_msg(gc, err);
+        return;
+    }
+
+    LOGE(ERROR, "%s", msg);
+}
+
+static void libxl__psr_cat_log_err_msg(libxl__gc *gc, int err)
+{
+    char *msg;
+
+    switch (err) {
+    case ENODEV:
+        msg = "CAT is not supported in this system";
         break;
-    case EFAULT:
-        msg = "failed to exchange data with Xen";
+    case ENOENT:
+        msg = "CAT is not enabled on the socket";
         break;
-    default:
-        msg = "unknown error";
+    case EUSERS:
+        msg = "no free COS available";
         break;
+    case EEXIST:
+        msg = "the same CBM is already set for this domain";
+        break;
+
+    default:
+        libxl__psr_log_err_msg(gc, err);
+        return;
     }
 
     LOGE(ERROR, "%s", msg);
@@ -247,6 +290,92 @@ out:
     return rc;
 }
 
+int libxl_psr_cat_set_cbm(libxl_ctx *ctx, uint32_t domid,
+                          libxl_psr_cbm_type type, libxl_bitmap *target_map,
+                          uint64_t cbm)
+{
+    GC_INIT(ctx);
+    int rc;
+    int socket, nr_sockets;
+
+    rc = libxl__count_physical_sockets(gc, &nr_sockets);
+    if (rc) {
+        LOGE(ERROR, "failed to get system socket count");
+        goto out;
+    }
+
+    libxl_for_each_set_bit(socket, *target_map) {
+        if (socket >= nr_sockets)
+            break;
+        if (xc_psr_cat_set_domain_data(ctx->xch, domid, type, socket, cbm)) {
+            libxl__psr_cat_log_err_msg(gc, errno);
+            rc = ERROR_FAIL;
+        }
+    }
+
+out:
+    GC_FREE;
+    return rc;
+}
+
+int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t domid,
+                          libxl_psr_cbm_type type, uint32_t target,
+                          uint64_t *cbm_r)
+{
+    GC_INIT(ctx);
+    int rc = 0;
+
+    if (xc_psr_cat_get_domain_data(ctx->xch, domid, type, target, cbm_r)) {
+        libxl__psr_cat_log_err_msg(gc, errno);
+        rc = ERROR_FAIL;
+    }
+
+    GC_FREE;
+    return rc;
+}
+
+int libxl_psr_cat_get_l3_info(libxl_ctx *ctx, libxl_psr_cat_info **info,
+                              int *nr)
+{
+    GC_INIT(ctx);
+    int rc;
+    int i, nr_sockets;
+    libxl_psr_cat_info *ptr;
+
+    rc = libxl__count_physical_sockets(gc, &nr_sockets);
+    if (rc) {
+        LOGE(ERROR, "failed to get system socket count");
+        goto out;
+    }
+
+    ptr = libxl__malloc(NOGC, nr_sockets * sizeof(libxl_psr_cat_info));
+
+    for (i = 0; i < nr_sockets; i++) {
+        if (xc_psr_cat_get_l3_info(ctx->xch, i, &ptr[i].cos_max,
+                                                &ptr[i].cbm_len)) {
+            libxl__psr_cat_log_err_msg(gc, errno);
+            rc = ERROR_FAIL;
+            free(ptr);
+            goto out;
+        }
+    }
+
+    *info = ptr;
+    *nr = nr_sockets;
+out:
+    GC_FREE;
+    return rc;
+}
+
+void libxl_psr_cat_info_list_free(libxl_psr_cat_info *list, int nr)
+{
+    int i;
+
+    for (i = 0; i < nr; i++)
+        libxl_psr_cat_info_dispose(&list[i]);
+    free(list);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 65d479f..9565ba4 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -718,3 +718,13 @@ libxl_psr_cmt_type = Enumeration("psr_cmt_type", [
     (2, "TOTAL_MEM_COUNT"),
     (3, "LOCAL_MEM_COUNT"),
     ])
+
+libxl_psr_cbm_type = Enumeration("psr_cbm_type", [
+    (0, "UNKNOWN"),
+    (1, "L3_CBM"),
+    ])
+
+libxl_psr_cat_info = Struct("psr_cat_info", [
+    ("cos_max", uint32),
+    ("cbm_len", uint32),
+    ])
diff --git a/tools/libxl/xl.h b/tools/libxl/xl.h
index 7b56449..bef32d7 100644
--- a/tools/libxl/xl.h
+++ b/tools/libxl/xl.h
@@ -118,6 +118,10 @@ int main_psr_cmt_attach(int argc, char **argv);
 int main_psr_cmt_detach(int argc, char **argv);
 int main_psr_cmt_show(int argc, char **argv);
 #endif
+#ifdef LIBXL_HAVE_PSR_CAT
+int main_psr_cat_cbm_set(int argc, char **argv);
+int main_psr_cat_show(int argc, char **argv);
+#endif
 
 void help(const char *command);
 
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index b3c4ec0..a035be0 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -8251,17 +8251,6 @@ static int psr_cmt_show(libxl_psr_cmt_type type, uint32_t domid)
     return 0;
 }
 
-int main_psr_hwinfo(int argc, char **argv)
-{
-    int opt;
-
-    SWITCH_FOREACH_OPT(opt, "", NULL, "psr-hwinfo", 0) {
-        /* No options */
-    }
-
-    return psr_cmt_hwinfo();
-}
-
 int main_psr_cmt_attach(int argc, char **argv)
 {
     uint32_t domid;
@@ -8328,6 +8317,232 @@ int main_psr_cmt_show(int argc, char **argv)
 }
 #endif
 
+#ifdef LIBXL_HAVE_PSR_CAT
+static int psr_cat_hwinfo(void)
+{
+    int rc;
+    int socket, nr_sockets;
+    uint32_t l3_cache_size;
+    libxl_psr_cat_info *info;
+
+    printf("Cache Allocation Technology (CAT):\n");
+
+    rc = libxl_psr_cat_get_l3_info(ctx, &info, &nr_sockets);
+    if (rc) {
+        fprintf(stderr, "Failed to get cat info\n");
+        return rc;
+    }
+
+    for (socket = 0; socket < nr_sockets; socket++) {
+        rc = libxl_psr_cmt_get_l3_cache_size(ctx, socket, &l3_cache_size);
+        if (rc) {
+            fprintf(stderr, "Failed to get l3 cache size for socket:%d\n",
+                    socket);
+            goto out;
+        }
+        printf("%-16s: %u\n", "Socket ID", socket);
+        printf("%-16s: %uKB\n", "L3 Cache", l3_cache_size);
+        printf("%-16s: %u\n", "Maximum COS", info[socket].cos_max);
+        printf("%-16s: %u\n", "CBM length", info[socket].cbm_len);
+        printf("%-16s: %#"PRIx64"\n", "Default CBM",
+               (1ul << info[socket].cbm_len) - 1);
+    }
+
+out:
+    libxl_psr_cat_info_list_free(info, nr_sockets);
+    return rc;
+}
+
+static void psr_cat_print_one_domain_cbm(uint32_t domid, uint32_t socket)
+{
+    char *domain_name;
+    uint64_t cbm;
+
+    domain_name = libxl_domid_to_name(ctx, domid);
+    printf("%5d%25s", domid, domain_name);
+    free(domain_name);
+
+    if (!libxl_psr_cat_get_cbm(ctx, domid, LIBXL_PSR_CBM_TYPE_L3_CBM,
+                               socket, &cbm))
+         printf("%#16"PRIx64, cbm);
+
+    printf("\n");
+}
+
+static int psr_cat_print_domain_cbm(uint32_t domid, uint32_t socket)
+{
+    int i, nr_domains;
+    libxl_dominfo *list;
+
+    if (domid != INVALID_DOMID) {
+        psr_cat_print_one_domain_cbm(domid, socket);
+        return 0;
+    }
+
+    if (!(list = libxl_list_domain(ctx, &nr_domains))) {
+        fprintf(stderr, "Failed to get domain list for cbm display\n");
+        return -1;
+    }
+
+    for (i = 0; i < nr_domains; i++)
+        psr_cat_print_one_domain_cbm(list[i].domid, socket);
+    libxl_dominfo_list_free(list, nr_domains);
+
+    return 0;
+}
+
+static int psr_cat_print_socket(uint32_t domid, uint32_t socket,
+                                libxl_psr_cat_info *info)
+{
+    int rc;
+    uint32_t l3_cache_size;
+
+    rc = libxl_psr_cmt_get_l3_cache_size(ctx, socket, &l3_cache_size);
+    if (rc) {
+        fprintf(stderr, "Failed to get l3 cache size for socket:%d\n", socket);
+        return -1;
+    }
+
+    printf("%-16s: %u\n", "Socket ID", socket);
+    printf("%-16s: %uKB\n", "L3 Cache", l3_cache_size);
+    printf("%-16s: %#"PRIx64"\n", "Default CBM", (1ul << info->cbm_len) - 1);
+    printf("%5s%25s%16s\n", "ID", "NAME", "CBM");
+
+    return psr_cat_print_domain_cbm(domid, socket);
+}
+
+static int psr_cat_show(uint32_t domid)
+{
+    int socket, nr_sockets;
+    int rc;
+    libxl_psr_cat_info *info;
+
+    rc = libxl_psr_cat_get_l3_info(ctx, &info, &nr_sockets);
+    if (rc) {
+        fprintf(stderr, "Failed to get cat info\n");
+        return rc;
+    }
+
+    for (socket = 0; socket < nr_sockets; socket++) {
+        rc = psr_cat_print_socket(domid, socket, info + socket);
+        if (rc)
+            goto out;
+    }
+
+out:
+    libxl_psr_cat_info_list_free(info, nr_sockets);
+    return rc;
+}
+
+int main_psr_cat_cbm_set(int argc, char **argv)
+{
+    uint32_t domid;
+    libxl_psr_cbm_type type = LIBXL_PSR_CBM_TYPE_L3_CBM;
+    uint64_t cbm;
+    int ret, opt = 0;
+    libxl_bitmap target_map;
+    char *value;
+    libxl_string_list socket_list;
+    unsigned long start, end;
+    int i, j, len;
+
+    static struct option opts[] = {
+        {"socket", 1, 0, 's'},
+        COMMON_LONG_OPTS,
+        {0, 0, 0, 0}
+    };
+
+    libxl_socket_bitmap_alloc(ctx, &target_map, 0);
+    libxl_bitmap_set_none(&target_map);
+
+    SWITCH_FOREACH_OPT(opt, "s:", opts, "psr-cat-cbm-set", 2) {
+    case 's':
+        trim(isspace, optarg, &value);
+        split_string_into_string_list(value, ",", &socket_list);
+        len = libxl_string_list_length(&socket_list);
+        for (i = 0; i < len; i++) {
+            parse_range(socket_list[i], &start, &end);
+            for (j = start; j <= end; j++)
+                libxl_bitmap_set(&target_map, j);
+        }
+
+        libxl_string_list_dispose(&socket_list);
+        free(value);
+        break;
+    }
+
+    if (libxl_bitmap_is_empty(&target_map))
+        libxl_bitmap_set_any(&target_map);
+
+    domid = find_domain(argv[optind]);
+    cbm = strtoll(argv[optind + 1], NULL , 0);
+
+    ret = libxl_psr_cat_set_cbm(ctx, domid, type, &target_map, cbm);
+
+    libxl_bitmap_dispose(&target_map);
+    return ret;
+}
+
+int main_psr_cat_show(int argc, char **argv)
+{
+    int opt;
+    uint32_t domid;
+
+    SWITCH_FOREACH_OPT(opt, "", NULL, "psr-cat-show", 0) {
+        /* No options */
+    }
+
+    if (optind >= argc)
+        domid = INVALID_DOMID;
+    else if (optind == argc - 1)
+        domid = find_domain(argv[optind]);
+    else {
+        help("psr-cat-show");
+        return 2;
+    }
+
+    return psr_cat_show(domid);
+}
+
+int main_psr_hwinfo(int argc, char **argv)
+{
+    int opt, ret = 0;
+    int cmt = 0, cat = 0;
+    static struct option opts[] = {
+        {"cmt", 0, 0, 'm'},
+        {"cat", 0, 0, 'a'},
+        COMMON_LONG_OPTS,
+        {0, 0, 0, 0}
+    };
+
+    SWITCH_FOREACH_OPT(opt, "ma", opts, "psr-hwinfo", 0) {
+    case 'm':
+        cmt = 1;
+        break;
+    case 'a':
+        cat = 1;
+        break;
+    }
+
+    if (!(cmt | cat)) {
+        cmt = 1;
+        cat = 1;
+    }
+
+    if (cmt)
+        ret = psr_cmt_hwinfo();
+
+    if (ret)
+        return ret;
+
+    if (cat)
+        ret = psr_cat_hwinfo();
+
+    return ret;
+}
+
+#endif
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c
index 77a37c5..51a56fb 100644
--- a/tools/libxl/xl_cmdtable.c
+++ b/tools/libxl/xl_cmdtable.c
@@ -528,7 +528,9 @@ struct cmd_spec cmd_table[] = {
     { "psr-hwinfo",
       &main_psr_hwinfo, 0, 1,
       "Show hardware information for Platform Shared Resource",
-      "",
+      "[options]",
+      "-m, --cmt       Show Cache Monitoring Technology (CMT) hardware info\n"
+      "-a, --cat       Show Cache Allocation Technology (CAT) hardware info\n"
     },
     { "psr-cmt-attach",
       &main_psr_cmt_attach, 0, 1,
@@ -550,6 +552,20 @@ struct cmd_spec cmd_table[] = {
       "\"local-mem-bandwidth\":     Show local memory bandwidth(KB/s)\n",
     },
 #endif
+#ifdef LIBXL_HAVE_PSR_CAT
+    { "psr-cat-cbm-set",
+      &main_psr_cat_cbm_set, 0, 1,
+      "Set cache capacity bitmasks(CBM) for a domain",
+      "[options] <Domain> <CBM>",
+      "-s <socket>       Specify the socket to process, otherwise all sockets are processed\n"
+    },
+    { "psr-cat-show",
+      &main_psr_cat_show, 0, 1,
+      "Show Cache Allocation Technology information",
+      "[Domain]",
+    },
+
+#endif
 };
 
 int cmdtable_len = sizeof(cmd_table)/sizeof(struct cmd_spec);
-- 
1.9.1
^ permalink raw reply related	[flat|nested] 38+ messages in thread
* [PATCH v8 13/13] docs: add xl-psr.markdown
  2015-05-21  8:41 [PATCH v8 00/13] enable Cache Allocation Technology (CAT) for VMs Chao Peng
                   ` (11 preceding siblings ...)
  2015-05-21  8:41 ` [PATCH v8 12/13] tools: add tools support for Intel CAT Chao Peng
@ 2015-05-21  8:41 ` Chao Peng
  12 siblings, 0 replies; 38+ messages in thread
From: Chao Peng @ 2015-05-21  8:41 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, will.auld, JBeulich, wei.liu2,
	dgdegra
Add a document introducing the basic concepts and terms of the PSR family
of technologies and the corresponding xl interfaces.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v7:
* Correct 'xl psr-hwinfo'.
Changes in v6:
* Address comments from Ian.
Changes in v5:
* Address comments from Andrew/Ian.
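
Note for reviewers (illustration only, not part of the patch): the CBM
examples in xl-psr.markdown reduce to simple bit arithmetic. The sketch
below computes the default (full) mask for a given CBM length, counts the
cache portions a mask grants, and tests two masks for overlap. The function
names are hypothetical and exist only for this note; none of them is a
libxc or libxl interface.

```c
#include <stdint.h>

/* Full mask for a CBM of cbm_len bits (e.g. 12 -> 0xfff). */
static uint64_t cbm_default(unsigned int cbm_len)
{
    /* Shifting a 64-bit value by 64 is undefined, so handle it separately. */
    return cbm_len >= 64 ? UINT64_MAX : (UINT64_C(1) << cbm_len) - 1;
}

/* Number of cache portions a CBM grants (count of set bits). */
static unsigned int cbm_portions(uint64_t cbm)
{
    unsigned int n = 0;

    for (; cbm; cbm &= cbm - 1)   /* clear the lowest set bit each round */
        n++;
    return n;
}

/* Non-zero if two CBMs share at least one cache portion. */
static int cbm_overlap(uint64_t a, uint64_t b)
{
    return (a & b) != 0;
}
```

On the 8-portion system used in the document, 0x0f and 0xf0 do not overlap
and each grants 4 of the 8 portions, while 0x30 and 0xc0 grant 2 portions
(one quarter) each. Two domains that both use 0xf0 overlap completely and
therefore share that half of the cache.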
---
 docs/man/xl.pod.1         |   7 ++-
 docs/misc/xl-psr.markdown | 133 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 139 insertions(+), 1 deletion(-)
 create mode 100644 docs/misc/xl-psr.markdown
diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index 4dd5388..1fe3ac4 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -1490,6 +1490,9 @@ Intel Haswell and later server platforms offer shared resource monitoring
 and control technologies. The availability of these technologies and the
 hardware capabilities can be shown with B<psr-hwinfo>.
 
+See L<http://xenbits.xen.org/docs/unstable/misc/xl-psr.html> for more
+information.
+
 =over 4
 
 =item B<psr-hwinfo> [I<OPTIONS>]
@@ -1561,7 +1564,8 @@ on VM basis. To enforce cache on a specific domain, just set capacity bitmasks
 
 =item B<psr-cat-cbm-set> [I<OPTIONS>] I<domain-id> I<cbm>
 
-Set cache capacity bitmasks(CBM) for a domain.
+Set cache capacity bitmasks(CBM) for a domain. For how to specify I<cbm>
+please refer to L<http://xenbits.xen.org/docs/unstable/misc/xl-psr.html>.
 
 B<OPTIONS>
 
@@ -1602,6 +1606,7 @@ And the following documents on the xen.org website:
 L<http://xenbits.xen.org/docs/unstable/misc/xl-network-configuration.html>
 L<http://xenbits.xen.org/docs/unstable/misc/xl-disk-configuration.txt>
 L<http://xenbits.xen.org/docs/unstable/misc/xsm-flask.txt>
+L<http://xenbits.xen.org/docs/unstable/misc/xl-psr.html>
 
 For systems that don't automatically bring CPU online:
 
diff --git a/docs/misc/xl-psr.markdown b/docs/misc/xl-psr.markdown
new file mode 100644
index 0000000..3545912
--- /dev/null
+++ b/docs/misc/xl-psr.markdown
@@ -0,0 +1,133 @@
+# Intel Platform Shared Resource Monitoring/Control in xl
+
+This document introduces Intel Platform Shared Resource Monitoring/Control
+technologies, their basic concepts and the xl interfaces.
+
+## Cache Monitoring Technology (CMT)
+
+Cache Monitoring Technology (CMT) is a new feature available on Intel Haswell
+and later server platforms that allows an OS or Hypervisor/VMM to determine
+the usage of cache (currently only L3 cache supported) by applications running
+on the platform. A Resource Monitoring ID (RMID) is the abstraction of the
+application(s) that will be monitored for its cache usage. The CMT hardware
+tracks cache utilization of memory accesses according to the RMID and reports
+monitored data via a counter register.
+
+For more detailed information please refer to Intel SDM chapter
+"17.14 - Platform Shared Resource Monitoring: Cache Monitoring Technology".
+
+In Xen's implementation, each domain in the system can be assigned an RMID
+independently, while RMID=0 is reserved for monitoring domains that don't
+have CMT service attached. RMID is opaque to xl/libxl and is only used in
+the hypervisor.
+
+### xl interfaces
+
+A domain is assigned an RMID implicitly by attaching it to the CMT service:
+
+`xl psr-cmt-attach <domid>`
+
+After that, cache usage for the domain can be shown by:
+
+`xl psr-cmt-show cache-occupancy <domid>`
+
+Once monitoring is not needed any more, the domain can be detached from the
+CMT service by:
+
+`xl psr-cmt-detach <domid>`
+
+An attach may fail because no free RMID is available. In that case, unused
+RMID(s) can be freed by detaching the corresponding domains from the CMT service.
+
+Maximum RMID and supported monitor types in the system can be obtained by:
+
+`xl psr-hwinfo --cmt`
+
+## Memory Bandwidth Monitoring (MBM)
+
+Memory Bandwidth Monitoring (MBM) is a new hardware feature available on Intel
+Broadwell and later server platforms which builds on the CMT infrastructure to
+allow monitoring of system memory bandwidth. It introduces two new monitoring
+event types to monitor system total/local memory bandwidth. The same RMID can
+be used to monitor both cache usage and memory bandwidth at the same time.
+
+For more detailed information please refer to Intel SDM chapter
+"17.14 - Platform Shared Resource Monitoring: Cache Monitoring Technology".
+
+In Xen's implementation, MBM shares the same underlying monitoring service
+with CMT and can be used to monitor memory bandwidth on a per-domain
+basis.
+
+The xl interfaces are the same as those of CMT. The difference is that the
+monitor type is the corresponding memory monitoring type (local-mem-bandwidth/
+total-mem-bandwidth instead of cache-occupancy). E.g. after an `xl psr-cmt-attach`:
+
+`xl psr-cmt-show local-mem-bandwidth <domid>`
+
+`xl psr-cmt-show total-mem-bandwidth <domid>`
+
+## Cache Allocation Technology (CAT)
+
+Cache Allocation Technology (CAT) is a new feature available on Intel
+Broadwell and later server platforms that allows an OS or Hypervisor/VMM to
+partition cache allocation (i.e. L3 cache) based on application priority or
+Class of Service (COS). Each COS is configured using capacity bitmasks (CBM)
+which represent cache capacity and indicate the degree of overlap and
+isolation between classes. The system cache resource is divided into a number
+of minimum portions, which are then combined into subsets for cache
+partitioning. Each portion corresponds to a bit in the CBM, and a set bit
+means the corresponding cache portion is available.
+
+For example, assuming a system with 8 portions and 3 domains:
+
+ * A CBM of 0xff for every domain means each domain can access the whole cache.
+   This is the default.
+
+ * Giving one domain a CBM of 0x0f and the other two domains 0xf0 means that
+   the first domain gets exclusive access to half of the cache (half of the
+   portions) and the other two will share the other half.
+
+ * Giving one domain a CBM of 0x0f, one 0x30 and the last 0xc0 would give the
+   first domain exclusive access to half the cache, and the other two exclusive
+   access to one quarter each.
+
+For more detailed information please refer to Intel SDM chapter
+"17.15 - Platform Shared Resource Control: Cache Allocation Technology".
+
+In Xen's implementation, CBM can be configured with libxl/xl interfaces but
+COS is maintained in the hypervisor only. The cache partition granularity is
+per domain. Each domain has COS=0 assigned by default, and the corresponding
+CBM is all-ones, which means all the cache resource can be used by default.
+
+### xl interfaces
+
+System CAT information such as maximum COS and CBM length can be obtained by:
+
+`xl psr-hwinfo --cat`
+
+The simplest way to change a domain's CBM from its default is running:
+
+`xl psr-cat-cbm-set  [OPTIONS] <domid> <cbm>`
+
+where cbm is a number representing the cache subset that can be used.
+A cbm is valid only when:
+
+ * Set bits only exist in the range of [0, cbm_len), where cbm_len can be
+   obtained with `xl psr-hwinfo --cat`.
+ * All the set bits are contiguous.
+
+In a multi-socket system, the same cbm will be set on each socket by default.
+A per-socket cbm can be specified with the `--socket SOCKET` option.
+
+Setting the CBM may fail if there are not enough free COS available. In that
+case, unused COS(es) may be freed by setting the CBM of all related domains
+back to its default value (all-ones).
+
+Per domain CBM settings can be shown by:
+
+`xl psr-cat-show`
+
+## Reference
+
+[1] Intel SDM
+(http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html).
-- 
1.9.1
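As an illustration of the two CBM validity rules stated in the xl-psr
document above (set bits confined to [0, cbm_len) and contiguous), here is a
minimal C sketch. The helper name cbm_is_valid is hypothetical and not part
of the patch series; rejecting an all-zero mask is also an assumption, since
the document does not say how an empty cbm is treated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper for illustration only, not taken from the series. */
static bool cbm_is_valid(uint64_t cbm, unsigned int cbm_len)
{
    /* Assumption: an empty mask and out-of-range lengths are rejected. */
    if ( cbm == 0 || cbm_len == 0 || cbm_len > 64 )
        return false;

    /* Rule 1: all set bits must lie in the range [0, cbm_len). */
    if ( cbm_len < 64 && (cbm >> cbm_len) != 0 )
        return false;

    /* Rule 2: drop trailing zero bits; a contiguous run is then 0..01..1. */
    while ( !(cbm & 1) )
        cbm >>= 1;

    /* A value of the form 2^k - 1 shares no bits with its successor. */
    return (cbm & (cbm + 1)) == 0;
}
```

With the 8-portion example from the document, 0x0f, 0x30 and 0xc0 pass,
whereas 0x05 (non-contiguous) and 0x1ff (bit 8 set with cbm_len 8) are
rejected.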
^ permalink raw reply related	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 01/13] x86: add socket_cpumask
  2015-05-21  8:41 ` [PATCH v8 01/13] x86: add socket_cpumask Chao Peng
@ 2015-05-28 12:38   ` Jan Beulich
  2015-05-29  2:35     ` Chao Peng
  0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2015-05-28 12:38 UTC (permalink / raw)
  To: Chao Peng
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
>>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
> --- a/xen/arch/x86/mpparse.c
> +++ b/xen/arch/x86/mpparse.c
> @@ -87,6 +87,18 @@ void __init set_nr_cpu_ids(unsigned int max_cpus)
>  #endif
>  }
>  
> +void __init set_nr_sockets(void)
> +{
> +    unsigned int cpus = bitmap_weight(phys_cpu_present_map.mask,
> +                                      boot_cpu_data.x86_max_cores *
> +                                      boot_cpu_data.x86_num_siblings);
How did you come to this expression for the bitmap size? I.e.
why not simply physids_weight(phys_cpu_present_map)?
> +
> +    if ( cpus == 0 )
> +        cpus = 1;
> +
> +    nr_sockets = DIV_ROUND_UP(num_processors + disabled_cpus, cpus);
> +}
Is there a reason why this can't just be added to the end of the
immediately preceding set_nr_cpu_ids()?
> @@ -638,6 +649,8 @@ static int cpu_smpboot_alloc(unsigned int cpu)
>      unsigned int order, memflags = 0;
>      nodeid_t node = cpu_to_node(cpu);
>      struct desc_struct *gdt;
> +    unsigned int socket = cpu_to_socket(cpu);
> +
>  
>      if ( node != NUMA_NO_NODE )
Stray blank line being added.
> @@ -717,6 +734,12 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
>  
>      stack_base[0] = stack_start;
>  
> +    socket_cpumask = xzalloc_array(cpumask_var_t, nr_sockets);
> +    if ( !socket_cpumask )
> +        panic("No memory for socket CPU siblings map");
> +    if ( !zalloc_cpumask_var(socket_cpumask) )
> +        panic("No memory for socket CPU siblings cpumask");
Please combine the two if()s to have just a single panic() invocation.
If either fails, it doesn't really matter which one it was.
> --- a/xen/include/asm-x86/smp.h
> +++ b/xen/include/asm-x86/smp.h
> @@ -58,6 +58,15 @@ int hard_smp_processor_id(void);
>  
>  void __stop_this_cpu(void);
>  
> +/*
> + * The value may be greater than the actual socket number in the system and
> + * is considered not to change from the initial startup.
> + */
> +extern unsigned int nr_sockets;
In the comment, instead of "considered" do you perhaps mean
"expected" or even "required"?
> +/* Representing HT and core siblings in each socket */
> +extern cpumask_var_t *socket_cpumask;
Comment style.
Jan
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 02/13] x86: detect and initialize Intel CAT feature
  2015-05-21  8:41 ` [PATCH v8 02/13] x86: detect and initialize Intel CAT feature Chao Peng
@ 2015-05-28 12:54   ` Jan Beulich
  2015-05-29  2:40     ` Chao Peng
  0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2015-05-28 12:54 UTC (permalink / raw)
  To: Chao Peng
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
>>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -19,14 +19,25 @@
>  #include <asm/psr.h>
>  
>  #define PSR_CMT        (1<<0)
> +#define PSR_CAT        (1<<1)
> +
> +struct psr_cat_socket_info {
> +    unsigned int cbm_len;
> +    unsigned int cos_max;
> +};
>  
>  struct psr_assoc {
>      uint64_t val;
>  };
>  
>  struct psr_cmt *__read_mostly psr_cmt;
> +
> +static unsigned long *__read_mostly cat_socket_enable;
> +static struct psr_cat_socket_info *__read_mostly cat_socket_info;
> +
>  static unsigned int __initdata opt_psr;
>  static unsigned int __initdata opt_rmid_max = 255;
> +static unsigned int opt_cos_max = 255;
opt_* variables should generally be either __initdata or
__read_mostly (the latter in this case).
> @@ -194,16 +209,86 @@ void psr_ctxt_switch_to(struct domain *d)
>      }
>  }
>  
> +static void cat_cpu_init(void)
> +{
> +    unsigned int eax, ebx, ecx, edx;
> +    struct psr_cat_socket_info *info;
> +    unsigned int socket;
> +    unsigned int cpu = smp_processor_id();
> +    const struct cpuinfo_x86 *c = cpu_data + cpu;
> +
> +    if ( !cpu_has(c, X86_FEATURE_CAT) )
> +        return;
> +
> +    socket = cpu_to_socket(cpu);
> +    if ( test_bit(socket, cat_socket_enable) )
> +        return;
> +
> +    cpuid_count(PSR_CPUID_LEVEL_CAT, 0, &eax, &ebx, &ecx, &edx);
While one would hope that X86_FEATURE_CAT implies the respective
CPUID leaf being available, I think explicitly checking this should still
be done just like is the case elsewhere.
> +    if ( ebx & PSR_RESOURCE_TYPE_L3 )
> +    {
> +        cpuid_count(PSR_CPUID_LEVEL_CAT, 1, &eax, &ebx, &ecx, &edx);
> +        info = cat_socket_info + socket;
> +        info->cbm_len = (eax & 0x1f) + 1;
> +        info->cos_max = min(opt_cos_max, edx & 0xffff);
> +
> +        set_bit(socket, cat_socket_enable);
> +        printk(XENLOG_INFO "CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
> +               socket, info->cos_max, info->cbm_len);
> +    }
> +}
> +
> +static void cat_cpu_fini(unsigned int cpu)
> +{
> +    unsigned int socket = cpu_to_socket(cpu);
> +
> +    if ( !socket_cpumask[socket] || cpumask_empty(socket_cpumask[socket]) )
> +        clear_bit(socket, cat_socket_enable);
> +}
This being called from the CPU_DEAD notification, you now depend
on cpu_smpboot_free() to run ahead of you. Which isn't the case
afaict, and even if it happened to be that way you shouldn't rely
on it without explicitly enforcing ordering between the two by
setting the priority of one of them to a non-default value.
> +static void __init init_psr_cat(void)
> +{
> +    if ( opt_cos_max < 1 )
> +    {
> +        printk(XENLOG_INFO "CAT: disabled, cos_max is too small\n");
> +        return;
> +    }
Is opt_cos_max == 1 really useful for anything?
>  static int cpu_callback(
>      struct notifier_block *nfb, unsigned long action, void *hcpu)
>  {
> +    unsigned int cpu = (unsigned long)hcpu;
> +
>      if ( action == CPU_STARTING )
>          psr_cpu_init();
> +    else if ( action == CPU_DEAD )
> +        psr_cpu_fini(cpu);
switch()
Jan
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 03/13] x86: maintain COS to CBM mapping for each socket
  2015-05-21  8:41 ` [PATCH v8 03/13] x86: maintain COS to CBM mapping for each socket Chao Peng
@ 2015-05-28 13:17   ` Jan Beulich
  2015-05-29  2:43     ` Chao Peng
  0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2015-05-28 13:17 UTC (permalink / raw)
  To: Chao Peng
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
>>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
> For each socket, a COS to CBM mapping structure is maintained for each
> COS. The mapping is indexed by COS and the value is the corresponding
> CBM. Different VMs may use the same CBM, a reference count is used to
> indicate if the CBM is available.
> 
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> Changes in v8:
> * Move the memory allocation and CAT initialization code to CPU_UP_PREPARE.
> * Add memory freeing code in CPU_DEAD path.
Changes like this imo invalidate any tags given for earlier versions.
> +
> +    *rc = 0;
> +
> +    return;
> +
> +}
> +
> +
Stray return and blank lines.
> +static int cat_cpu_init(unsigned int cpu)
> +{
> +    int rc;
> +    const struct cpuinfo_x86 *c = cpu_data + cpu;
> +
> +    if ( !cpu_has(c, X86_FEATURE_CAT) )
> +        return 0;
> +
> +    if ( test_bit(cpu_to_socket(cpu), cat_socket_enable) )
> +        return 0;
> +
> +    if ( cpu == smp_processor_id() )
> +        do_cat_cpu_init(&rc);
> +    else
> +        on_selected_cpus(cpumask_of(cpu), do_cat_cpu_init, &rc, 1);
This now being called in the context of CPU_UP_PREPARE, I can't see
how this works at all: Neither would the CPU's cpu_data[] instance be
initialized by that time, nor would you be able to IPI that CPU, nor can I
see how the if() branch could ever get entered. Was this tested at all?
> @@ -283,14 +331,24 @@ static void psr_cpu_fini(unsigned int cpu)
>  static int cpu_callback(
>      struct notifier_block *nfb, unsigned long action, void *hcpu)
>  {
> +    int rc = 0;
>      unsigned int cpu = (unsigned long)hcpu;
>  
> -    if ( action == CPU_STARTING )
> -        psr_cpu_init();
> -    else if ( action == CPU_DEAD )
> +    switch ( action )
> +    {
> +    case CPU_UP_PREPARE:
> +        rc = psr_cpu_prepare(cpu);
> +        break;
> +    case CPU_STARTING:
> +        psr_cpu_starting();
This not being run for the boot CPU, ...
> @@ -305,7 +363,7 @@ static int __init psr_presmp_init(void)
>      if ( opt_psr & PSR_CAT )
>          init_psr_cat();
>  
> -    psr_cpu_init();
> +    psr_cpu_prepare(0);
>      if ( psr_cmt_enabled() || cat_socket_info )
... don't you need to call it here too?
Jan
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 05/13] x86: expose CBM length and COS number information
  2015-05-21  8:41 ` [PATCH v8 05/13] x86: expose CBM length and COS number information Chao Peng
@ 2015-05-28 13:26   ` Jan Beulich
  2015-05-28 15:46     ` Dario Faggioli
  2015-05-29  2:47     ` Chao Peng
  0 siblings, 2 replies; 38+ messages in thread
From: Jan Beulich @ 2015-05-28 13:26 UTC (permalink / raw)
  To: Chao Peng
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
>>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -216,6 +216,38 @@ void psr_ctxt_switch_to(struct domain *d)
>      }
>  }
>  
> +static int get_cat_socket_info(unsigned int socket,
> +                               struct psr_cat_socket_info **info)
> +{
> +    if ( !cat_socket_info )
> +        return -ENODEV;
> +
> +    if ( socket >= nr_sockets )
> +        return -EBADSLT;
> +
> +    if ( !test_bit(socket, cat_socket_enable) )
> +        return -ENOENT;
> +
> +    *info = cat_socket_info + socket;
> +
> +    return 0;
> +}
> +
> +int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
> +                        uint32_t *cos_max)
> +{
> +    struct psr_cat_socket_info *info;
> +    int ret = get_cat_socket_info(socket, &info);
> +
> +    if ( ret )
> +        return ret;
> +
> +    *cbm_len = info->cbm_len;
> +    *cos_max = info->cos_max;
> +
> +    return 0;
> +}
I doubt all supported compiler versions will be able to see that "info"
can't be used uninitialized here. Please make this explicit.
I also think this small a function shouldn't have two return points.
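One possible shape for the two requests above (explicit initialization of
"info", single return point) might look like the following sketch. The global
state is replaced by simple stand-ins so that it compiles on its own:
cat_socket_enable is a single unsigned long here rather than the bitmap
pointer used in the patch, and the EBADSLT fallback value is only for
platforms whose errno.h lacks it.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#ifndef EBADSLT
#define EBADSLT 57 /* fallback for illustration; Linux defines it in errno.h */
#endif

struct psr_cat_socket_info {
    unsigned int cbm_len;
    unsigned int cos_max;
};

/* Stand-ins for hypervisor state (assumptions made for this sketch). */
static struct psr_cat_socket_info *cat_socket_info;
static unsigned long cat_socket_enable; /* one bit per socket */
static unsigned int nr_sockets;         /* assumed <= bits in unsigned long */

static int get_cat_socket_info(unsigned int socket,
                               struct psr_cat_socket_info **info)
{
    if ( !cat_socket_info )
        return -ENODEV;

    if ( socket >= nr_sockets )
        return -EBADSLT;

    if ( !((cat_socket_enable >> socket) & 1UL) )
        return -ENOENT;

    *info = cat_socket_info + socket;

    return 0;
}

int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
                        uint32_t *cos_max)
{
    /* Explicit initialization, so no compiler can flag an uninitialized use. */
    struct psr_cat_socket_info *info = NULL;
    int ret = get_cat_socket_info(socket, &info);

    /* Outputs are only written on success; there is one return point. */
    if ( !ret )
    {
        *cbm_len = info->cbm_len;
        *cos_max = info->cos_max;
    }

    return ret;
}
```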
> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -694,6 +694,20 @@ struct xen_sysctl_pcitopoinfo {
>  typedef struct xen_sysctl_pcitopoinfo xen_sysctl_pcitopoinfo_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopoinfo_t);
>  
> +#define XEN_SYSCTL_PSR_CAT_get_l3_info               0
> +struct xen_sysctl_psr_cat_op {
> +    uint32_t cmd;       /* IN: XEN_SYSCTL_PSR_CAT_* */
> +    uint32_t target;    /* IN: socket to be operated on */
If this is always the socket number, why would the variable be
named anything other than "socket". If otoh subsequent patches
use it differently, I think the comment should be omitted now
rather than being dropped then (or it should be given its final
wording from the beginning).
Jan
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 05/13] x86: expose CBM length and COS number information
  2015-05-28 13:26   ` Jan Beulich
@ 2015-05-28 15:46     ` Dario Faggioli
  2015-05-29  2:47     ` Chao Peng
  1 sibling, 0 replies; 38+ messages in thread
From: Dario Faggioli @ 2015-05-28 15:46 UTC (permalink / raw)
  To: Jan Beulich
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	Ian.Jackson, xen-devel, will.auld, Chao Peng, dgdegra, keir
On Thu, 2015-05-28 at 14:26 +0100, Jan Beulich wrote:
> >>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
> > --- a/xen/include/public/sysctl.h
> > +++ b/xen/include/public/sysctl.h
> > @@ -694,6 +694,20 @@ struct xen_sysctl_pcitopoinfo {
> >  typedef struct xen_sysctl_pcitopoinfo xen_sysctl_pcitopoinfo_t;
> >  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopoinfo_t);
> >  
> > +#define XEN_SYSCTL_PSR_CAT_get_l3_info               0
> > +struct xen_sysctl_psr_cat_op {
> > +    uint32_t cmd;       /* IN: XEN_SYSCTL_PSR_CAT_* */
> > +    uint32_t target;    /* IN: socket to be operated on */
> 
> If this is always the socket number, why would the variable be
> named anything other than "socket". If otoh subsequent patches
> use it differently, I think the comment should be omitted now
> rather than being dropped then (or it should be given its final
> wording from the beginning).
> 
ISTR asking about this interface before (it might have been at the
toolstack level, though), and the answer was that it is not subsequent
patches _in_this_series_ (of course), but maybe future ones that will
modify the 'per-socket-ness' nature of this feature.
So, yes, maybe the comment should say something along those lines (and
be updated when things change).
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 01/13] x86: add socket_cpumask
  2015-05-28 12:38   ` Jan Beulich
@ 2015-05-29  2:35     ` Chao Peng
  2015-05-29  8:01       ` Jan Beulich
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Peng @ 2015-05-29  2:35 UTC (permalink / raw)
  To: Jan Beulich
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
On Thu, May 28, 2015 at 01:38:05PM +0100, Jan Beulich wrote:
> >>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
> > --- a/xen/arch/x86/mpparse.c
> > +++ b/xen/arch/x86/mpparse.c
> > @@ -87,6 +87,18 @@ void __init set_nr_cpu_ids(unsigned int max_cpus)
> >  #endif
> >  }
> >  
> > +void __init set_nr_sockets(void)
> > +{
> > +    unsigned int cpus = bitmap_weight(phys_cpu_present_map.mask,
> > +                                      boot_cpu_data.x86_max_cores *
> > +                                      boot_cpu_data.x86_num_siblings);
> 
> How did you come to this expression for the bitmap size? I.e.
> why not simply physids_weight(phys_cpu_present_map)?
physids_weight(phys_cpu_present_map) gives me the cpus for all sockets,
while 'cpus' here is actually _cpus_per_socket_. I used the max possible
cpus indicated in cpuid as the upper bound, so bitmap_weight() returns
the cpus actually available on socket 0.
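The arithmetic described above can be tried outside the hypervisor with a
small sketch. calc_nr_sockets is a hypothetical stand-in for set_nr_sockets()
that takes the three quantities as plain parameters instead of reading
phys_cpu_present_map, num_processors and disabled_cpus.

```c
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper mirroring the calculation in set_nr_sockets(). */
static unsigned int calc_nr_sockets(unsigned int num_processors,
                                    unsigned int disabled_cpus,
                                    unsigned int cpus_per_socket)
{
    /* Same guard as the patch: avoid dividing by zero. */
    if ( cpus_per_socket == 0 )
        cpus_per_socket = 1;

    /* Round up, so a partially populated socket still counts as one. */
    return DIV_ROUND_UP(num_processors + disabled_cpus, cpus_per_socket);
}
```

For instance, 12 enabled plus 4 disabled CPUs with 8 CPUs per socket gives 2
sockets, and 9 CPUs with 8 per socket also gives 2. Because of the rounding,
the result may exceed the real socket count, which matches the comment on
nr_sockets in the patch.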
> 
> > +
> > +    if ( cpus == 0 )
> > +        cpus = 1;
> > +
> > +    nr_sockets = DIV_ROUND_UP(num_processors + disabled_cpus, cpus);
> > +}
> 
> Is there a reason why this can't just be added to the end of the
> immediately preceding set_nr_cpu_ids()?
You mean the declaration or invocation? If the former I have no special
reason for it (e.g. I can change it).
> > +/* Representing HT and core siblings in each socket */
> > +extern cpumask_var_t *socket_cpumask;
> 
> Comment style.
Ah, the full stop is missing here.
Chao
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 02/13] x86: detect and initialize Intel CAT feature
  2015-05-28 12:54   ` Jan Beulich
@ 2015-05-29  2:40     ` Chao Peng
  2015-05-29  8:03       ` Jan Beulich
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Peng @ 2015-05-29  2:40 UTC (permalink / raw)
  To: Jan Beulich
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
On Thu, May 28, 2015 at 01:54:39PM +0100, Jan Beulich wrote:
> >>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
> > +
> > +    if ( !cpu_has(c, X86_FEATURE_CAT) )
> > +        return;
> > +
> > +    socket = cpu_to_socket(cpu);
> > +    if ( test_bit(socket, cat_socket_enable) )
> > +        return;
> > +
> > +    cpuid_count(PSR_CPUID_LEVEL_CAT, 0, &eax, &ebx, &ecx, &edx);
> 
> While one would hope that X86_FEATURE_CAT implies the respective
> CPUID leaf being available, I think explicitly checking this should still
> be done just like is the case elsewhere.
Against cpuid_level?
> 
> > +    if ( ebx & PSR_RESOURCE_TYPE_L3 )
> > +    {
> > +        cpuid_count(PSR_CPUID_LEVEL_CAT, 1, &eax, &ebx, &ecx, &edx);
> > +        info = cat_socket_info + socket;
> > +        info->cbm_len = (eax & 0x1f) + 1;
> > +        info->cos_max = min(opt_cos_max, edx & 0xffff);
> > +
> > +        set_bit(socket, cat_socket_enable);
> > +        printk(XENLOG_INFO "CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
> > +               socket, info->cos_max, info->cbm_len);
> > +    }
> > +}
> > +
> > +static void cat_cpu_fini(unsigned int cpu)
> > +{
> > +    unsigned int socket = cpu_to_socket(cpu);
> > +
> > +    if ( !socket_cpumask[socket] || cpumask_empty(socket_cpumask[socket]) )
> > +        clear_bit(socket, cat_socket_enable);
> > +}
> 
> This being called from the CPU_DEAD notification, you now depend
> on cpu_smpboot_free() to run ahead of you. Which isn't the case
> afaict, and even if it happened to be that way you shouldn't rely
> on it without explicitly enforcing ordering between the two by
> setting the priority of one of them to a non-default value.
Yes, seems changing the priority of psr_cpu_callback to 1 is enough.
> 
> > +static void __init init_psr_cat(void)
> > +{
> > +    if ( opt_cos_max < 1 )
> > +    {
> > +        printk(XENLOG_INFO "CAT: disabled, cos_max is too small\n");
> > +        return;
> > +    }
> 
> Is opt_cos_max == 1 really useful for anything?
That means two COSes are available. cos=0 is reserved and cos=1 can
still be used anyway.
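To make the max-versus-count point concrete, the field decoding from
cat_cpu_init() can be sketched in isolation. The function and struct names
are hypothetical; the masks and the min() against opt_cos_max are taken from
the quoted patch, and any register values passed in are made up for
illustration.

```c
#include <stdint.h>

/* Hypothetical container for the two decoded values. */
struct cat_info {
    unsigned int cbm_len;
    unsigned int cos_max;
};

/* Decode CPUID CAT subleaf 1 output the way the quoted patch does. */
static struct cat_info decode_cat_l3(uint32_t eax, uint32_t edx,
                                     unsigned int opt_cos_max)
{
    struct cat_info info;
    unsigned int hw_cos_max = edx & 0xffff;

    /* EAX[4:0] holds the CBM length minus one. */
    info.cbm_len = (eax & 0x1f) + 1;
    /* The command line option can only lower the hardware maximum. */
    info.cos_max = opt_cos_max < hw_cos_max ? opt_cos_max : hw_cos_max;

    return info;
}

/* cos_max is the highest usable index, so the count is one more. */
static unsigned int cos_count(const struct cat_info *info)
{
    return info->cos_max + 1;
}
```

With opt_cos_max == 1 the count is 2: COS 0 stays the default and COS 1
remains available for a non-default CBM.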
Chao
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 03/13] x86: maintain COS to CBM mapping for each socket
  2015-05-28 13:17   ` Jan Beulich
@ 2015-05-29  2:43     ` Chao Peng
  2015-05-29  8:06       ` Jan Beulich
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Peng @ 2015-05-29  2:43 UTC (permalink / raw)
  To: Jan Beulich
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
On Thu, May 28, 2015 at 02:17:54PM +0100, Jan Beulich wrote:
> >>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
> > For each socket, a COS to CBM mapping structure is maintained for each
> > COS. The mapping is indexed by COS and the value is the corresponding
> > CBM. Different VMs may use the same CBM, a reference count is used to
> > indicate if the CBM is available.
> > 
> > Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> > Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> > ---
> > Changes in v8:
> > * Move the memory allocation and CAT initialization code to CPU_UP_PREPARE.
> > * Add memory freeing code in CPU_DEAD path.
> 
> Changes like this imo invalidate any tags given for earlier versions.
Sure, I will remove it.
> > +static int cat_cpu_init(unsigned int cpu)
> > +{
> > +    int rc;
> > +    const struct cpuinfo_x86 *c = cpu_data + cpu;
> > +
> > +    if ( !cpu_has(c, X86_FEATURE_CAT) )
> > +        return 0;
> > +
> > +    if ( test_bit(cpu_to_socket(cpu), cat_socket_enable) )
> > +        return 0;
> > +
> > +    if ( cpu == smp_processor_id() )
> > +        do_cat_cpu_init(&rc);
> > +    else
> > +        on_selected_cpus(cpumask_of(cpu), do_cat_cpu_init, &rc, 1);
> 
> This now being called in the context of CPU_UP_PREPARE, I can't see
> how this works at all: Neither would the CPU's cpu_data[] instance be
> initialized by that time, nor would you be able to IPI that CPU, nor can I
> see how the if() branch could ever get entered. Was this tested at all?
Ah, yes! So it sounds really a little difficult to move the memory
allocation from CPU_STARTING to CPU_UP_PREPARE for this case.
Chao
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 05/13] x86: expose CBM length and COS number information
  2015-05-28 13:26   ` Jan Beulich
  2015-05-28 15:46     ` Dario Faggioli
@ 2015-05-29  2:47     ` Chao Peng
  2015-05-29  8:07       ` Jan Beulich
  1 sibling, 1 reply; 38+ messages in thread
From: Chao Peng @ 2015-05-29  2:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
On Thu, May 28, 2015 at 02:26:03PM +0100, Jan Beulich wrote:
> 
> > --- a/xen/include/public/sysctl.h
> > +++ b/xen/include/public/sysctl.h
> > @@ -694,6 +694,20 @@ struct xen_sysctl_pcitopoinfo {
> >  typedef struct xen_sysctl_pcitopoinfo xen_sysctl_pcitopoinfo_t;
> >  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopoinfo_t);
> >  
> > +#define XEN_SYSCTL_PSR_CAT_get_l3_info               0
> > +struct xen_sysctl_psr_cat_op {
> > +    uint32_t cmd;       /* IN: XEN_SYSCTL_PSR_CAT_* */
> > +    uint32_t target;    /* IN: socket to be operated on */
> 
> If this is always the socket number, why would the variable be
> named anything other than "socket". If otoh subsequent patches
> use it differently, I think the comment should be omitted now
> rather than being dropped then (or it should be given its final
> wording from the beginning).
Or 'target to be operated on'?
Chao
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 01/13] x86: add socket_cpumask
  2015-05-29  2:35     ` Chao Peng
@ 2015-05-29  8:01       ` Jan Beulich
  2015-05-29  8:28         ` Chao Peng
  0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2015-05-29  8:01 UTC (permalink / raw)
  To: Chao Peng
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
>>> On 29.05.15 at 04:35, <chao.p.peng@linux.intel.com> wrote:
> On Thu, May 28, 2015 at 01:38:05PM +0100, Jan Beulich wrote:
>> >>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
>> > --- a/xen/arch/x86/mpparse.c
>> > +++ b/xen/arch/x86/mpparse.c
>> > @@ -87,6 +87,18 @@ void __init set_nr_cpu_ids(unsigned int max_cpus)
>> >  #endif
>> >  }
>> >  
>> > +void __init set_nr_sockets(void)
>> > +{
>> > +    unsigned int cpus = bitmap_weight(phys_cpu_present_map.mask,
>> > +                                      boot_cpu_data.x86_max_cores *
>> > +                                      boot_cpu_data.x86_num_siblings);
>> 
>> How did you come to this expression for the bitmap size? I.e.
>> why not simply physids_weight(phys_cpu_present_map)?
> 
> physids_weight(phys_cpu_present_map) gives me the cpus for all sockets,
> while 'cpus' here is actually _cpus_per_socket_. I used the max possible
> cpus indicated in cpuid as the upper bound, so bitmap_weight() returns
> the cpus actually available on socket 0.
In which case the variable name is badly chosen, or a respective
comment is missing.
>> > +
>> > +    if ( cpus == 0 )
>> > +        cpus = 1;
>> > +
>> > +    nr_sockets = DIV_ROUND_UP(num_processors + disabled_cpus, cpus);
>> > +}
>> 
>> Is there a reason why this can't just be added to the end of the
>> immediately preceding set_nr_cpu_ids()?
> 
> You mean the declaration or invocation? If the former I have no special
> reason for it (e.g. I can change it).
Neither - I just don't see the need for a new function.
Jan
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 02/13] x86: detect and initialize Intel CAT feature
  2015-05-29  2:40     ` Chao Peng
@ 2015-05-29  8:03       ` Jan Beulich
  0 siblings, 0 replies; 38+ messages in thread
From: Jan Beulich @ 2015-05-29  8:03 UTC (permalink / raw)
  To: Chao Peng
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
>>> On 29.05.15 at 04:40, <chao.p.peng@linux.intel.com> wrote:
> On Thu, May 28, 2015 at 01:54:39PM +0100, Jan Beulich wrote:
>> >>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
>> > +
>> > +    if ( !cpu_has(c, X86_FEATURE_CAT) )
>> > +        return;
>> > +
>> > +    socket = cpu_to_socket(cpu);
>> > +    if ( test_bit(socket, cat_socket_enable) )
>> > +        return;
>> > +
>> > +    cpuid_count(PSR_CPUID_LEVEL_CAT, 0, &eax, &ebx, &ecx, &edx);
>> 
>> While one would hope that X86_FEATURE_CAT implies the respective
>> CPUID leaf being available, I think explicitly checking this should still
>> be done just like is the case elsewhere.
> 
> Against cpuid_level?
Of course.
>> > +static void __init init_psr_cat(void)
>> > +{
>> > +    if ( opt_cos_max < 1 )
>> > +    {
>> > +        printk(XENLOG_INFO "CAT: disabled, cos_max is too small\n");
>> > +        return;
>> > +    }
>> 
>> Is opt_cos_max == 1 really useful for anything?
> 
> That means two COSes are available. cos=0 is reserved and cos=1 can
> still be used anyway.
Ah, sorry, this is _max_, not _count_.
Jan
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 03/13] x86: maintain COS to CBM mapping for each socket
  2015-05-29  2:43     ` Chao Peng
@ 2015-05-29  8:06       ` Jan Beulich
  2015-05-29  8:38         ` Chao Peng
  0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2015-05-29  8:06 UTC (permalink / raw)
  To: Chao Peng
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
>>> On 29.05.15 at 04:43, <chao.p.peng@linux.intel.com> wrote:
> On Thu, May 28, 2015 at 02:17:54PM +0100, Jan Beulich wrote:
>> >>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
>> > +static int cat_cpu_init(unsigned int cpu)
>> > +{
>> > +    int rc;
>> > +    const struct cpuinfo_x86 *c = cpu_data + cpu;
>> > +
>> > +    if ( !cpu_has(c, X86_FEATURE_CAT) )
>> > +        return 0;
>> > +
>> > +    if ( test_bit(cpu_to_socket(cpu), cat_socket_enable) )
>> > +        return 0;
>> > +
>> > +    if ( cpu == smp_processor_id() )
>> > +        do_cat_cpu_init(&rc);
>> > +    else
>> > +        on_selected_cpus(cpumask_of(cpu), do_cat_cpu_init, &rc, 1);
>> 
>> This now being called in the context of CPU_UP_PREPARE, I can't see
>> how this works at all: Neither would the CPU's cpu_data[] instance be
>> initialized by that time, nor would you be able to IPI that CPU, nor can I
>> see how the if() branch could ever get entered. Was this tested at all?
> 
> Ah, yes! So it sounds really a little difficult to move the memory
> allocation from CPU_STARTING to CPU_UP_PREPARE for this case.
Not sure why you talk about memory allocation again. That should
be done in CPU_UP_PREPARE. But stuff that needs to happen on
the CPU should happen in CPU_STARTING. The memory allocation's
size depending on a CPU characteristic of course makes this a little
problematic, but (I think I said so before) since we're assuming
symmetry in many other places, I don't see anything wrong with
you assuming symmetry here too, and hence use e.g. the boot CPU's
value to determine the allocation size.
Jan
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 05/13] x86: expose CBM length and COS number information
  2015-05-29  2:47     ` Chao Peng
@ 2015-05-29  8:07       ` Jan Beulich
  2015-05-29  9:23         ` Dario Faggioli
  0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2015-05-29  8:07 UTC (permalink / raw)
  To: Chao Peng
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
>>> On 29.05.15 at 04:47, <chao.p.peng@linux.intel.com> wrote:
> On Thu, May 28, 2015 at 02:26:03PM +0100, Jan Beulich wrote:
>> 
>> > --- a/xen/include/public/sysctl.h
>> > +++ b/xen/include/public/sysctl.h
>> > @@ -694,6 +694,20 @@ struct xen_sysctl_pcitopoinfo {
>> >  typedef struct xen_sysctl_pcitopoinfo xen_sysctl_pcitopoinfo_t;
>> >  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopoinfo_t);
>> >  
>> > +#define XEN_SYSCTL_PSR_CAT_get_l3_info               0
>> > +struct xen_sysctl_psr_cat_op {
>> > +    uint32_t cmd;       /* IN: XEN_SYSCTL_PSR_CAT_* */
>> > +    uint32_t target;    /* IN: socket to be operated on */
>> 
>> If this is always the socket number, why would the variable be
>> named anything other than "socket". If otoh subsequent patches
>> use it differently, I think the comment should be omitted now
>> rather than being dropped then (or it should be given its final
>> wording from the beginning).
> 
> Or 'target to be operated on'?
Fine with me. Just not something that may end up being confusing.
Jan
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 01/13] x86: add socket_cpumask
  2015-05-29  8:01       ` Jan Beulich
@ 2015-05-29  8:28         ` Chao Peng
  2015-05-29  8:52           ` Jan Beulich
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Peng @ 2015-05-29  8:28 UTC (permalink / raw)
  To: Jan Beulich
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
On Fri, May 29, 2015 at 09:01:53AM +0100, Jan Beulich wrote:
> >>> On 29.05.15 at 04:35, <chao.p.peng@linux.intel.com> wrote:
> > On Thu, May 28, 2015 at 01:38:05PM +0100, Jan Beulich wrote:
> >> >>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
> >> > --- a/xen/arch/x86/mpparse.c
> >> > +++ b/xen/arch/x86/mpparse.c
> >> > @@ -87,6 +87,18 @@ void __init set_nr_cpu_ids(unsigned int max_cpus)
> >> >  #endif
> >> >  }
> >> >  
> >> > +void __init set_nr_sockets(void)
> >> > +{
> >> > +    unsigned int cpus = bitmap_weight(phys_cpu_present_map.mask,
> >> > +                                      boot_cpu_data.x86_max_cores *
> >> > +                                      boot_cpu_data.x86_num_siblings);
> >> 
> >> How did you come to this expression for the bitmap size? I.e.
> >> why not simply physids_weight(phys_cpu_present_map)?
> > 
> > physids_weight(phys_cpu_present_map) gives me cpus for all sockets.
> > While here the 'cpus' is actually _cpus_per_socket_. I used the max
> > possible cpus indicated in cpuid as the upper bound so bitmap_weight()
> > returns the actual available cpus on socket 0.
> 
> In which case the variable name is badly chosen, or a respective
> comment is missing.
> 
> >> > +
> >> > +    if ( cpus == 0 )
> >> > +        cpus = 1;
> >> > +
> >> > +    nr_sockets = DIV_ROUND_UP(num_processors + disabled_cpus, cpus);
> >> > +}
> >> 
> >> Is there a reason why this can't just be added to the end of the
> >> immediately preceding set_nr_cpu_ids()?
> > 
> > You mean the declaration or invocation? If the former I have no special
> > reason for it (e.g. I can change it).
> 
> Neither - I just don't see the need for a new function.
In which case the invocation of set_nr_cpu_ids() would have to move to the
place where set_nr_sockets() is now invoked, to make sure
boot_cpu_data.x86_max_cores/x86_num_siblings are available, which may not be
what you expect.
Chao
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 03/13] x86: maintain COS to CBM mapping for each socket
  2015-05-29  8:06       ` Jan Beulich
@ 2015-05-29  8:38         ` Chao Peng
  2015-06-01  8:05           ` Chao Peng
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Peng @ 2015-05-29  8:38 UTC (permalink / raw)
  To: Jan Beulich
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
On Fri, May 29, 2015 at 09:06:46AM +0100, Jan Beulich wrote:
> >>> On 29.05.15 at 04:43, <chao.p.peng@linux.intel.com> wrote:
> > On Thu, May 28, 2015 at 02:17:54PM +0100, Jan Beulich wrote:
> >> >>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
> >> > +static int cat_cpu_init(unsigned int cpu)
> >> > +{
> >> > +    int rc;
> >> > +    const struct cpuinfo_x86 *c = cpu_data + cpu;
> >> > +
> >> > +    if ( !cpu_has(c, X86_FEATURE_CAT) )
> >> > +        return 0;
> >> > +
> >> > +    if ( test_bit(cpu_to_socket(cpu), cat_socket_enable) )
> >> > +        return 0;
> >> > +
> >> > +    if ( cpu == smp_processor_id() )
> >> > +        do_cat_cpu_init(&rc);
> >> > +    else
> >> > +        on_selected_cpus(cpumask_of(cpu), do_cat_cpu_init, &rc, 1);
> >> 
> >> This now being called in the context of CPU_UP_PREPARE, I can't see
> >> how this works at all: Neither would the CPU's cpu_data[] instance be
> >> initialized by that time, nor would you be able to IPI that CPU, nor can I
> >> see how the if() branch could ever get entered. Was this tested at all?
> > 
> > Ah, yes! So it sounds really a little difficult to move the memory
> > allocation from CPU_STARTING to CPU_PREPARA for this case.
> 
> Not sure why you talk about memory allocation again. That should
> be done in CPU_UP_PREPARE. But stuff that needs to happen on
> the CPU should happen in CPU_STARTING. The memory allocation's
> size depending on a CPU characteristic of course makes this a little
> problematic, but (I think I said so before) since we're assuming
> symmetry in many other places, I don't see anything wrong with
> you assuming symmetry here too, and hence use e.g. the boot CPU's
> value to determine the allocation size.
No problem, then I can just drop the support for asymmetry in Xen.
Chao
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 01/13] x86: add socket_cpumask
  2015-05-29  8:28         ` Chao Peng
@ 2015-05-29  8:52           ` Jan Beulich
  2015-06-02  6:35             ` Chao Peng
  0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2015-05-29  8:52 UTC (permalink / raw)
  To: Chao Peng
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
>>> On 29.05.15 at 10:28, <chao.p.peng@linux.intel.com> wrote:
> On Fri, May 29, 2015 at 09:01:53AM +0100, Jan Beulich wrote:
>> >>> On 29.05.15 at 04:35, <chao.p.peng@linux.intel.com> wrote:
>> > On Thu, May 28, 2015 at 01:38:05PM +0100, Jan Beulich wrote:
>> >> >>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
>> >> > --- a/xen/arch/x86/mpparse.c
>> >> > +++ b/xen/arch/x86/mpparse.c
>> >> > @@ -87,6 +87,18 @@ void __init set_nr_cpu_ids(unsigned int max_cpus)
>> >> >  #endif
>> >> >  }
>> >> >  
>> >> > +void __init set_nr_sockets(void)
>> >> > +{
>> >> > +    unsigned int cpus = bitmap_weight(phys_cpu_present_map.mask,
>> >> > +                                      boot_cpu_data.x86_max_cores *
>> >> > +                                      boot_cpu_data.x86_num_siblings);
>> >> > +
>> >> > +    if ( cpus == 0 )
>> >> > +        cpus = 1;
>> >> > +
>> >> > +    nr_sockets = DIV_ROUND_UP(num_processors + disabled_cpus, cpus);
>> >> > +}
>> >> 
>> >> Is there a reason why this can't just be added to the end of the
>> >> immediately preceding set_nr_cpu_ids()?
>> > 
>> > You mean the declaration or invocation? If the former I have no special
>> > reason for it (e.g. I can change it).
>> 
>> Neither - I just don't see the need for a new function.
> 
> In which case the invocation of set_nr_cpu_ids() should move to the
> place where now set_nr_sockets() is invoked, to make sure
> boot_cpu_data.x86_max_cores/x86_num_siblings available, which may not be
> your expectation.
Ah, in which case this _is_ the explanation, albeit only provided the
use of the two boot_cpu_data fields has to remain (which I had put
under question). And if these have to remain, couldn't this be done
in a presmp initcall instead of an explicitly called function?
Jan
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 05/13] x86: expose CBM length and COS number information
  2015-05-29  8:07       ` Jan Beulich
@ 2015-05-29  9:23         ` Dario Faggioli
  2015-05-29  9:29           ` Jan Beulich
  0 siblings, 1 reply; 38+ messages in thread
From: Dario Faggioli @ 2015-05-29  9:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	Ian.Jackson, xen-devel, will.auld, Chao Peng, dgdegra, keir
On Fri, 2015-05-29 at 09:07 +0100, Jan Beulich wrote:
> >>> On 29.05.15 at 04:47, <chao.p.peng@linux.intel.com> wrote:
> > On Thu, May 28, 2015 at 02:26:03PM +0100, Jan Beulich wrote:
> >> 
> >> > --- a/xen/include/public/sysctl.h
> >> > +++ b/xen/include/public/sysctl.h
> >> > @@ -694,6 +694,20 @@ struct xen_sysctl_pcitopoinfo {
> >> >  typedef struct xen_sysctl_pcitopoinfo xen_sysctl_pcitopoinfo_t;
> >> >  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopoinfo_t);
> >> >  
> >> > +#define XEN_SYSCTL_PSR_CAT_get_l3_info               0
> >> > +struct xen_sysctl_psr_cat_op {
> >> > +    uint32_t cmd;       /* IN: XEN_SYSCTL_PSR_CAT_* */
> >> > +    uint32_t target;    /* IN: socket to be operated on */
> >> 
> >> If this is always the socket number, why would the variable be
> >> named anything other than "socket". If otoh subsequent patches
> >> use it differently, I think the comment should be omitted now
> >> rather than being dropped then (or it should be given its final
> >> wording from the beginning).
> > 
> > Or 'target to be operated on'?
> 
> Fine with me. Just not something that may end up being confusing.
> 
So, I really don't want to turn this into pure bikeshedding, but, for a
field called 'target', a comment saying 'target to be operated on' seems
rather pointless, and I'd go for omitting it (for now).
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 05/13] x86: expose CBM length and COS number information
  2015-05-29  9:23         ` Dario Faggioli
@ 2015-05-29  9:29           ` Jan Beulich
  0 siblings, 0 replies; 38+ messages in thread
From: Jan Beulich @ 2015-05-29  9:29 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	Ian.Jackson, xen-devel, will.auld, Chao Peng, dgdegra, keir
>>> On 29.05.15 at 11:23, <dario.faggioli@citrix.com> wrote:
> On Fri, 2015-05-29 at 09:07 +0100, Jan Beulich wrote:
>> >>> On 29.05.15 at 04:47, <chao.p.peng@linux.intel.com> wrote:
>> > On Thu, May 28, 2015 at 02:26:03PM +0100, Jan Beulich wrote:
>> >> 
>> >> > --- a/xen/include/public/sysctl.h
>> >> > +++ b/xen/include/public/sysctl.h
>> >> > @@ -694,6 +694,20 @@ struct xen_sysctl_pcitopoinfo {
>> >> >  typedef struct xen_sysctl_pcitopoinfo xen_sysctl_pcitopoinfo_t;
>> >> >  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopoinfo_t);
>> >> >  
>> >> > +#define XEN_SYSCTL_PSR_CAT_get_l3_info               0
>> >> > +struct xen_sysctl_psr_cat_op {
>> >> > +    uint32_t cmd;       /* IN: XEN_SYSCTL_PSR_CAT_* */
>> >> > +    uint32_t target;    /* IN: socket to be operated on */
>> >> 
>> >> If this is always the socket number, why would the variable be
>> >> named anything other than "socket". If otoh subsequent patches
>> >> use it differently, I think the comment should be omitted now
>> >> rather than being dropped then (or it should be given its final
>> >> wording from the beginning).
>> > 
>> > Or 'target to be operated on'?
>> 
>> Fine with me. Just not something that may end up being confusing.
>> 
> So, I really don't want to turn this into pure bikeshedding, but, for a
> field called 'target', a comment saying 'target to be operated on' seems
> rather pointless, and I'd go for omitting it (for now).
Right - my earlier response was merely meant to say I'm not
opposed to a non-confusing comment, not that I see a strict
need for a mostly redundant one here.
Jan
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 03/13] x86: maintain COS to CBM mapping for each socket
  2015-05-29  8:38         ` Chao Peng
@ 2015-06-01  8:05           ` Chao Peng
  2015-06-01  8:36             ` Jan Beulich
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Peng @ 2015-06-01  8:05 UTC (permalink / raw)
  To: Jan Beulich, andrew.cooper3, Ian.Campbell
  Cc: wei.liu2, stefano.stabellini, dario.faggioli, Ian.Jackson,
	xen-devel, will.auld, keir, dgdegra
On Fri, May 29, 2015 at 04:38:31PM +0800, Chao Peng wrote:
> On Fri, May 29, 2015 at 09:06:46AM +0100, Jan Beulich wrote:
> > >>> On 29.05.15 at 04:43, <chao.p.peng@linux.intel.com> wrote:
> > > On Thu, May 28, 2015 at 02:17:54PM +0100, Jan Beulich wrote:
> > >> >>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
> > >> > +static int cat_cpu_init(unsigned int cpu)
> > >> > +{
> > >> > +    int rc;
> > >> > +    const struct cpuinfo_x86 *c = cpu_data + cpu;
> > >> > +
> > >> > +    if ( !cpu_has(c, X86_FEATURE_CAT) )
> > >> > +        return 0;
> > >> > +
> > >> > +    if ( test_bit(cpu_to_socket(cpu), cat_socket_enable) )
> > >> > +        return 0;
> > >> > +
> > >> > +    if ( cpu == smp_processor_id() )
> > >> > +        do_cat_cpu_init(&rc);
> > >> > +    else
> > >> > +        on_selected_cpus(cpumask_of(cpu), do_cat_cpu_init, &rc, 1);
> > >> 
> > >> This now being called in the context of CPU_UP_PREPARE, I can't see
> > >> how this works at all: Neither would the CPU's cpu_data[] instance be
> > >> initialized by that time, nor would you be able to IPI that CPU, nor can I
> > >> see how the if() branch could ever get entered. Was this tested at all?
> > > 
> > > Ah, yes! So it sounds really a little difficult to move the memory
> > > allocation from CPU_STARTING to CPU_PREPARA for this case.
> > 
> > Not sure why you talk about memory allocation again. That should
> > be done in CPU_UP_PREPARE. But stuff that needs to happen on
> > the CPU should happen in CPU_STARTING. The memory allocation's
> > size depending on a CPU characteristic of course makes this a little
> > problematic, but (I think I said so before) since we're assuming
> > symmetry in many other places, I don't see anything wrong with
> > you assuming symmetry here too, and hence use e.g. the boot CPU's
> > value to determine the allocation size.
> 
> No problem, then I can just forget the support for asymmetry in XEN.
As this is quite different from our original assumption and from what I
described in the design doc, I'd like to give a clear summary here
before submitting the new version.
Basically, the initial design tried to support systems that have
different SKUs in each socket. One example would be plugging one
HSX (Haswell server) and one BDX (Broadwell server) processor into the
sockets of a Grantley platform.
However, supporting this is more difficult than supporting only
systems whose sockets all have the same SKU, AFAICS:
1)  nr_sockets cannot be detected correctly at boot time, especially
when taking CPU hotplug into account. This is also why I added a boot
option for this at the beginning of this patch series, although I agreed
it's really not a good interface for the user.
2)  It is infeasible to allocate memory first and do the initialization
later in the CPU hotplug notifications. My former approach performed both
the allocation and the initialization in CPU_STARTING, which is not a good
idea, as Jan pointed out.
For me, it would be better to have it supported. But if that's difficult
(as described above), I'm comfortable dropping it.
Let's see what changes once that support is dropped:
1)  The current per-socket psr_cat_socket_info will be dropped; instead
psr_cat will be introduced, which holds cos_max/cbm_len for all the
sockets and is initialized only once.
2)  cat_socket_enable will be dropped as well; instead we check
!!psr_cat, just as CMT does.
3)  No special hotplug consideration is needed, as the CAT hardware
information of the boot CPU applies to the others.
4)  We still support per-socket COS configuration, so the per-socket
'struct psr_cat_cbm *cos_to_cbm' becomes something like
'struct psr_cat_cbm *socket_cbms', which holds the cos_to_cbm for all
the sockets and is initialized at boot time.
5)  The 'target' in xen_sysctl_psr_cat_op will be removed, as all the
sockets now have the same hardware info. Because of this, I'd like to
retrofit xen_sysctl_psr_cmt_op rather than introduce a new sub-op for this.
XEN_DOMCTL_psr_cat_op, however, will still be there.
6)  Related changes on the tools side and in the documentation should
also be made.
The list looks long, but overall things become simpler (of course).
Suggestions are welcome.
Chao
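To make items 1) and 4) of the list above more concrete, here is a minimal sketch of what a single global descriptor with one flat socket_cbms table might look like. Only the names psr_cat, psr_cat_cbm, socket_cbms, cos_max and cbm_len come from the mail; the field layout of psr_cat_cbm, the nr_sockets member, and the helpers psr_cat_alloc() and psr_cat_slot() are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Field layout is an assumption; the mail only names the struct. */
struct psr_cat_cbm {
    uint64_t cbm;       /* capacity bitmask for this COS */
    unsigned int ref;   /* number of users of this COS */
};

/* One global descriptor replacing the per-socket psr_cat_socket_info. */
struct psr_cat {
    unsigned int nr_sockets;
    unsigned int cos_max;             /* taken from the boot CPU */
    unsigned int cbm_len;             /* taken from the boot CPU */
    struct psr_cat_cbm *socket_cbms;  /* nr_sockets * (cos_max + 1) entries */
};

/* Hypothetical helper: allocate the descriptor and the flat table once. */
static struct psr_cat *psr_cat_alloc(unsigned int nr_sockets,
                                     unsigned int cos_max,
                                     unsigned int cbm_len)
{
    struct psr_cat *cat;

    assert(nr_sockets > 0);

    cat = calloc(1, sizeof(*cat));
    if ( cat == NULL )
        return NULL;

    cat->socket_cbms = calloc((size_t)nr_sockets * (cos_max + 1),
                              sizeof(*cat->socket_cbms));
    if ( cat->socket_cbms == NULL )
    {
        free(cat);
        return NULL;
    }

    cat->nr_sockets = nr_sockets;
    cat->cos_max = cos_max;
    cat->cbm_len = cbm_len;
    return cat;
}

/* Hypothetical helper: entry for (socket, cos), or NULL if out of range. */
static struct psr_cat_cbm *psr_cat_slot(const struct psr_cat *cat,
                                        unsigned int socket, unsigned int cos)
{
    if ( cat == NULL || socket >= cat->nr_sockets || cos > cat->cos_max )
        return NULL;
    return &cat->socket_cbms[(size_t)socket * (cat->cos_max + 1) + cos];
}
```

With this layout a NULL descriptor doubles as the "CAT not available" check from item 2), and each socket still owns its own run of cos_max + 1 entries.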
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 03/13] x86: maintain COS to CBM mapping for each socket
  2015-06-01  8:05           ` Chao Peng
@ 2015-06-01  8:36             ` Jan Beulich
  2015-06-01  8:56               ` Chao Peng
  0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2015-06-01  8:36 UTC (permalink / raw)
  To: Chao Peng
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
>>> On 01.06.15 at 10:05, <chao.p.peng@linux.intel.com> wrote:
> 2)  Unfeasible to allocate memory first and do initialization later in
> cpu hotplug notifications. My former approach is performing both the
> allocation and initialization in the CPU_STARTING, which is not a good
> idea indicated by Jan.
Considering
         info->cos_max = min(opt_cos_max, edx & 0xffff);
I don't see why you couldn't do an allocation using opt_cos_max
in CPU_UP_PREPARE, potentially not using all of the slots later
on.
Jan
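As a rough sketch of this suggestion: size the table from opt_cos_max before the CPU is up, then clamp cos_max once the CPU can report its own limit. Apart from opt_cos_max, cos_max and the min() expression quoted above, every name here (cat_socket_info, cat_prepare(), cat_starting(), cat_unused_slots()) is hypothetical, and the edx value is passed in rather than read via CPUID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Upper bound chosen at boot; 255 is the default mentioned in the thread. */
static unsigned int opt_cos_max = 255;

struct cat_socket_info {
    unsigned int cos_max;   /* usable COS limit, known only once the CPU runs */
    uint64_t *cos_to_cbm;   /* always opt_cos_max + 1 slots */
};

/* CPU_UP_PREPARE side: allocate for the worst case, no hardware access. */
static int cat_prepare(struct cat_socket_info *info)
{
    assert(info != NULL);
    info->cos_max = 0;
    info->cos_to_cbm = calloc((size_t)opt_cos_max + 1,
                              sizeof(*info->cos_to_cbm));
    return info->cos_to_cbm != NULL ? 0 : -1;
}

/* CPU_STARTING side: edx stands in for the value the new CPU reports. */
static void cat_starting(struct cat_socket_info *info, uint32_t edx)
{
    unsigned int hw_cos_max = edx & 0xffff;

    assert(info != NULL && info->cos_to_cbm != NULL);
    /* Same clamp as: info->cos_max = min(opt_cos_max, edx & 0xffff); */
    info->cos_max = hw_cos_max < opt_cos_max ? hw_cos_max : opt_cos_max;
}

/* Slots that were allocated but will never be used on this socket. */
static size_t cat_unused_slots(const struct cat_socket_info *info)
{
    return opt_cos_max - info->cos_max;
}
```

The cost is the unused tail of the array: in this sketch, a socket reporting a COS limit of 15 under the default opt_cos_max of 255 leaves 240 eight-byte slots idle.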
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 03/13] x86: maintain COS to CBM mapping for each socket
  2015-06-01  8:36             ` Jan Beulich
@ 2015-06-01  8:56               ` Chao Peng
  0 siblings, 0 replies; 38+ messages in thread
From: Chao Peng @ 2015-06-01  8:56 UTC (permalink / raw)
  To: Jan Beulich
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
On Mon, Jun 01, 2015 at 09:36:15AM +0100, Jan Beulich wrote:
> >>> On 01.06.15 at 10:05, <chao.p.peng@linux.intel.com> wrote:
> > 2)  Unfeasible to allocate memory first and do initialization later in
> > cpu hotplug notifications. My former approach is performing both the
> > allocation and initialization in the CPU_STARTING, which is not a good
> > idea indicated by Jan.
> 
> Considering
> 
>          info->cos_max = min(opt_cos_max, edx & 0xffff);
> 
> I don't see why you couldn't do an allocation using opt_cos_max
> in CPU_UP_PREPARE, potentially not using all of the slots later
> on.
If memory waste is not a problem (especially since in most cases
opt_cos_max is at its default of 255), then this would be a solution.
Since you have suggested this, I think I can go down this path and still
have asymmetric systems supported.
Thanks,
Chao
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 01/13] x86: add socket_cpumask
  2015-05-29  8:52           ` Jan Beulich
@ 2015-06-02  6:35             ` Chao Peng
  2015-06-02  6:57               ` Jan Beulich
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Peng @ 2015-06-02  6:35 UTC (permalink / raw)
  To: Jan Beulich
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
On Fri, May 29, 2015 at 09:52:03AM +0100, Jan Beulich wrote:
> >>> On 29.05.15 at 10:28, <chao.p.peng@linux.intel.com> wrote:
> > On Fri, May 29, 2015 at 09:01:53AM +0100, Jan Beulich wrote:
> >> >>> On 29.05.15 at 04:35, <chao.p.peng@linux.intel.com> wrote:
> >> > On Thu, May 28, 2015 at 01:38:05PM +0100, Jan Beulich wrote:
> >> >> >>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
> >> >> > --- a/xen/arch/x86/mpparse.c
> >> >> > +++ b/xen/arch/x86/mpparse.c
> >> >> > @@ -87,6 +87,18 @@ void __init set_nr_cpu_ids(unsigned int max_cpus)
> >> >> >  #endif
> >> >> >  }
> >> >> >  
> >> >> > +void __init set_nr_sockets(void)
> >> >> > +{
> >> >> > +    unsigned int cpus = bitmap_weight(phys_cpu_present_map.mask,
> >> >> > +                                      boot_cpu_data.x86_max_cores *
> >> >> > +                                      boot_cpu_data.x86_num_siblings);
> >> >> > +
> >> >> > +    if ( cpus == 0 )
> >> >> > +        cpus = 1;
> >> >> > +
> >> >> > +    nr_sockets = DIV_ROUND_UP(num_processors + disabled_cpus, cpus);
> >> >> > +}
> >> >> 
> >> >> Is there a reason why this can't just be added to the end of the
> >> >> immediately preceding set_nr_cpu_ids()?
> >> > 
> >> > You mean the declaration or invocation? If the former I have no special
> >> > reason for it (e.g. I can change it).
> >> 
> >> Neither - I just don't see the need for a new function.
> > 
> > In which case the invocation of set_nr_cpu_ids() should move to the
> > place where now set_nr_sockets() is invoked, to make sure
> > boot_cpu_data.x86_max_cores/x86_num_siblings available, which may not be
> > your expectation.
> 
> Ah, in which case this _is_ the explanation, albeit only provided the
> use of the two boot_cpu_data fields has to remain (which I had put
> under question). And if these have to remain, couldn't this be done
> in a presmp initcall instead of an explicitly called function?
presmp is too late. nr_sockets will get used in smp_prepare_cpus()
before calling set_cpu_sibling_map for cpu 0.
Chao
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 01/13] x86: add socket_cpumask
  2015-06-02  6:35             ` Chao Peng
@ 2015-06-02  6:57               ` Jan Beulich
  2015-06-02  7:19                 ` Chao Peng
  0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2015-06-02  6:57 UTC (permalink / raw)
  To: Chao Peng
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
>>> On 02.06.15 at 08:35, <chao.p.peng@linux.intel.com> wrote:
> On Fri, May 29, 2015 at 09:52:03AM +0100, Jan Beulich wrote:
>> >>> On 29.05.15 at 10:28, <chao.p.peng@linux.intel.com> wrote:
>> > On Fri, May 29, 2015 at 09:01:53AM +0100, Jan Beulich wrote:
>> >> >>> On 29.05.15 at 04:35, <chao.p.peng@linux.intel.com> wrote:
>> >> > On Thu, May 28, 2015 at 01:38:05PM +0100, Jan Beulich wrote:
>> >> >> >>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
>> >> >> > --- a/xen/arch/x86/mpparse.c
>> >> >> > +++ b/xen/arch/x86/mpparse.c
>> >> >> > @@ -87,6 +87,18 @@ void __init set_nr_cpu_ids(unsigned int max_cpus)
>> >> >> >  #endif
>> >> >> >  }
>> >> >> >  
>> >> >> > +void __init set_nr_sockets(void)
>> >> >> > +{
>> >> >> > +    unsigned int cpus = bitmap_weight(phys_cpu_present_map.mask,
>> >> >> > +                                      boot_cpu_data.x86_max_cores *
>> >> >> > +                                      boot_cpu_data.x86_num_siblings);
>> >> >> > +
>> >> >> > +    if ( cpus == 0 )
>> >> >> > +        cpus = 1;
>> >> >> > +
>> >> >> > +    nr_sockets = DIV_ROUND_UP(num_processors + disabled_cpus, cpus);
>> >> >> > +}
>> >> >> 
>> >> >> Is there a reason why this can't just be added to the end of the
>> >> >> immediately preceding set_nr_cpu_ids()?
>> >> > 
>> >> > You mean the declaration or invocation? If the former I have no special
>> >> > reason for it (e.g. I can change it).
>> >> 
>> >> Neither - I just don't see the need for a new function.
>> > 
>> > In which case the invocation of set_nr_cpu_ids() should move to the
>> > place where now set_nr_sockets() is invoked, to make sure
>> > boot_cpu_data.x86_max_cores/x86_num_siblings available, which may not be
>> > your expectation.
>> 
>> Ah, in which case this _is_ the explanation, albeit only provided the
>> use of the two boot_cpu_data fields has to remain (which I had put
>> under question). And if these have to remain, couldn't this be done
>> in a presmp initcall instead of an explicitly called function?
> 
> presmp is too late. nr_sockets will get used in smp_prepare_cpus()
> before calling set_cpu_sibling_map for cpu 0.
Okay. In which case - why not calculate the value there?
Jan
^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v8 01/13] x86: add socket_cpumask
  2015-06-02  6:57               ` Jan Beulich
@ 2015-06-02  7:19                 ` Chao Peng
  0 siblings, 0 replies; 38+ messages in thread
From: Chao Peng @ 2015-06-02  7:19 UTC (permalink / raw)
  To: Jan Beulich
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, andrew.cooper3,
	dario.faggioli, Ian.Jackson, xen-devel, will.auld, keir, dgdegra
On Tue, Jun 02, 2015 at 07:57:55AM +0100, Jan Beulich wrote:
> >>> On 02.06.15 at 08:35, <chao.p.peng@linux.intel.com> wrote:
> > On Fri, May 29, 2015 at 09:52:03AM +0100, Jan Beulich wrote:
> >> >>> On 29.05.15 at 10:28, <chao.p.peng@linux.intel.com> wrote:
> >> > On Fri, May 29, 2015 at 09:01:53AM +0100, Jan Beulich wrote:
> >> >> >>> On 29.05.15 at 04:35, <chao.p.peng@linux.intel.com> wrote:
> >> >> > On Thu, May 28, 2015 at 01:38:05PM +0100, Jan Beulich wrote:
> >> >> >> >>> On 21.05.15 at 10:41, <chao.p.peng@linux.intel.com> wrote:
> >> >> >> > --- a/xen/arch/x86/mpparse.c
> >> >> >> > +++ b/xen/arch/x86/mpparse.c
> >> >> >> > @@ -87,6 +87,18 @@ void __init set_nr_cpu_ids(unsigned int max_cpus)
> >> >> >> >  #endif
> >> >> >> >  }
> >> >> >> >  
> >> >> >> > +void __init set_nr_sockets(void)
> >> >> >> > +{
> >> >> >> > +    unsigned int cpus = bitmap_weight(phys_cpu_present_map.mask,
> >> >> >> > +                                      boot_cpu_data.x86_max_cores *
> >> >> >> > +                                      boot_cpu_data.x86_num_siblings);
> >> >> >> > +
> >> >> >> > +    if ( cpus == 0 )
> >> >> >> > +        cpus = 1;
> >> >> >> > +
> >> >> >> > +    nr_sockets = DIV_ROUND_UP(num_processors + disabled_cpus, cpus);
> >> >> >> > +}
> >> >> >> 
> >> >> >> Is there a reason why this can't just be added to the end of the
> >> >> >> immediately preceding set_nr_cpu_ids()?
> >> >> > 
> >> >> > You mean the declaration or invocation? If the former I have no special
> >> >> > reason for it (e.g. I can change it).
> >> >> 
> >> >> Neither - I just don't see the need for a new function.
> >> > 
> >> > In which case the invocation of set_nr_cpu_ids() should move to the
> >> > place where now set_nr_sockets() is invoked, to make sure
> >> > boot_cpu_data.x86_max_cores/x86_num_siblings available, which may not be
> >> > your expectation.
> >> 
> >> Ah, in which case this _is_ the explanation, albeit only provided the
> >> use of the two boot_cpu_data fields has to remain (which I had put
> >> under question). And if these have to remain, couldn't this be done
> >> in a presmp initcall instead of an explicitly called function?
> > 
> > presmp is too late. nr_sockets will get used in smp_prepare_cpus()
> > before calling set_cpu_sibling_map for cpu 0.
> 
> Okay. In which case - why not calculate the value there?
Okay, then I just need to move the invocation of set_nr_sockets() from
__start_xen() to smp_prepare_cpus().
Chao
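For reference, the arithmetic in the quoted set_nr_sockets() can be tried in isolation. The sketch below is a standalone approximation: it models the present map as a single 64-bit word and assumes socket 0's CPUs occupy the lowest max_cores * num_siblings bits. weight_low_bits() and calc_nr_sockets() are hypothetical names, not functions from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Count the set bits among the lowest nbits of a 64-bit present map. */
static unsigned int weight_low_bits(uint64_t map, unsigned int nbits)
{
    unsigned int i, weight = 0;

    assert(nbits <= 64);
    for ( i = 0; i < nbits; i++ )
        weight += (unsigned int)((map >> i) & 1);

    return weight;
}

/*
 * CPUs actually present on socket 0 serve as the per-socket CPU count;
 * the total (enabled + disabled) is then divided by it, rounding up.
 */
static unsigned int calc_nr_sockets(uint64_t present_map,
                                    unsigned int max_cores,
                                    unsigned int num_siblings,
                                    unsigned int num_processors,
                                    unsigned int disabled_cpus)
{
    unsigned int cpus = weight_low_bits(present_map,
                                        max_cores * num_siblings);

    if ( cpus == 0 )
        cpus = 1;   /* avoid dividing by zero */

    return DIV_ROUND_UP(num_processors + disabled_cpus, cpus);
}
```

With 16 present CPUs, 4 cores and 2 threads per core, the window covers 8 bits, so cpus is 8 and the result is 2 sockets. Since the window size comes from boot_cpu_data, the calculation can only run after those fields are filled in, which is the ordering constraint discussed above.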
^ permalink raw reply	[flat|nested] 38+ messages in thread
Thread overview: 38+ messages
2015-05-21  8:41 [PATCH v8 00/13] enable Cache Allocation Technology (CAT) for VMs Chao Peng
2015-05-21  8:41 ` [PATCH v8 01/13] x86: add socket_cpumask Chao Peng
2015-05-28 12:38   ` Jan Beulich
2015-05-29  2:35     ` Chao Peng
2015-05-29  8:01       ` Jan Beulich
2015-05-29  8:28         ` Chao Peng
2015-05-29  8:52           ` Jan Beulich
2015-06-02  6:35             ` Chao Peng
2015-06-02  6:57               ` Jan Beulich
2015-06-02  7:19                 ` Chao Peng
2015-05-21  8:41 ` [PATCH v8 02/13] x86: detect and initialize Intel CAT feature Chao Peng
2015-05-28 12:54   ` Jan Beulich
2015-05-29  2:40     ` Chao Peng
2015-05-29  8:03       ` Jan Beulich
2015-05-21  8:41 ` [PATCH v8 03/13] x86: maintain COS to CBM mapping for each socket Chao Peng
2015-05-28 13:17   ` Jan Beulich
2015-05-29  2:43     ` Chao Peng
2015-05-29  8:06       ` Jan Beulich
2015-05-29  8:38         ` Chao Peng
2015-06-01  8:05           ` Chao Peng
2015-06-01  8:36             ` Jan Beulich
2015-06-01  8:56               ` Chao Peng
2015-05-21  8:41 ` [PATCH v8 04/13] x86: add COS information for each domain Chao Peng
2015-05-21  8:41 ` [PATCH v8 05/13] x86: expose CBM length and COS number information Chao Peng
2015-05-28 13:26   ` Jan Beulich
2015-05-28 15:46     ` Dario Faggioli
2015-05-29  2:47     ` Chao Peng
2015-05-29  8:07       ` Jan Beulich
2015-05-29  9:23         ` Dario Faggioli
2015-05-29  9:29           ` Jan Beulich
2015-05-21  8:41 ` [PATCH v8 06/13] x86: dynamically get/set CBM for a domain Chao Peng
2015-05-21  8:41 ` [PATCH v8 07/13] x86: add scheduling support for Intel CAT Chao Peng
2015-05-21  8:41 ` [PATCH v8 08/13] xsm: add CAT related xsm policies Chao Peng
2015-05-21  8:41 ` [PATCH v8 09/13] tools/libxl: minor name changes for CMT commands Chao Peng
2015-05-21  8:41 ` [PATCH v8 10/13] tools/libxl: add command to show PSR hardware info Chao Peng
2015-05-21  8:41 ` [PATCH v8 11/13] tools/libxl: introduce some socket helpers Chao Peng
2015-05-21  8:41 ` [PATCH v8 12/13] tools: add tools support for Intel CAT Chao Peng
2015-05-21  8:41 ` [PATCH v8 13/13] docs: add xl-psr.markdown Chao Peng