* [PATCH v4 00/12] enable Cache Allocation Technology (CAT) for VMs
From: Chao Peng @ 2015-04-09 9:18 UTC (permalink / raw)
To: xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
Ian.Jackson, will.auld, JBeulich, wei.liu2, dgdegra
Changes in v4:
* Address comments from Andrew and Ian (details in each patch).
* Split the COS/CBM management patch into 4 small patches.
* Add documentation xl-psr.markdown.
Changes in v3:
* Address comments from Jan and Ian (details in each patch).
* Add xl sample output in cover letter.
Changes in v2:
* Address comments from Konrad and Jan (details in each patch):
* Move all CAT-unrelated changes into the preparation patches.
This patch series enables the new Cache Allocation Technology (CAT) feature
found in Intel Broadwell and later server platforms. In Xen's implementation,
CAT is used to control cache allocation on a per-VM basis.
The detailed hardware spec can be found in section 17.15 of the Intel SDM [1].
The design for Xen can be found at [2].
patch1-2: preparation.
patch3-11: real work for CAT.
patch12: xl documentation for CMT/MBM/CAT.
[1] Intel SDM (http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf)
[2] CAT design for Xen (http://lists.xen.org/archives/html/xen-devel/2014-12/msg01382.html)
Chao Peng (12):
x86: clean up psr boot parameter parsing
x86: improve psr scheduling code
x86: detect and initialize Intel CAT feature
x86: maintain COS to CBM mapping for each socket
x86: maintain socket CPU mask for CAT
x86: add COS information for each domain
x86: expose CBM length and COS number information
x86: dynamically get/set CBM for a domain
x86: add scheduling support for Intel CAT
xsm: add CAT related xsm policies
tools: add tools support for Intel CAT
docs: add xl-psr.markdown
docs/man/xl.pod.1 | 38 +++
docs/misc/xen-command-line.markdown | 13 +-
docs/misc/xl-psr.markdown | 111 +++++++
tools/flask/policy/policy/modules/xen/xen.if | 2 +-
tools/flask/policy/policy/modules/xen/xen.te | 4 +-
tools/libxc/include/xenctrl.h | 15 +
tools/libxc/xc_psr.c | 76 +++++
tools/libxl/libxl.h | 26 ++
tools/libxl/libxl_psr.c | 168 +++++++++-
tools/libxl/libxl_types.idl | 10 +
tools/libxl/xl.h | 4 +
tools/libxl/xl_cmdimpl.c | 140 +++++++++
tools/libxl/xl_cmdtable.c | 12 +
xen/arch/x86/domain.c | 13 +-
xen/arch/x86/domctl.c | 18 ++
xen/arch/x86/psr.c | 446 ++++++++++++++++++++++++---
xen/arch/x86/sysctl.c | 18 ++
xen/include/asm-x86/cpufeature.h | 1 +
xen/include/asm-x86/domain.h | 5 +-
xen/include/asm-x86/msr-index.h | 1 +
xen/include/asm-x86/psr.h | 14 +-
xen/include/public/domctl.h | 12 +
xen/include/public/sysctl.h | 16 +
xen/xsm/flask/hooks.c | 6 +
xen/xsm/flask/policy/access_vectors | 6 +
25 files changed, 1120 insertions(+), 55 deletions(-)
create mode 100644 docs/misc/xl-psr.markdown
--
1.9.1
* [PATCH v4 01/12] x86: clean up psr boot parameter parsing
From: Chao Peng @ 2015-04-09 9:18 UTC (permalink / raw)
To: xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
Ian.Jackson, will.auld, JBeulich, wei.liu2, dgdegra
Change the type of opt_psr from bool_t to unsigned int so that it can hold
one bit per PSR feature. Introduce a new routine to parse a boolean parameter
so that both CMT and future PSR features such as CAT can use it.
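As a rough illustration of the idea (not the patch itself), the following standalone sketch keeps one bit per feature in a mask and lets a single helper handle "feature", "feature:1" and "feature:0". The names parse_bool_sketch and parse_psr_bool_sketch, and the set of accepted boolean strings, are assumptions made for this example only.

```c
#include <assert.h>
#include <string.h>

#define PSR_CMT (1u << 0)
#define PSR_CAT (1u << 1)   /* second feature bit, as introduced later in the series */

static unsigned int opt_psr;

/* Hypothetical stand-in for a boolean parser: 1 = true, 0 = false, -1 = unknown. */
static int parse_bool_sketch(const char *s)
{
    if ( !strcmp(s, "1") || !strcmp(s, "on") || !strcmp(s, "yes") )
        return 1;
    if ( !strcmp(s, "0") || !strcmp(s, "off") || !strcmp(s, "no") )
        return 0;
    return -1;
}

/* Set or clear 'mask' in opt_psr when 'name' matches 'feature'. */
static void parse_psr_bool_sketch(const char *name, const char *value,
                                  const char *feature, unsigned int mask)
{
    assert(name && feature);

    if ( strcmp(name, feature) )
        return;                 /* not this feature: nothing to do */

    if ( !value )
    {
        opt_psr |= mask;        /* bare "cmt" means "enable" */
        return;
    }

    switch ( parse_bool_sketch(value) )
    {
    case 1:
        opt_psr |= mask;
        break;
    case 0:
        opt_psr &= ~mask;       /* explicit "off" clears the bit */
        break;
    default:
        break;                  /* unknown value: leave the bit unchanged */
    }
}
```

With this shape, adding another boolean feature only needs one more call with a new name and mask.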
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
---
Changes in v4:
* change 'int bit' to 'unsigned int mask'.
* Remove printk that will never be called.
Changes in v3:
* Set "off" value explicitly if requested.
---
xen/arch/x86/psr.c | 39 +++++++++++++++++++++++----------------
1 file changed, 23 insertions(+), 16 deletions(-)
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 2ef83df..344de3c 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -26,11 +26,30 @@ struct psr_assoc {
};
struct psr_cmt *__read_mostly psr_cmt;
-static bool_t __initdata opt_psr;
+static unsigned int __initdata opt_psr;
static unsigned int __initdata opt_rmid_max = 255;
static uint64_t rmid_mask;
static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
+static void __init parse_psr_bool(char *s, char *value, char *feature,
+ unsigned int mask)
+{
+ if ( !strcmp(s, feature) )
+ {
+ if ( !value )
+ opt_psr |= mask;
+ else
+ {
+ int val_int = parse_bool(value);
+
+ if ( val_int == 0 )
+ opt_psr &= ~mask;
+ else if ( val_int == 1 )
+ opt_psr |= mask;
+ }
+ }
+}
+
static void __init parse_psr_param(char *s)
{
char *ss, *val_str;
@@ -44,21 +63,9 @@ static void __init parse_psr_param(char *s)
if ( val_str )
*val_str++ = '\0';
- if ( !strcmp(s, "cmt") )
- {
- if ( !val_str )
- opt_psr |= PSR_CMT;
- else
- {
- int val_int = parse_bool(val_str);
- if ( val_int == 1 )
- opt_psr |= PSR_CMT;
- else if ( val_int != 0 )
- printk("PSR: unknown cmt value: %s - CMT disabled!\n",
- val_str);
- }
- }
- else if ( val_str && !strcmp(s, "rmid_max") )
+ parse_psr_bool(s, val_str, "cmt", PSR_CMT);
+
+ if ( val_str && !strcmp(s, "rmid_max") )
opt_rmid_max = simple_strtoul(val_str, NULL, 0);
s = ss + 1;
--
1.9.1
* [PATCH v4 02/12] x86: improve psr scheduling code
From: Chao Peng @ 2015-04-09 9:18 UTC (permalink / raw)
To: xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
Ian.Jackson, will.auld, JBeulich, wei.liu2, dgdegra
Switching the RMID from the previous vcpu to the next vcpu only requires
writing MSR_IA32_PSR_ASSOC once. Writing it with the value of the next vcpu is
enough; there is no need to write '0' first. The idle domain has its RMID set
to 0, and because the MSR is already updated lazily, it is simply switched
like any other domain.
Also move the initialization of the per-CPU variable used for the lazy
update from the context switch path to CPU starting.
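The lazy update can be modelled outside the hypervisor roughly as follows. This is only a sketch: the MSR is replaced by a plain variable with a write counter (fake_msr, msr_writes), the per-CPU cache is a single global, and the 8-bit rmid_mask value is an assumption for the example.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t fake_msr;               /* stand-in for MSR_IA32_PSR_ASSOC */
static unsigned int msr_writes;         /* counts simulated MSR writes */
static uint64_t cached_val;             /* per-CPU cached value in the real code */
static const uint64_t rmid_mask = 0xff; /* assumed mask width for this sketch */

static void fake_wrmsr(uint64_t val)
{
    fake_msr = val;
    msr_writes++;
}

/* Switch to a domain with the given RMID, writing the MSR only on change. */
static void ctxt_switch_to_sketch(unsigned int rmid)
{
    uint64_t reg;

    assert(rmid <= rmid_mask);  /* the sketch assumes the RMID fits the mask */

    reg = (cached_val & ~rmid_mask) | (rmid & rmid_mask);
    if ( reg != cached_val )
    {
        fake_wrmsr(reg);
        cached_val = reg;
    }
}
```

Switching between two vcpus of the same domain, or between idle and idle, costs no MSR write in this model; only a real RMID change does.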
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
---
Changes in v4:
* Move psr_assoc_reg_read/psr_assoc_reg_write into psr_ctxt_switch_to.
* Use 0 instead of smp_processor_id() for boot cpu.
* add cpu parameter to psr_assoc_init.
Changes in v2:
* Move initialization for psr_assoc from context switch to CPU_STARTING.
---
xen/arch/x86/domain.c | 7 ++---
xen/arch/x86/psr.c | 75 ++++++++++++++++++++++++++++++++++-------------
xen/include/asm-x86/psr.h | 3 +-
3 files changed, 59 insertions(+), 26 deletions(-)
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 04c1898..695a2eb 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1444,8 +1444,6 @@ static void __context_switch(void)
{
memcpy(&p->arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES);
vcpu_save_fpu(p);
- if ( psr_cmt_enabled() )
- psr_assoc_rmid(0);
p->arch.ctxt_switch_from(p);
}
@@ -1470,11 +1468,10 @@ static void __context_switch(void)
}
vcpu_restore_fpu_eager(n);
n->arch.ctxt_switch_to(n);
-
- if ( psr_cmt_enabled() && n->domain->arch.psr_rmid > 0 )
- psr_assoc_rmid(n->domain->arch.psr_rmid);
}
+ psr_ctxt_switch_to(n->domain);
+
gdt = !is_pv_32on64_vcpu(n) ? per_cpu(gdt_table, cpu) :
per_cpu(compat_gdt_table, cpu);
if ( need_full_gdt(n) )
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 344de3c..6119c6e 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -22,7 +22,6 @@
struct psr_assoc {
uint64_t val;
- bool_t initialized;
};
struct psr_cmt *__read_mostly psr_cmt;
@@ -122,14 +121,6 @@ static void __init init_psr_cmt(unsigned int rmid_max)
printk(XENLOG_INFO "Cache Monitoring Technology enabled\n");
}
-static int __init init_psr(void)
-{
- if ( (opt_psr & PSR_CMT) && opt_rmid_max )
- init_psr_cmt(opt_rmid_max);
- return 0;
-}
-__initcall(init_psr);
-
/* Called with domain lock held, no psr specific lock needed */
int psr_alloc_rmid(struct domain *d)
{
@@ -175,26 +166,70 @@ void psr_free_rmid(struct domain *d)
d->arch.psr_rmid = 0;
}
-void psr_assoc_rmid(unsigned int rmid)
+static inline void psr_assoc_init(unsigned int cpu)
+{
+ struct psr_assoc *psra = &per_cpu(psr_assoc, cpu);
+
+ if ( psr_cmt_enabled() )
+ rdmsrl(MSR_IA32_PSR_ASSOC, psra->val);
+}
+
+static inline void psr_assoc_rmid(uint64_t *reg, unsigned int rmid)
+{
+ *reg = (*reg & ~rmid_mask) | (rmid & rmid_mask);
+}
+
+void psr_ctxt_switch_to(struct domain *d)
{
- uint64_t val;
- uint64_t new_val;
struct psr_assoc *psra = &this_cpu(psr_assoc);
+ uint64_t reg = psra->val;
+
+ if ( psr_cmt_enabled() )
+ psr_assoc_rmid(&reg, d->arch.psr_rmid);
- if ( !psra->initialized )
+ if ( reg != psra->val )
{
- rdmsrl(MSR_IA32_PSR_ASSOC, psra->val);
- psra->initialized = 1;
+ wrmsrl(MSR_IA32_PSR_ASSOC, reg);
+ psra->val = reg;
}
- val = psra->val;
+}
- new_val = (val & ~rmid_mask) | (rmid & rmid_mask);
- if ( val != new_val )
+static void psr_cpu_init(unsigned int cpu)
+{
+ psr_assoc_init(cpu);
+}
+
+static int cpu_callback(
+ struct notifier_block *nfb, unsigned long action, void *hcpu)
+{
+ unsigned int cpu = (unsigned long)hcpu;
+
+ switch ( action )
{
- wrmsrl(MSR_IA32_PSR_ASSOC, new_val);
- psra->val = new_val;
+ case CPU_STARTING:
+ psr_cpu_init(cpu);
+ break;
}
+
+ return NOTIFY_DONE;
+}
+
+static struct notifier_block cpu_nfb = {
+ .notifier_call = cpu_callback
+};
+
+static int __init psr_presmp_init(void)
+{
+ if ( (opt_psr & PSR_CMT) && opt_rmid_max )
+ init_psr_cmt(opt_rmid_max);
+
+ psr_cpu_init(0);
+ if ( psr_cmt_enabled() )
+ register_cpu_notifier(&cpu_nfb);
+
+ return 0;
}
+presmp_initcall(psr_presmp_init);
/*
* Local variables:
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index c6076e9..585350c 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -46,7 +46,8 @@ static inline bool_t psr_cmt_enabled(void)
int psr_alloc_rmid(struct domain *d);
void psr_free_rmid(struct domain *d);
-void psr_assoc_rmid(unsigned int rmid);
+
+void psr_ctxt_switch_to(struct domain *d);
#endif /* __ASM_PSR_H__ */
--
1.9.1
* [PATCH v4 03/12] x86: detect and initialize Intel CAT feature
From: Chao Peng @ 2015-04-09 9:18 UTC (permalink / raw)
To: xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
Ian.Jackson, will.auld, JBeulich, wei.liu2, dgdegra
Detect the Intel Cache Allocation Technology (CAT) feature and store the
cpuid information for later use. Currently only L3 cache allocation is
supported. The L3 CAT features may vary among sockets, so per-socket
feature information is stored. The initialization can happen either at
boot time or when CPUs are hot-plugged after booting.
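For reference, the field extraction done on the CPUID leaf 0x10, subleaf 1 output can be sketched as a pure function. The struct and function names here are made up for the example; the bit positions (EAX[4:0] holding the CBM length minus one, EDX[15:0] holding the highest COS number) follow what this patch reads.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two values the patch stores per socket. */
struct cat_l3_info_sketch {
    unsigned int cbm_len;   /* number of bits in a capacity bitmask */
    unsigned int cos_max;   /* highest usable COS number */
};

/* Decode the EAX/EDX values returned by CPUID leaf 0x10, subleaf 1. */
static struct cat_l3_info_sketch decode_cat_l3(uint32_t eax, uint32_t edx)
{
    struct cat_l3_info_sketch info;

    info.cbm_len = (eax & 0x1f) + 1;    /* EAX[4:0] is the length minus one */
    info.cos_max = edx & 0xffff;        /* EDX[15:0] is the maximum COS */

    assert(info.cbm_len >= 1 && info.cbm_len <= 32);
    return info;
}
```

For example, EAX = 0x13 and EDX = 0x3 would decode to a 20-bit CBM and COS numbers 0 to 3.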
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
---
Changes in v4:
* check X86_FEATURE_CAT available before doing initialization.
Changes in v3:
* Remove the num_sockets boot option; calculate it at boot time instead.
* Name hardcoded CAT cpuid leaf as PSR_CPUID_LEVEL_CAT.
Changes in v2:
* socket_num => num_sockets and fix several documentation issues.
* refactor boot line parameter parsing into a standalone patch.
* set opt_num_sockets = NR_CPUS when opt_num_sockets > NR_CPUS.
* replace CPU_ONLINE with CPU_STARTING and integrate that into scheduling
improvement patch.
* reimplement get_max_socket() with cpu_to_socket();
* cbm is still uint64 as there is a path forward for supporting long masks.
---
docs/misc/xen-command-line.markdown | 13 +++++--
xen/arch/x86/psr.c | 68 +++++++++++++++++++++++++++++++++++--
xen/include/asm-x86/cpufeature.h | 1 +
xen/include/asm-x86/psr.h | 3 ++
4 files changed, 81 insertions(+), 4 deletions(-)
diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 1dda1f0..9ad8801 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1122,9 +1122,9 @@ This option can be specified more than once (up to 8 times at present).
> `= <integer>`
### psr (Intel)
-> `= List of ( cmt:<boolean> | rmid_max:<integer> )`
+> `= List of ( cmt:<boolean> | rmid_max:<integer> | cat:<boolean> )`
-> Default: `psr=cmt:0,rmid_max:255`
+> Default: `psr=cmt:0,rmid_max:255,cat:0`
Platform Shared Resource(PSR) Services. Intel Haswell and later server
platforms offer information about the sharing of resources.
@@ -1134,6 +1134,11 @@ Monitoring ID(RMID) is used to bind the domain to corresponding shared
resource. RMID is a hardware-provided layer of abstraction between software
and logical processors.
+To use the PSR cache allocation service for a certain domain, a capacity
+bitmask (CBM) is used to bind the domain to the corresponding shared resource.
+CBM represents cache capacity and indicates the degree of overlap and isolation
+between domains.
+
The following resources are available:
* Cache Monitoring Technology (Haswell and later). Information regarding the
@@ -1144,6 +1149,10 @@ The following resources are available:
total/local memory bandwidth. Follow the same options with Cache Monitoring
Technology.
+* Cache Allocation Technology (Broadwell and later). Information regarding
+ the cache allocation.
+ * `cat` instructs Xen to enable/disable Cache Allocation Technology.
+
### reboot
> `= t[riple] | k[bd] | a[cpi] | p[ci] | P[ower] | e[fi] | n[o] [, [w]arm | [c]old]`
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 6119c6e..16c37dd 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -19,17 +19,36 @@
#include <asm/psr.h>
#define PSR_CMT (1<<0)
+#define PSR_CAT (1<<1)
+
+struct psr_cat_socket_info {
+ bool_t initialized;
+ bool_t enabled;
+ unsigned int cbm_len;
+ unsigned int cos_max;
+};
struct psr_assoc {
uint64_t val;
};
struct psr_cmt *__read_mostly psr_cmt;
+static struct psr_cat_socket_info *__read_mostly cat_socket_info;
+
static unsigned int __initdata opt_psr;
static unsigned int __initdata opt_rmid_max = 255;
+static unsigned int __read_mostly nr_sockets;
static uint64_t rmid_mask;
static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
+static unsigned int get_socket_count(void)
+{
+ unsigned int cpus_per_socket = boot_cpu_data.x86_max_cores *
+ boot_cpu_data.x86_num_siblings;
+
+ return DIV_ROUND_UP(nr_cpu_ids, cpus_per_socket);
+}
+
static void __init parse_psr_bool(char *s, char *value, char *feature,
unsigned int mask)
{
@@ -63,6 +82,7 @@ static void __init parse_psr_param(char *s)
*val_str++ = '\0';
parse_psr_bool(s, val_str, "cmt", PSR_CMT);
+ parse_psr_bool(s, val_str, "cat", PSR_CAT);
if ( val_str && !strcmp(s, "rmid_max") )
opt_rmid_max = simple_strtoul(val_str, NULL, 0);
@@ -194,8 +214,49 @@ void psr_ctxt_switch_to(struct domain *d)
}
}
+static void cat_cpu_init(unsigned int cpu)
+{
+ unsigned int eax, ebx, ecx, edx;
+ struct psr_cat_socket_info *info;
+ unsigned int socket;
+ const struct cpuinfo_x86 *c = cpu_data + cpu;
+
+ if ( !cpu_has(c, X86_FEATURE_CAT) )
+ return;
+
+ socket = cpu_to_socket(cpu);
+ ASSERT(socket < nr_sockets);
+
+ info = cat_socket_info + socket;
+
+ /* Avoid initializing more than one times for the same socket. */
+ if ( test_and_set_bool(info->initialized) )
+ return;
+
+ cpuid_count(PSR_CPUID_LEVEL_CAT, 0, &eax, &ebx, &ecx, &edx);
+ if ( ebx & PSR_RESOURCE_TYPE_L3 )
+ {
+ cpuid_count(PSR_CPUID_LEVEL_CAT, 1, &eax, &ebx, &ecx, &edx);
+ info->cbm_len = (eax & 0x1f) + 1;
+ info->cos_max = (edx & 0xffff);
+
+ info->enabled = 1;
+ printk(XENLOG_INFO "CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
+ socket, info->cos_max, info->cbm_len);
+ }
+}
+
+static void __init init_psr_cat(void)
+{
+ nr_sockets = get_socket_count();
+ cat_socket_info = xzalloc_array(struct psr_cat_socket_info, nr_sockets);
+}
+
static void psr_cpu_init(unsigned int cpu)
{
+ if ( cat_socket_info )
+ cat_cpu_init(cpu);
+
psr_assoc_init(cpu);
}
@@ -223,9 +284,12 @@ static int __init psr_presmp_init(void)
if ( (opt_psr & PSR_CMT) && opt_rmid_max )
init_psr_cmt(opt_rmid_max);
+ if ( opt_psr & PSR_CAT )
+ init_psr_cat();
+
psr_cpu_init(0);
- if ( psr_cmt_enabled() )
- register_cpu_notifier(&cpu_nfb);
+ if ( psr_cmt_enabled() || cat_socket_info )
+ register_cpu_notifier(&cpu_nfb);
return 0;
}
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 7963a3a..8c0f0a6 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -149,6 +149,7 @@
#define X86_FEATURE_CMT (7*32+12) /* Cache Monitoring Technology */
#define X86_FEATURE_NO_FPU_SEL (7*32+13) /* FPU CS/DS stored as zero */
#define X86_FEATURE_MPX (7*32+14) /* Memory Protection Extensions */
+#define X86_FEATURE_CAT (7*32+15) /* Cache Allocation Technology */
#define X86_FEATURE_RDSEED (7*32+18) /* RDSEED instruction */
#define X86_FEATURE_ADX (7*32+19) /* ADCX, ADOX instructions */
#define X86_FEATURE_SMAP (7*32+20) /* Supervisor Mode Access Prevention */
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index 585350c..3bc5496 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -18,6 +18,9 @@
#include <xen/types.h>
+/* CAT cpuid level */
+#define PSR_CPUID_LEVEL_CAT 0x10
+
/* Resource Type Enumeration */
#define PSR_RESOURCE_TYPE_L3 0x2
--
1.9.1
* [PATCH v4 04/12] x86: maintain COS to CBM mapping for each socket
From: Chao Peng @ 2015-04-09 9:18 UTC (permalink / raw)
To: xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
Ian.Jackson, will.auld, JBeulich, wei.liu2, dgdegra
For each socket, a COS-to-CBM mapping structure is maintained. The
mapping is indexed by COS and the value is the corresponding CBM.
Since different VMs may use the same CBM, a reference count is used to
indicate whether an entry is still in use or available.
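A userspace model of the per-socket table might look like the sketch below. It uses calloc in place of the hypervisor allocator, and the names cos_cbm_sketch and cos_map_alloc are invented for the example; the all-ones default for COS 0 mirrors what the patch sets up.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One entry per COS: the bitmask it maps to and how many users it has. */
struct cos_cbm_sketch {
    unsigned int ref;
    uint64_t cbm;
};

/* Allocate a zeroed table with cos_max + 1 entries; COS 0 gets the full mask. */
static struct cos_cbm_sketch *cos_map_alloc(unsigned int cos_max,
                                            unsigned int cbm_len)
{
    struct cos_cbm_sketch *map;

    assert(cbm_len >= 1 && cbm_len <= 32);

    map = calloc(cos_max + 1u, sizeof(*map));
    if ( !map )
        return NULL;

    /* COS 0 is reserved as the default: every cache way is allowed. */
    map[0].cbm = (1ull << cbm_len) - 1;

    return map;
}
```

All other entries start with ref == 0, which in this model means "free".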
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
---
xen/arch/x86/psr.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 16c37dd..4aff5f6 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -21,11 +21,17 @@
#define PSR_CMT (1<<0)
#define PSR_CAT (1<<1)
+struct psr_cat_cbm {
+ unsigned int ref;
+ uint64_t cbm;
+};
+
struct psr_cat_socket_info {
bool_t initialized;
bool_t enabled;
unsigned int cbm_len;
unsigned int cos_max;
+ struct psr_cat_cbm *cos_cbm_map;
};
struct psr_assoc {
@@ -240,6 +246,14 @@ static void cat_cpu_init(unsigned int cpu)
info->cbm_len = (eax & 0x1f) + 1;
info->cos_max = (edx & 0xffff);
+ info->cos_cbm_map = xzalloc_array(struct psr_cat_cbm,
+ info->cos_max + 1UL);
+ if ( !info->cos_cbm_map )
+ return;
+
+ /* cos=0 is reserved as default cbm(all ones). */
+ info->cos_cbm_map[0].cbm = (1ull << info->cbm_len) - 1;
+
info->enabled = 1;
printk(XENLOG_INFO "CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
socket, info->cos_max, info->cbm_len);
--
1.9.1
* [PATCH v4 05/12] x86: maintain socket CPU mask for CAT
From: Chao Peng @ 2015-04-09 9:18 UTC (permalink / raw)
To: xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
Ian.Jackson, will.auld, JBeulich, wei.liu2, dgdegra
Some CAT resources/registers exist at socket level and must be
accessed from a CPU of the corresponding socket. It's common to pick
an arbitrary CPU from the socket. To make the picking easy, it's useful
to maintain a reference to the cpu_core_mask, which contains all the
siblings of a CPU in the same socket. The reference needs to be kept
in sync with CPU up/down events.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
---
xen/arch/x86/psr.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 4aff5f6..7de2504 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -32,6 +32,7 @@ struct psr_cat_socket_info {
unsigned int cbm_len;
unsigned int cos_max;
struct psr_cat_cbm *cos_cbm_map;
+ cpumask_t *socket_cpu_mask;
};
struct psr_assoc {
@@ -234,6 +235,8 @@ static void cat_cpu_init(unsigned int cpu)
ASSERT(socket < nr_sockets);
info = cat_socket_info + socket;
+ if ( info->socket_cpu_mask == NULL )
+ info->socket_cpu_mask = per_cpu(cpu_core_mask, cpu);
/* Avoid initializing more than one times for the same socket. */
if ( test_and_set_bool(info->initialized) )
@@ -274,6 +277,24 @@ static void psr_cpu_init(unsigned int cpu)
psr_assoc_init(cpu);
}
+static void psr_cpu_fini(unsigned int cpu)
+{
+ unsigned int socket, next;
+ cpumask_t *cpu_mask;
+
+ if ( cat_socket_info )
+ {
+ socket = cpu_to_socket(cpu);
+ cpu_mask = cat_socket_info[socket].socket_cpu_mask;
+
+ if ( (next = cpumask_cycle(cpu, cpu_mask)) == cpu )
+ cat_socket_info[socket].socket_cpu_mask = NULL;
+ else
+ cat_socket_info[socket].socket_cpu_mask =
+ per_cpu(cpu_core_mask, next);
+ }
+}
+
static int cpu_callback(
struct notifier_block *nfb, unsigned long action, void *hcpu)
{
@@ -284,6 +305,9 @@ static int cpu_callback(
case CPU_STARTING:
psr_cpu_init(cpu);
break;
+ case CPU_DYING:
+ psr_cpu_fini(cpu);
+ break;
}
return NOTIFY_DONE;
--
1.9.1
* [PATCH v4 06/12] x86: add COS information for each domain
From: Chao Peng @ 2015-04-09 9:18 UTC (permalink / raw)
To: xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
Ian.Jackson, will.auld, JBeulich, wei.liu2, dgdegra
In Xen's implementation, the CAT enforcement granularity is per domain.
Because the CBM length and the number of COS may differ between sockets,
each domain has a COS ID for each socket. The domain gets COS=0 by default,
and at runtime its COS is allocated dynamically when the user specifies
a CBM for the domain.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
---
xen/arch/x86/domain.c | 6 +++++-
xen/arch/x86/psr.c | 42 ++++++++++++++++++++++++++++++++++++++++++
xen/include/asm-x86/domain.h | 5 ++++-
xen/include/asm-x86/psr.h | 3 +++
4 files changed, 54 insertions(+), 2 deletions(-)
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 695a2eb..129d42f 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -616,6 +616,9 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
/* 64-bit PV guest by default. */
d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0;
+ if ( (rc = psr_domain_init(d)) != 0 )
+ goto fail;
+
/* initialize default tsc behavior in case tools don't */
tsc_set_info(d, TSC_MODE_DEFAULT, 0UL, 0, 0);
spin_lock_init(&d->arch.vtsc_lock);
@@ -634,6 +637,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
free_perdomain_mappings(d);
if ( is_pv_domain(d) )
free_xenheap_page(d->arch.pv_domain.gdt_ldt_l1tab);
+ psr_domain_free(d);
return rc;
}
@@ -657,7 +661,7 @@ void arch_domain_destroy(struct domain *d)
free_xenheap_page(d->shared_info);
cleanup_domain_irq_mapping(d);
- psr_free_rmid(d);
+ psr_domain_free(d);
}
void arch_domain_shutdown(struct domain *d)
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 7de2504..51faa70 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -221,6 +221,48 @@ void psr_ctxt_switch_to(struct domain *d)
}
}
+/* Called with domain lock held, no psr specific lock needed */
+static void psr_free_cos(struct domain *d)
+{
+ unsigned int socket;
+ unsigned int cos;
+
+ if( !d->arch.psr_cos_ids )
+ return;
+
+ for ( socket = 0; socket < nr_sockets; socket++ )
+ {
+ if ( !cat_socket_info[socket].enabled )
+ continue;
+
+ if ( (cos = d->arch.psr_cos_ids[socket]) == 0 )
+ continue;
+
+ cat_socket_info[socket].cos_cbm_map[cos].ref--;
+ }
+
+ xfree(d->arch.psr_cos_ids);
+ d->arch.psr_cos_ids = NULL;
+}
+
+int psr_domain_init(struct domain *d)
+{
+ if ( cat_socket_info )
+ {
+ d->arch.psr_cos_ids = xzalloc_array(unsigned int, nr_sockets);
+ if ( !d->arch.psr_cos_ids )
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+void psr_domain_free(struct domain *d)
+{
+ psr_free_rmid(d);
+ psr_free_cos(d);
+}
+
static void cat_cpu_init(unsigned int cpu)
{
unsigned int eax, ebx, ecx, edx;
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 9cdffa8..9c4d0e6 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -333,7 +333,10 @@ struct arch_domain
struct e820entry *e820;
unsigned int nr_e820;
- unsigned int psr_rmid; /* RMID assigned to the domain for CMT */
+ /* RMID assigned to the domain for CMT */
+ unsigned int psr_rmid;
+ /* COS assigned to the domain for each socket */
+ unsigned int *psr_cos_ids;
/* Shared page for notifying that explicit PIRQ EOI is required. */
unsigned long *pirq_eoi_map;
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index 3bc5496..45392bf 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -52,6 +52,9 @@ void psr_free_rmid(struct domain *d);
void psr_ctxt_switch_to(struct domain *d);
+int psr_domain_init(struct domain *d);
+void psr_domain_free(struct domain *d);
+
#endif /* __ASM_PSR_H__ */
/*
--
1.9.1
* [PATCH v4 07/12] x86: expose CBM length and COS number information
From: Chao Peng @ 2015-04-09 9:18 UTC (permalink / raw)
To: xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
Ian.Jackson, will.auld, JBeulich, wei.liu2, dgdegra
General CAT information such as the maximum COS and the CBM length is exposed
to user space by a SYSCTL hypercall, to help user space construct the CBM.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
---
xen/arch/x86/psr.c | 31 +++++++++++++++++++++++++++++++
xen/arch/x86/sysctl.c | 18 ++++++++++++++++++
xen/include/asm-x86/psr.h | 3 +++
xen/include/public/sysctl.h | 16 ++++++++++++++++
4 files changed, 68 insertions(+)
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 51faa70..e390fd9 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -221,6 +221,37 @@ void psr_ctxt_switch_to(struct domain *d)
}
}
+static int get_cat_socket_info(unsigned int socket,
+ struct psr_cat_socket_info **info)
+{
+ if ( !cat_socket_info )
+ return -ENODEV;
+
+ if ( socket >= nr_sockets )
+ return -EBADSLT;
+
+ if ( !cat_socket_info[socket].enabled )
+ return -ENOENT;
+
+ *info = cat_socket_info + socket;
+ return 0;
+}
+
+int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
+ uint32_t *cos_max)
+{
+ struct psr_cat_socket_info *info;
+ int ret = get_cat_socket_info(socket, &info);
+
+ if ( ret )
+ return ret;
+
+ *cbm_len = info->cbm_len;
+ *cos_max = info->cos_max;
+
+ return 0;
+}
+
/* Called with domain lock held, no psr specific lock needed */
static void psr_free_cos(struct domain *d)
{
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index 611a291..8a9e120 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -171,6 +171,24 @@ long arch_do_sysctl(
break;
+ case XEN_SYSCTL_psr_cat_op:
+ switch ( sysctl->u.psr_cat_op.cmd )
+ {
+ case XEN_SYSCTL_PSR_CAT_get_l3_info:
+ ret = psr_get_cat_l3_info(sysctl->u.psr_cat_op.target,
+ &sysctl->u.psr_cat_op.u.l3_info.cbm_len,
+ &sysctl->u.psr_cat_op.u.l3_info.cos_max);
+
+ if ( !ret && __copy_to_guest(u_sysctl, sysctl, 1) )
+ ret = -EFAULT;
+
+ break;
+ default:
+ ret = -EOPNOTSUPP;
+ break;
+ }
+ break;
+
default:
ret = -ENOSYS;
break;
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index 45392bf..3a8a406 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -52,6 +52,9 @@ void psr_free_rmid(struct domain *d);
void psr_ctxt_switch_to(struct domain *d);
+int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
+ uint32_t *cos_max);
+
int psr_domain_init(struct domain *d);
void psr_domain_free(struct domain *d);
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 8552dc6..91d90b8 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -656,6 +656,20 @@ struct xen_sysctl_psr_cmt_op {
typedef struct xen_sysctl_psr_cmt_op xen_sysctl_psr_cmt_op_t;
DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cmt_op_t);
+#define XEN_SYSCTL_PSR_CAT_get_l3_info 0
+struct xen_sysctl_psr_cat_op {
+ uint32_t cmd; /* IN: XEN_SYSCTL_PSR_CAT_* */
+ uint32_t target; /* IN: socket to be operated on */
+ union {
+ struct {
+ uint32_t cbm_len; /* OUT: CBM length */
+ uint32_t cos_max; /* OUT: Maximum COS */
+ } l3_info;
+ } u;
+};
+typedef struct xen_sysctl_psr_cat_op xen_sysctl_psr_cat_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cat_op_t);
+
struct xen_sysctl {
uint32_t cmd;
#define XEN_SYSCTL_readconsole 1
@@ -678,6 +692,7 @@ struct xen_sysctl {
#define XEN_SYSCTL_scheduler_op 19
#define XEN_SYSCTL_coverage_op 20
#define XEN_SYSCTL_psr_cmt_op 21
+#define XEN_SYSCTL_psr_cat_op 22
uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
union {
struct xen_sysctl_readconsole readconsole;
@@ -700,6 +715,7 @@ struct xen_sysctl {
struct xen_sysctl_scheduler_op scheduler_op;
struct xen_sysctl_coverage_op coverage_op;
struct xen_sysctl_psr_cmt_op psr_cmt_op;
+ struct xen_sysctl_psr_cat_op psr_cat_op;
uint8_t pad[128];
} u;
};
--
1.9.1
* [PATCH v4 08/12] x86: dynamically get/set CBM for a domain
From: Chao Peng @ 2015-04-09 9:18 UTC (permalink / raw)
To: xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
Ian.Jackson, will.auld, JBeulich, wei.liu2, dgdegra
For CAT, COS is maintained in the hypervisor only, while the CBM is exposed
to user space directly to allow getting/setting a domain's cache capacity.
For each specified CBM, the hypervisor will either use an existing COS which
has the same CBM or allocate a new one if no COS with that CBM is found. If
the allocation fails because not enough COS are available, an error is
returned. Getting/setting always operates on a specified socket.
For multi-socket systems, the interface may be called several times.
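As a rough illustration of the reuse-or-allocate policy described above, the following standalone sketch picks a COS for a requested CBM. It is a simplification under stated assumptions, not the code in this patch: `struct cos_entry` and `pick_cos` are hypothetical names, and the handling of the domain's old COS and the -EEXIST case is left out.

```c
#include <stdint.h>

/* Hypothetical per-socket table entry: the CBM programmed for a COS and
 * the number of domains currently using that COS. */
struct cos_entry {
    uint64_t cbm;
    unsigned int ref;
};

/*
 * Return the COS to use for 'cbm', or -1 if none is available.
 * Preference: a COS already holding 'cbm', then the first unused COS.
 * COS 0 is treated as the shared default and is never handed out as a
 * free slot (an assumption mirroring the cos != 0 check in the patch).
 */
static int pick_cos(const struct cos_entry *map, unsigned int cos_max,
                    uint64_t cbm)
{
    int unused = -1;
    unsigned int cos;

    for (cos = 0; cos <= cos_max; cos++) {
        if (map[cos].cbm == cbm)
            return (int)cos;            /* reuse a matching COS */
        if (unused < 0 && cos != 0 && map[cos].ref == 0)
            unused = (int)cos;          /* remember first free COS */
    }
    return unused;                      /* -1: no COS available */
}
```

A caller would map the -1 result to an error such as -EUSERS, and otherwise program the MSR only when the chosen entry's CBM differs from the requested one.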
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
---
xen/arch/x86/domctl.c | 18 ++++++
xen/arch/x86/psr.c | 126 ++++++++++++++++++++++++++++++++++++++++
xen/include/asm-x86/msr-index.h | 1 +
xen/include/asm-x86/psr.h | 2 +
xen/include/public/domctl.h | 12 ++++
5 files changed, 159 insertions(+)
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index d4f6ccf..89a6b33 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1326,6 +1326,24 @@ long arch_do_domctl(
}
break;
+ case XEN_DOMCTL_psr_cat_op:
+ switch ( domctl->u.psr_cat_op.cmd )
+ {
+ case XEN_DOMCTL_PSR_CAT_OP_SET_L3_CBM:
+ ret = psr_set_l3_cbm(d, domctl->u.psr_cat_op.target,
+ domctl->u.psr_cat_op.data);
+ break;
+ case XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM:
+ ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target,
+ &domctl->u.psr_cat_op.data);
+ copyback = 1;
+ break;
+ default:
+ ret = -EOPNOTSUPP;
+ break;
+ }
+ break;
+
default:
ret = iommu_do_domctl(domctl, d, u_domctl);
break;
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index e390fd9..5247bcd 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -56,6 +56,17 @@ static unsigned int get_socket_count(void)
return DIV_ROUND_UP(nr_cpu_ids, cpus_per_socket);
}
+static unsigned int get_socket_cpu(unsigned int socket)
+{
+ if ( socket < nr_sockets )
+ {
+ cpumask_t *cpu_mask = cat_socket_info[socket].socket_cpu_mask;
+ ASSERT(cpu_mask != NULL);
+ return cpumask_any(cpu_mask);
+ }
+ return nr_cpu_ids;
+}
+
static void __init parse_psr_bool(char *s, char *value, char *feature,
unsigned int mask)
{
@@ -252,6 +263,121 @@ int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
return 0;
}
+int psr_get_l3_cbm(struct domain *d, unsigned int socket, uint64_t *cbm)
+{
+ unsigned int cos;
+ struct psr_cat_socket_info *info;
+ int ret = get_cat_socket_info(socket, &info);
+
+ if ( ret )
+ return ret;
+
+ cos = d->arch.psr_cos_ids[socket];
+ *cbm = info->cos_cbm_map[cos].cbm;
+ return 0;
+}
+
+static bool_t psr_check_cbm(unsigned int cbm_len, uint64_t cbm)
+{
+ unsigned int first_bit, zero_bit;
+
+ /* Set bits should only be in the range of [0, cbm_len). */
+ if ( cbm & (~0ull << cbm_len) )
+ return 0;
+
+ first_bit = find_first_bit(&cbm, cbm_len);
+ zero_bit = find_next_zero_bit(&cbm, cbm_len, first_bit);
+
+ /* Set bits should be contiguous. */
+ if ( zero_bit < cbm_len &&
+ find_next_bit(&cbm, cbm_len, zero_bit) < cbm_len )
+ return 0;
+
+ return 1;
+}
+
+struct cos_cbm_info
+{
+ unsigned int cos;
+ uint64_t cbm;
+};
+
+static void do_write_l3_cbm(void *data)
+{
+ struct cos_cbm_info *info = data;
+ wrmsrl(MSR_IA32_PSR_L3_MASK(info->cos), info->cbm);
+}
+
+static int write_l3_cbm(unsigned int socket, unsigned int cos, uint64_t cbm)
+{
+ struct cos_cbm_info info = { .cos = cos, .cbm = cbm };
+
+ if ( socket == cpu_to_socket(smp_processor_id()) )
+ do_write_l3_cbm(&info);
+ else
+ {
+ unsigned int cpu = get_socket_cpu(socket);
+
+ if ( cpu >= nr_cpu_ids )
+ return -EBADSLT;
+ on_selected_cpus(cpumask_of(cpu), do_write_l3_cbm, &info, 1);
+ }
+
+ return 0;
+}
+
+int psr_set_l3_cbm(struct domain *d, unsigned int socket, uint64_t cbm)
+{
+ unsigned int old_cos, cos;
+ struct psr_cat_cbm *map, *find;
+ struct psr_cat_socket_info *info;
+ int ret = get_cat_socket_info(socket, &info);
+
+ if ( ret )
+ return ret;
+
+ if ( !psr_check_cbm(info->cbm_len, cbm) )
+ return -EINVAL;
+
+ old_cos = d->arch.psr_cos_ids[socket];
+ map = info->cos_cbm_map;
+ find = NULL;
+
+ for ( cos = 0; cos <= info->cos_max; cos++ )
+ {
+ /* If still not found, then keep unused one. */
+ if ( !find && cos != 0 && map[cos].ref == 0 )
+ find = map + cos;
+ else if ( map[cos].cbm == cbm )
+ {
+ if ( unlikely(cos == old_cos) )
+ return -EEXIST;
+ find = map + cos;
+ break;
+ }
+ }
+
+ /* If the old COS is referenced only by this domain, then reuse it. */
+ if ( !find && map[old_cos].ref == 1 )
+ find = map + old_cos;
+
+ if ( !find )
+ return -EUSERS;
+
+ cos = find - map;
+ if ( find->cbm != cbm )
+ {
+ ret = write_l3_cbm(socket, cos, cbm);
+ if ( ret )
+ return ret;
+ find->cbm = cbm;
+ }
+ find->ref++;
+ map[old_cos].ref--;
+ d->arch.psr_cos_ids[socket] = cos;
+ return 0;
+}
+
/* Called with domain lock held, no psr specific lock needed */
static void psr_free_cos(struct domain *d)
{
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 83f2f70..5425f77 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -327,6 +327,7 @@
#define MSR_IA32_CMT_EVTSEL 0x00000c8d
#define MSR_IA32_CMT_CTR 0x00000c8e
#define MSR_IA32_PSR_ASSOC 0x00000c8f
+#define MSR_IA32_PSR_L3_MASK(n) (0x00000c90 + (n))
/* Intel Model 6 */
#define MSR_P6_PERFCTR(n) (0x000000c1 + (n))
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index 3a8a406..fb474bb 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -54,6 +54,8 @@ void psr_ctxt_switch_to(struct domain *d);
int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
uint32_t *cos_max);
+int psr_get_l3_cbm(struct domain *d, unsigned int socket, uint64_t *cbm);
+int psr_set_l3_cbm(struct domain *d, unsigned int socket, uint64_t cbm);
int psr_domain_init(struct domain *d);
void psr_domain_free(struct domain *d);
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index ca0e51e..9f04836 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -1005,6 +1005,16 @@ struct xen_domctl_psr_cmt_op {
typedef struct xen_domctl_psr_cmt_op xen_domctl_psr_cmt_op_t;
DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cmt_op_t);
+struct xen_domctl_psr_cat_op {
+#define XEN_DOMCTL_PSR_CAT_OP_SET_L3_CBM 0
+#define XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM 1
+ uint32_t cmd; /* IN: XEN_DOMCTL_PSR_CAT_OP_* */
+ uint32_t target; /* IN: socket to be operated on */
+ uint64_t data; /* IN/OUT */
+};
+typedef struct xen_domctl_psr_cat_op xen_domctl_psr_cat_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cat_op_t);
+
struct xen_domctl {
uint32_t cmd;
#define XEN_DOMCTL_createdomain 1
@@ -1080,6 +1090,7 @@ struct xen_domctl {
#define XEN_DOMCTL_setvnumainfo 74
#define XEN_DOMCTL_psr_cmt_op 75
#define XEN_DOMCTL_arm_configure_domain 76
+#define XEN_DOMCTL_psr_cat_op 77
#define XEN_DOMCTL_gdbsx_guestmemio 1000
#define XEN_DOMCTL_gdbsx_pausevcpu 1001
#define XEN_DOMCTL_gdbsx_unpausevcpu 1002
@@ -1145,6 +1156,7 @@ struct xen_domctl {
struct xen_domctl_gdbsx_domstatus gdbsx_domstatus;
struct xen_domctl_vnuma vnuma;
struct xen_domctl_psr_cmt_op psr_cmt_op;
+ struct xen_domctl_psr_cat_op psr_cat_op;
uint8_t pad[128];
} u;
};
--
1.9.1
* [PATCH v4 09/12] x86: add scheduling support for Intel CAT
2015-04-09 9:18 [PATCH v4 00/12] enable Cache Allocation Technology (CAT) for VMs Chao Peng
` (7 preceding siblings ...)
2015-04-09 9:18 ` [PATCH v4 08/12] x86: dynamically get/set CBM for a domain Chao Peng
@ 2015-04-09 9:18 ` Chao Peng
2015-04-09 22:12 ` Andrew Cooper
2015-04-09 9:18 ` [PATCH v4 10/12] xsm: add CAT related xsm policies Chao Peng
` (3 subsequent siblings)
12 siblings, 1 reply; 37+ messages in thread
From: Chao Peng @ 2015-04-09 9:18 UTC (permalink / raw)
To: xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
Ian.Jackson, will.auld, JBeulich, wei.liu2, dgdegra
On context switch, write the domain's Class of Service (COS) to MSR
IA32_PQR_ASSOC, to notify hardware to use the new COS.
For performance reasons, the socket number and COS mask for the current CPU
are also cached in a local per-CPU variable.
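For readers unfamiliar with the register layout: IA32_PQR_ASSOC carries the RMID in its low bits and the COS in its upper 32 bits, so updating the COS is a masked write into bits 32 and up. The sketch below shows that arithmetic in isolation. The helper names are hypothetical, and `cos_field_mask` derives the field width by counting the bits needed to represent `cos_max`, which is this sketch's own choice rather than the `get_count_order()` expression used in the patch.

```c
#include <stdint.h>

/* Mask covering the COS field (starting at bit 32), wide enough to hold
 * any value in [0, cos_max]. Hypothetical helper for illustration. */
static uint64_t cos_field_mask(unsigned int cos_max)
{
    unsigned int bits = 0;

    while (bits < 32 && (cos_max >> bits) != 0)
        bits++;
    return ((1ull << bits) - 1) << 32;
}

/* Return 'reg' with its COS field replaced by 'cos'; the remaining bits
 * (e.g. the RMID in the low bits) are left untouched. */
static uint64_t assoc_set_cos(uint64_t reg, unsigned int cos,
                              uint64_t cos_mask)
{
    return (reg & ~cos_mask) | (((uint64_t)cos << 32) & cos_mask);
}
```

Caching the resulting value per CPU, as the patch does, lets the context-switch path skip the MSR write when neither the RMID nor the COS changes.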
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
---
Changes in v2:
* merge common scheduling changes into scheduling improvement patch.
* use readable expr for psra->cos_mask.
---
xen/arch/x86/psr.c | 33 ++++++++++++++++++++++++++++++++-
1 file changed, 32 insertions(+), 1 deletion(-)
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 5247bcd..046229d 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -37,6 +37,8 @@ struct psr_cat_socket_info {
struct psr_assoc {
uint64_t val;
+ unsigned int socket;
+ uint64_t cos_mask;
};
struct psr_cmt *__read_mostly psr_cmt;
@@ -206,9 +208,22 @@ void psr_free_rmid(struct domain *d)
static inline void psr_assoc_init(unsigned int cpu)
{
+ unsigned int socket;
+ struct psr_cat_socket_info *info;
struct psr_assoc *psra = &per_cpu(psr_assoc, cpu);
- if ( psr_cmt_enabled() )
+ if ( cat_socket_info )
+ {
+ socket = cpu_to_socket(cpu);
+ psra->socket = socket;
+
+ info = cat_socket_info + socket;
+ if ( info->enabled )
+ psra->cos_mask = ((1ull << get_count_order(info->cos_max)) - 1)
+ << 32;
+ }
+
+ if ( psr_cmt_enabled() || psra->cos_mask )
rdmsrl(MSR_IA32_PSR_ASSOC, psra->val);
}
@@ -217,6 +232,12 @@ static inline void psr_assoc_rmid(uint64_t *reg, unsigned int rmid)
*reg = (*reg & ~rmid_mask) | (rmid & rmid_mask);
}
+static inline void psr_assoc_cos(uint64_t *reg, unsigned int cos,
+ uint64_t cos_mask)
+{
+ *reg = (*reg & ~cos_mask) | (((uint64_t)cos << 32) & cos_mask);
+}
+
void psr_ctxt_switch_to(struct domain *d)
{
struct psr_assoc *psra = &this_cpu(psr_assoc);
@@ -225,11 +246,21 @@ void psr_ctxt_switch_to(struct domain *d)
if ( psr_cmt_enabled() )
psr_assoc_rmid(&reg, d->arch.psr_rmid);
+ if ( psra->cos_mask )
+ {
+ if ( d->arch.psr_cos_ids )
+ psr_assoc_cos(&reg, d->arch.psr_cos_ids[psra->socket],
+ psra->cos_mask);
+ else
+ psr_assoc_cos(&reg, 0, psra->cos_mask);
+ }
+
if ( reg != psra->val )
{
wrmsrl(MSR_IA32_PSR_ASSOC, reg);
psra->val = reg;
}
+
}
static int get_cat_socket_info(unsigned int socket,
--
1.9.1
* [PATCH v4 10/12] xsm: add CAT related xsm policies
2015-04-09 9:18 [PATCH v4 00/12] enable Cache Allocation Technology (CAT) for VMs Chao Peng
` (8 preceding siblings ...)
2015-04-09 9:18 ` [PATCH v4 09/12] x86: add scheduling support for Intel CAT Chao Peng
@ 2015-04-09 9:18 ` Chao Peng
2015-04-09 9:18 ` [PATCH v4 11/12] tools: add tools support for Intel CAT Chao Peng
` (2 subsequent siblings)
12 siblings, 0 replies; 37+ messages in thread
From: Chao Peng @ 2015-04-09 9:18 UTC (permalink / raw)
To: xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
Ian.Jackson, will.auld, JBeulich, wei.liu2, dgdegra
Add xsm policies for Cache Allocation Technology (CAT) related hypercalls
to restrict the functions' visibility to the control domain only.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
tools/flask/policy/policy/modules/xen/xen.if | 2 +-
tools/flask/policy/policy/modules/xen/xen.te | 4 +++-
xen/xsm/flask/hooks.c | 6 ++++++
xen/xsm/flask/policy/access_vectors | 6 ++++++
4 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/tools/flask/policy/policy/modules/xen/xen.if b/tools/flask/policy/policy/modules/xen/xen.if
index 2d32e1c..8bb081a 100644
--- a/tools/flask/policy/policy/modules/xen/xen.if
+++ b/tools/flask/policy/policy/modules/xen/xen.if
@@ -51,7 +51,7 @@ define(`create_domain_common', `
getaffinity setaffinity setvcpuextstate };
allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim
set_max_evtchn set_vnumainfo get_vnumainfo cacheflush
- psr_cmt_op configure_domain };
+ psr_cmt_op configure_domain psr_cat_op };
allow $1 $2:security check_context;
allow $1 $2:shadow enable;
allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op updatemp };
diff --git a/tools/flask/policy/policy/modules/xen/xen.te b/tools/flask/policy/policy/modules/xen/xen.te
index c0128aa..d431aaf 100644
--- a/tools/flask/policy/policy/modules/xen/xen.te
+++ b/tools/flask/policy/policy/modules/xen/xen.te
@@ -67,6 +67,7 @@ allow dom0_t xen_t:xen {
allow dom0_t xen_t:xen2 {
resource_op
psr_cmt_op
+ psr_cat_op
};
allow dom0_t xen_t:mmu memorymap;
@@ -80,7 +81,8 @@ allow dom0_t dom0_t:domain {
getpodtarget setpodtarget set_misc_info set_virq_handler
};
allow dom0_t dom0_t:domain2 {
- set_cpuid gettsc settsc setscheduler set_max_evtchn set_vnumainfo get_vnumainfo psr_cmt_op
+ set_cpuid gettsc settsc setscheduler set_max_evtchn set_vnumainfo
+ get_vnumainfo psr_cmt_op psr_cat_op
};
allow dom0_t dom0_t:resource { add remove };
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 05dafed..8964321 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -729,6 +729,9 @@ static int flask_domctl(struct domain *d, int cmd)
case XEN_DOMCTL_psr_cmt_op:
return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__PSR_CMT_OP);
+ case XEN_DOMCTL_psr_cat_op:
+ return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__PSR_CAT_OP);
+
case XEN_DOMCTL_arm_configure_domain:
return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__CONFIGURE_DOMAIN);
@@ -790,6 +793,9 @@ static int flask_sysctl(int cmd)
case XEN_SYSCTL_psr_cmt_op:
return avc_current_has_perm(SECINITSID_XEN, SECCLASS_XEN2,
XEN2__PSR_CMT_OP, NULL);
+ case XEN_SYSCTL_psr_cat_op:
+ return avc_current_has_perm(SECINITSID_XEN, SECCLASS_XEN2,
+ XEN2__PSR_CAT_OP, NULL);
default:
printk("flask_sysctl: Unknown op %d\n", cmd);
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 8f44b9d..8cc1ef3 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -84,6 +84,9 @@ class xen2
resource_op
# XEN_SYSCTL_psr_cmt_op
psr_cmt_op
+# XEN_SYSCTL_psr_cat_op
+ psr_cat_op
+
}
# Classes domain and domain2 consist of operations that a domain performs on
@@ -221,6 +224,9 @@ class domain2
psr_cmt_op
# XEN_DOMCTL_configure_domain
configure_domain
+# XEN_DOMCTL_psr_cat_op
+ psr_cat_op
+
}
# Similar to class domain, but primarily contains domctls related to HVM domains
--
1.9.1
* [PATCH v4 11/12] tools: add tools support for Intel CAT
2015-04-09 9:18 [PATCH v4 00/12] enable Cache Allocation Technology (CAT) for VMs Chao Peng
` (9 preceding siblings ...)
2015-04-09 9:18 ` [PATCH v4 10/12] xsm: add CAT related xsm policies Chao Peng
@ 2015-04-09 9:18 ` Chao Peng
2015-04-09 10:50 ` Wei Liu
2015-04-16 11:20 ` Ian Campbell
2015-04-09 9:18 ` [PATCH v4 12/12] docs: add xl-psr.markdown Chao Peng
2015-04-09 22:15 ` [PATCH v4 00/12] enable Cache Allocation Technology (CAT) for VMs Andrew Cooper
12 siblings, 2 replies; 37+ messages in thread
From: Chao Peng @ 2015-04-09 9:18 UTC (permalink / raw)
To: xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
Ian.Jackson, will.auld, JBeulich, wei.liu2, dgdegra
These are the xc/xl changes to support Intel Cache Allocation
Technology (CAT). Two commands are introduced:
- xl psr-cat-cbm-set [-s socket] <domain> <cbm>
Set the cache capacity bitmask (CBM) for a domain.
- xl psr-cat-show <domain>
Show Cache Allocation Technology information.
Examples:
[root@vmm-psr]# xl psr-cat-cbm-set 0 0xff
[root@vmm-psr]# xl psr-cat-show
Socket ID : 0
L3 Cache : 12288KB
Maximum COS : 15
CBM length : 12
Default CBM : 0xfff
ID NAME CBM
0 Domain-0 0xff
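The relationship between the values in the sample output above can be sketched as follows: a CBM length of 12 gives a default (all ways) mask of 0xfff, and a mask such as 0xff is acceptable because its set bits are contiguous and fit within that length. The helpers below are hypothetical standalone functions, not the hypervisor's `psr_check_cbm()`; rejecting an empty mask is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* All-ones mask of 'cbm_len' bits, i.e. the full cache. */
static uint64_t default_cbm(unsigned int cbm_len)
{
    return cbm_len >= 64 ? ~0ull : (1ull << cbm_len) - 1;
}

/* True if 'cbm' is non-empty, lies within [0, cbm_len) and its set bits
 * form one contiguous run. */
static bool cbm_is_valid(unsigned int cbm_len, uint64_t cbm)
{
    uint64_t low;

    if (cbm_len == 0 || cbm_len > 64)
        return false;
    if (cbm_len < 64 && (cbm >> cbm_len) != 0)
        return false;                   /* bits beyond cbm_len */
    if (cbm == 0)
        return false;                   /* assumption: empty mask rejected */

    /* Adding the lowest set bit carries through a contiguous run and
     * clears it entirely; any surviving overlap means a gap exists. */
    low = cbm & (~cbm + 1);
    return ((cbm + low) & cbm) == 0;
}
```

With a CBM length of 12, 0xff and 0xff0 would pass this check, while 0xf0f (a gap) and 0x1000 (out of range) would not.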
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
---
Changes in v4:
* Add example output in commit message.
* Make libxl__count_physical_sockets private to libxl_psr.c.
* Set errno in several error cases.
* Change libxl_psr_cat_get_l3_info to return all sockets information.
* Remove unused libxl_domain_info call.
Changes in v3:
* Add manpage.
* libxl_psr_cat_set/get_domain_data => libxl_psr_cat_set/get_cbm.
* Move libxl_count_physical_sockets into separate patch.
* Support LIBXL_PSR_TARGET_ALL for libxl_psr_cat_set_cbm.
* Clean up the print codes.
---
docs/man/xl.pod.1 | 31 ++++++++
tools/libxc/include/xenctrl.h | 15 ++++
tools/libxc/xc_psr.c | 76 +++++++++++++++++++
tools/libxl/libxl.h | 26 +++++++
tools/libxl/libxl_psr.c | 168 ++++++++++++++++++++++++++++++++++++++++--
tools/libxl/libxl_types.idl | 10 +++
tools/libxl/xl.h | 4 +
tools/libxl/xl_cmdimpl.c | 140 +++++++++++++++++++++++++++++++++++
tools/libxl/xl_cmdtable.c | 12 +++
9 files changed, 475 insertions(+), 7 deletions(-)
diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index b016272..dfab921 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -1492,6 +1492,37 @@ monitor types are:
=back
+=head1 CACHE ALLOCATION TECHNOLOGY
+
+Intel Broadwell and later server platforms offer capabilities to configure and
+make use of the Cache Allocation Technology (CAT) mechanisms, which enable more
+cache resources (i.e. L3 cache) to be made available for high priority
+applications. In Xen implementation, CAT is used to control cache allocation
+on VM basis. To enforce cache on a specific domain, just set capacity bitmasks
+(CBM) for the domain.
+
+=over 4
+
+=item B<psr-cat-cbm-set> [I<OPTIONS>] [I<domain-id>] [I<cbm>]
+
+Set cache capacity bitmasks(CBM) for a domain.
+
+B<OPTIONS>
+
+=over 4
+
+=item B<-s SOCKET>, B<--socket=SOCKET>
+
+Specify the socket to process, otherwise all sockets are processed.
+
+=back
+
+=item B<psr-cat-show> [I<domain-id>]
+
+Show CAT settings for a certain domain or all domains.
+
+=back
+
=head1 TO BE DOCUMENTED
We need better documentation for:
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index df18292..1373a46 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2692,6 +2692,12 @@ enum xc_psr_cmt_type {
XC_PSR_CMT_LOCAL_MEM_COUNT,
};
typedef enum xc_psr_cmt_type xc_psr_cmt_type;
+
+enum xc_psr_cat_type {
+ XC_PSR_CAT_L3_CBM = 1,
+};
+typedef enum xc_psr_cat_type xc_psr_cat_type;
+
int xc_psr_cmt_attach(xc_interface *xch, uint32_t domid);
int xc_psr_cmt_detach(xc_interface *xch, uint32_t domid);
int xc_psr_cmt_get_domain_rmid(xc_interface *xch, uint32_t domid,
@@ -2706,6 +2712,15 @@ int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid, uint32_t cpu,
uint32_t psr_cmt_type, uint64_t *monitor_data,
uint64_t *tsc);
int xc_psr_cmt_enabled(xc_interface *xch);
+
+int xc_psr_cat_set_domain_data(xc_interface *xch, uint32_t domid,
+ xc_psr_cat_type type, uint32_t target,
+ uint64_t data);
+int xc_psr_cat_get_domain_data(xc_interface *xch, uint32_t domid,
+ xc_psr_cat_type type, uint32_t target,
+ uint64_t *data);
+int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t socket,
+ uint32_t *cos_max, uint32_t *cbm_len);
#endif
#endif /* XENCTRL_H */
diff --git a/tools/libxc/xc_psr.c b/tools/libxc/xc_psr.c
index e367a80..d8b3a51 100644
--- a/tools/libxc/xc_psr.c
+++ b/tools/libxc/xc_psr.c
@@ -248,6 +248,82 @@ int xc_psr_cmt_enabled(xc_interface *xch)
return 0;
}
+int xc_psr_cat_set_domain_data(xc_interface *xch, uint32_t domid,
+ xc_psr_cat_type type, uint32_t target,
+ uint64_t data)
+{
+ DECLARE_DOMCTL;
+ uint32_t cmd;
+
+ switch ( type )
+ {
+ case XC_PSR_CAT_L3_CBM:
+ cmd = XEN_DOMCTL_PSR_CAT_OP_SET_L3_CBM;
+ break;
+ default:
+ errno = EINVAL;
+ return -1;
+ }
+
+ domctl.cmd = XEN_DOMCTL_psr_cat_op;
+ domctl.domain = (domid_t)domid;
+ domctl.u.psr_cat_op.cmd = cmd;
+ domctl.u.psr_cat_op.target = target;
+ domctl.u.psr_cat_op.data = data;
+
+ return do_domctl(xch, &domctl);
+}
+
+int xc_psr_cat_get_domain_data(xc_interface *xch, uint32_t domid,
+ xc_psr_cat_type type, uint32_t target,
+ uint64_t *data)
+{
+ int rc;
+ DECLARE_DOMCTL;
+ uint32_t cmd;
+
+ switch ( type )
+ {
+ case XC_PSR_CAT_L3_CBM:
+ cmd = XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM;
+ break;
+ default:
+ errno = EINVAL;
+ return -1;
+ }
+
+ domctl.cmd = XEN_DOMCTL_psr_cat_op;
+ domctl.domain = (domid_t)domid;
+ domctl.u.psr_cat_op.cmd = cmd;
+ domctl.u.psr_cat_op.target = target;
+
+ rc = do_domctl(xch, &domctl);
+
+ if ( !rc )
+ *data = domctl.u.psr_cat_op.data;
+
+ return rc;
+}
+
+int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t socket,
+ uint32_t *cos_max, uint32_t *cbm_len)
+{
+ int rc;
+ DECLARE_SYSCTL;
+
+ sysctl.cmd = XEN_SYSCTL_psr_cat_op;
+ sysctl.u.psr_cat_op.cmd = XEN_SYSCTL_PSR_CAT_get_l3_info;
+ sysctl.u.psr_cat_op.target = socket;
+
+ rc = xc_sysctl(xch, &sysctl);
+ if ( !rc )
+ {
+ *cos_max = sysctl.u.psr_cat_op.u.l3_info.cos_max;
+ *cbm_len = sysctl.u.psr_cat_op.u.l3_info.cbm_len;
+ }
+
+ return rc;
+}
/*
* Local variables:
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 5eec092..3800738 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -718,6 +718,13 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src);
* If this is defined, the Memory Bandwidth Monitoring feature is supported.
*/
#define LIBXL_HAVE_PSR_MBM 1
+
+/*
+ * LIBXL_HAVE_PSR_CAT
+ *
+ * If this is defined, the Cache Allocation Technology feature is supported.
+ */
+#define LIBXL_HAVE_PSR_CAT 1
#endif
typedef char **libxl_string_list;
@@ -1513,6 +1520,25 @@ int libxl_psr_cmt_get_sample(libxl_ctx *ctx,
uint64_t *tsc_r);
#endif
+#ifdef LIBXL_HAVE_PSR_CAT
+
+#define LIBXL_PSR_TARGET_ALL (~0U)
+int libxl_psr_cat_set_cbm(libxl_ctx *ctx, uint32_t domid,
+ libxl_psr_cbm_type type, uint32_t target,
+ uint64_t cbm);
+int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t domid,
+ libxl_psr_cbm_type type, uint32_t target,
+ uint64_t *cbm_r);
+
+/*
+ * On success, the function returns an array of elements in 'info',
+ * and the length in 'nr'. 'info' is from malloc so it must be freed
+ * by the caller.
+ */
+int libxl_psr_cat_get_l3_info(libxl_ctx *ctx, libxl_psr_cat_info **info,
+ uint32_t *nr);
+#endif
+
/* misc */
/* Each of these sets or clears the flag according to whether the
diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
index 3e1c792..ac26876 100644
--- a/tools/libxl/libxl_psr.c
+++ b/tools/libxl/libxl_psr.c
@@ -19,14 +19,37 @@
#define IA32_QM_CTR_ERROR_MASK (0x3ul << 62)
-static void libxl__psr_cmt_log_err_msg(libxl__gc *gc, int err)
+static void libxl__psr_log_err_msg(libxl__gc *gc, int err)
{
char *msg;
switch (err) {
case ENOSYS:
+ case EOPNOTSUPP:
msg = "unsupported operation";
break;
+ case ESRCH:
+ msg = "invalid domain ID";
+ break;
+ case EBADSLT:
+ msg = "socket is not supported";
+ break;
+ case EFAULT:
+ msg = "failed to exchange data with Xen";
+ break;
+ default:
+ msg = "unknown error";
+ break;
+ }
+
+ LOGE(ERROR, "%s", msg);
+}
+
+static void libxl__psr_cmt_log_err_msg(libxl__gc *gc, int err)
+{
+ char *msg;
+
+ switch (err) {
case ENODEV:
msg = "CMT is not supported in this system";
break;
@@ -39,15 +62,35 @@ static void libxl__psr_cmt_log_err_msg(libxl__gc *gc, int err)
case EUSERS:
msg = "no free RMID available";
break;
- case ESRCH:
- msg = "invalid domain ID";
+ default:
+ libxl__psr_log_err_msg(gc, err);
+ return;
+ }
+
+ LOGE(ERROR, "%s", msg);
+}
+
+static void libxl__psr_cat_log_err_msg(libxl__gc *gc, int err)
+{
+ char *msg;
+
+ switch (err) {
+ case ENODEV:
+ msg = "CAT is not supported in this system";
break;
- case EFAULT:
- msg = "failed to exchange data with Xen";
+ case ENOENT:
+ msg = "CAT is not enabled on the socket";
break;
- default:
- msg = "unknown error";
+ case EUSERS:
+ msg = "no free COS available";
+ break;
+ case EEXIST:
+ msg = "The same CBM is already set to this domain";
break;
+
+ default:
+ libxl__psr_log_err_msg(gc, err);
+ return;
}
LOGE(ERROR, "%s", msg);
@@ -73,6 +116,24 @@ static int libxl__pick_socket_cpu(libxl__gc *gc, uint32_t socketid)
return cpu;
}
+static int libxl__count_physical_sockets(libxl__gc *gc, uint32_t *sockets)
+{
+ int rc;
+ libxl_physinfo info;
+
+ libxl_physinfo_init(&info);
+
+ rc = libxl_get_physinfo(CTX, &info);
+ if (rc)
+ return rc;
+
+ *sockets = info.nr_cpus / info.threads_per_core
+ / info.cores_per_socket;
+
+ libxl_physinfo_dispose(&info);
+ return 0;
+}
+
int libxl_psr_cmt_attach(libxl_ctx *ctx, uint32_t domid)
{
GC_INIT(ctx);
@@ -247,6 +308,99 @@ out:
return rc;
}
+int libxl_psr_cat_set_cbm(libxl_ctx *ctx, uint32_t domid,
+ libxl_psr_cbm_type type, uint32_t target,
+ uint64_t cbm)
+{
+ GC_INIT(ctx);
+ int rc;
+ uint32_t i, nr_sockets;
+
+ if (target != LIBXL_PSR_TARGET_ALL) {
+ rc = xc_psr_cat_set_domain_data(ctx->xch, domid, type, target, cbm);
+ if (rc < 0) {
+ libxl__psr_cat_log_err_msg(gc, errno);
+ rc = ERROR_FAIL;
+ }
+ } else {
+ rc = libxl__count_physical_sockets(gc, &nr_sockets);
+ if (rc) {
+ LOGE(ERROR, "failed to get system socket count");
+ rc = ERROR_FAIL;
+ goto out;
+ }
+ for (i = 0; i < nr_sockets; i++) {
+ rc = xc_psr_cat_set_domain_data(ctx->xch, domid, type, i, cbm);
+ if (rc < 0) {
+ libxl__psr_cat_log_err_msg(gc, errno);
+ rc = ERROR_FAIL;
+ goto out;
+ }
+ }
+ }
+
+out:
+ GC_FREE;
+ return rc;
+}
+
+int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t domid,
+ libxl_psr_cbm_type type, uint32_t target,
+ uint64_t *cbm_r)
+{
+ GC_INIT(ctx);
+ int rc;
+
+ rc = xc_psr_cat_get_domain_data(ctx->xch, domid, type, target, cbm_r);
+ if (rc < 0) {
+ libxl__psr_cat_log_err_msg(gc, errno);
+ rc = ERROR_FAIL;
+ }
+
+ GC_FREE;
+ return rc;
+}
+
+int libxl_psr_cat_get_l3_info(libxl_ctx *ctx, libxl_psr_cat_info **info,
+ uint32_t *nr)
+{
+ GC_INIT(ctx);
+ int rc;
+ uint32_t i, nr_sockets;
+ libxl_psr_cat_info *ptr;
+
+ rc = libxl__count_physical_sockets(gc, &nr_sockets);
+ if (rc) {
+ LOGE(ERROR, "failed to get system socket count");
+ rc = ERROR_FAIL;
+ goto out;
+ }
+
+ ptr = malloc(nr_sockets * sizeof(libxl_psr_cat_info));
+ if (!ptr) {
+ LOGE(ERROR, "failed to allocate cat info");
+ rc = ERROR_FAIL;
+ goto out;
+ }
+
+ for (i = 0; i < nr_sockets; i++) {
+ rc = xc_psr_cat_get_l3_info(ctx->xch, i, &ptr[i].cos_max,
+ &ptr[i].cbm_len);
+ if (rc) {
+ libxl__psr_cat_log_err_msg(gc, errno);
+ rc = ERROR_FAIL;
+ free(ptr);
+ goto out;
+ }
+ }
+
+ *info = ptr;
+ *nr = nr_sockets;
+out:
+ GC_FREE;
+ return rc;
+}
+
/*
* Local variables:
* mode: C
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 47af340..a03ee04 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -699,3 +699,13 @@ libxl_psr_cmt_type = Enumeration("psr_cmt_type", [
(2, "TOTAL_MEM_COUNT"),
(3, "LOCAL_MEM_COUNT"),
])
+
+libxl_psr_cbm_type = Enumeration("psr_cbm_type", [
+ (0, "UNKNOWN"),
+ (1, "L3_CBM"),
+ ])
+
+libxl_psr_cat_info = Struct("psr_cat_info", [
+ ("cos_max", uint32),
+ ("cbm_len", uint32),
+ ])
diff --git a/tools/libxl/xl.h b/tools/libxl/xl.h
index 5bc138c..85fa997 100644
--- a/tools/libxl/xl.h
+++ b/tools/libxl/xl.h
@@ -117,6 +117,10 @@ int main_psr_cmt_attach(int argc, char **argv);
int main_psr_cmt_detach(int argc, char **argv);
int main_psr_cmt_show(int argc, char **argv);
#endif
+#ifdef LIBXL_HAVE_PSR_CAT
+int main_psr_cat_cbm_set(int argc, char **argv);
+int main_psr_cat_show(int argc, char **argv);
+#endif
void help(const char *command);
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 04faf98..0a5f436 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -8043,6 +8043,146 @@ int main_psr_cmt_show(int argc, char **argv)
}
#endif
+#ifdef LIBXL_HAVE_PSR_CAT
+static void psr_cat_print_one_domain_cbm(uint32_t domid, uint32_t socket)
+{
+ char *domain_name;
+ uint64_t cbm;
+
+ domain_name = libxl_domid_to_name(ctx, domid);
+ printf("%5d%25s", domid, domain_name);
+ free(domain_name);
+
+ if (!libxl_psr_cat_get_cbm(ctx, domid, LIBXL_PSR_CBM_TYPE_L3_CBM,
+ socket, &cbm))
+ printf("%#16"PRIx64, cbm);
+
+ printf("\n");
+}
+
+static int psr_cat_print_domain_cbm(uint32_t domid, uint32_t socket)
+{
+ int i, nr_domains;
+ libxl_dominfo *list;
+
+ if (domid != INVALID_DOMID) {
+ psr_cat_print_one_domain_cbm(domid, socket);
+ return 0;
+ }
+
+ if (!(list = libxl_list_domain(ctx, &nr_domains))) {
+ fprintf(stderr, "Failed to get domain list for cbm display\n");
+ return -1;
+ }
+
+ for (i = 0; i < nr_domains; i++)
+ psr_cat_print_one_domain_cbm(list[i].domid, socket);
+ libxl_dominfo_list_free(list, nr_domains);
+
+ printf("\n");
+
+ return 0;
+}
+
+static int psr_cat_print_socket(uint32_t domid, uint32_t socket,
+ libxl_psr_cat_info *info)
+{
+ uint32_t l3_cache_size;
+ int rc;
+
+ rc = libxl_psr_cmt_get_l3_cache_size(ctx, socket, &l3_cache_size);
+ if (rc) {
+ fprintf(stderr, "Failed to get l3 cache size for socket:%d\n", socket);
+ return -1;
+ }
+
+ /* Header */
+ printf("%-16s: %u\n", "Socket ID", socket);
+ printf("%-16s: %uKB\n", "L3 Cache", l3_cache_size);
+ printf("%-16s: %u\n", "Maximum COS", info->cos_max);
+ printf("%-16s: %u\n", "CBM length", info->cbm_len);
+ printf("%-16s: %#"PRIx64"\n", "Default CBM", (1ul << info->cbm_len) - 1);
+ printf("%5s%25s%16s\n", "ID", "NAME", "CBM");
+
+ return psr_cat_print_domain_cbm(domid, socket);
+}
+
+static int psr_cat_show(uint32_t domid)
+{
+ uint32_t socket, nr_sockets;
+ int rc;
+ libxl_psr_cat_info *info;
+
+ rc = libxl_psr_cat_get_l3_info(ctx, &info, &nr_sockets);
+ if (rc) {
+ fprintf(stderr, "Failed to get cat info\n");
+ return rc;
+ }
+
+ for (socket = 0; socket < nr_sockets; socket++) {
+ rc = psr_cat_print_socket(domid, socket, info + socket);
+ if (rc)
+ goto out;
+ }
+
+out:
+ free(info);
+ return rc;
+}
+
+int main_psr_cat_cbm_set(int argc, char **argv)
+{
+ uint32_t domid;
+ uint32_t target = LIBXL_PSR_TARGET_ALL;
+ libxl_psr_cbm_type type = LIBXL_PSR_CBM_TYPE_L3_CBM;
+ uint64_t cbm;
+ char *ptr;
+ int opt = 0;
+
+ static struct option opts[] = {
+ {"socket", 1, 0, 's'}, /* -s/--socket takes an argument */
+ {0, 0, 0, 0}
+ };
+
+ SWITCH_FOREACH_OPT(opt, "s:", opts, "psr-cat-cbm-set", 2) {
+ case 's':
+ target = strtol(optarg, NULL, 10);
+ break;
+ }
+
+ domid = find_domain(argv[optind]);
+ ptr = argv[optind + 1];
+ if (strlen(ptr) > 2 && ptr[0] == '0' && ptr[1] == 'x')
+ cbm = strtoll(ptr, NULL , 16);
+ else
+ cbm = strtoll(ptr, NULL , 10);
+
+ return libxl_psr_cat_set_cbm(ctx, domid, type, target, cbm);
+}
+
+int main_psr_cat_show(int argc, char **argv)
+{
+ int opt;
+ uint32_t domid;
+
+ SWITCH_FOREACH_OPT(opt, "", NULL, "psr-cat-show", 0) {
+ /* No options */
+ }
+
+ if (optind >= argc)
+ domid = INVALID_DOMID;
+ else if (optind == argc - 1)
+ domid = find_domain(argv[optind]);
+ else {
+ help("psr-cat-show");
+ return 2;
+ }
+
+ return psr_cat_show(domid);
+}
+
+#endif
+
/*
* Local variables:
* mode: C
diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c
index 22ab63b..c83a60b 100644
--- a/tools/libxl/xl_cmdtable.c
+++ b/tools/libxl/xl_cmdtable.c
@@ -542,6 +542,18 @@ struct cmd_spec cmd_table[] = {
"\"total_mem_bandwidth\": Show total memory bandwidth(KB/s)\n"
"\"local_mem_bandwidth\": Show local memory bandwidth(KB/s)\n",
},
+ { "psr-cat-cbm-set",
+ &main_psr_cat_cbm_set, 0, 1,
+ "Set cache capacity bitmasks(CBM) for a domain",
+ "-s <socket> Specify the socket to process, otherwise all sockets are processed\n"
+ "<Domain> <CBM>",
+ },
+ { "psr-cat-show",
+ &main_psr_cat_show, 0, 1,
+ "Show Cache Allocation Technology information",
+ "<Domain>",
+ },
+
#endif
};
--
1.9.1
* [PATCH v4 12/12] docs: add xl-psr.markdown
2015-04-09 9:18 [PATCH v4 00/12] enable Cache Allocation Technology (CAT) for VMs Chao Peng
` (10 preceding siblings ...)
2015-04-09 9:18 ` [PATCH v4 11/12] tools: add tools support for Intel CAT Chao Peng
@ 2015-04-09 9:18 ` Chao Peng
2015-04-09 11:29 ` Andrew Cooper
2015-04-16 11:58 ` Ian Campbell
2015-04-09 22:15 ` [PATCH v4 00/12] enable Cache Allocation Technology (CAT) for VMs Andrew Cooper
12 siblings, 2 replies; 37+ messages in thread
From: Chao Peng @ 2015-04-09 9:18 UTC (permalink / raw)
To: xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
Ian.Jackson, will.auld, JBeulich, wei.liu2, dgdegra
Add a document introducing the basic concepts and terms of the PSR family
of technologies and the xl/libxl interfaces.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
---
docs/man/xl.pod.1 | 7 +++
docs/misc/xl-psr.markdown | 111 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 118 insertions(+)
create mode 100644 docs/misc/xl-psr.markdown
diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index dfab921..b71d6e6 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -1472,6 +1472,9 @@ occupancy monitoring share the same set of underlying monitoring service. Once
a domain is attached to the monitoring service, monitoring data can be showed
for any of these monitoring types.
+See L<http://xenbits.xen.org/docs/unstable/misc/xl-psr.html> for more
+information.
+
=over 4
=item B<psr-cmt-attach> [I<domain-id>]
@@ -1501,6 +1504,9 @@ applications. In Xen implementation, CAT is used to control cache allocation
on VM basis. To enforce cache on a specific domain, just set capacity bitmasks
(CBM) for the domain.
+See L<http://xenbits.xen.org/docs/unstable/misc/xl-psr.html> for more
+information.
+
=over 4
=item B<psr-cat-cbm-set> [I<OPTIONS>] [I<domain-id>] [I<cbm>]
@@ -1546,6 +1552,7 @@ And the following documents on the xen.org website:
L<http://xenbits.xen.org/docs/unstable/misc/xl-network-configuration.html>
L<http://xenbits.xen.org/docs/unstable/misc/xl-disk-configuration.txt>
L<http://xenbits.xen.org/docs/unstable/misc/xsm-flask.txt>
+L<http://xenbits.xen.org/docs/unstable/misc/xl-psr.html>
For systems that don't automatically bring CPU online:
diff --git a/docs/misc/xl-psr.markdown b/docs/misc/xl-psr.markdown
new file mode 100644
index 0000000..44f6f8c
--- /dev/null
+++ b/docs/misc/xl-psr.markdown
@@ -0,0 +1,111 @@
+# Intel Platform Shared Resource Monitoring/Control in xl/libxl
+
+This document introduces Intel Platform Shared Resource Monitoring/Control
+technologies, their basic concepts and the xl/libxl interfaces.
+
+## Cache Monitoring Technology (CMT)
+
+Cache Monitoring Technology (CMT) is a new feature available on Intel Haswell
+and later server platforms that allows an OS or Hypervisor/VMM to determine
+the usage of cache (currently only L3 cache is supported) by applications
+running on the platform. A Resource Monitoring ID (RMID) is the abstraction of
+the application(s) that will be monitored for cache usage. The CMT hardware
+tracks cache utilization of memory accesses according to the RMID and reports
+monitored data via a counter register.
+
+For detailed information, please refer to Intel SDM chapter 17.14.
+
+In Xen's implementation, each domain in the system can be assigned an RMID
+independently, while RMID=0 is reserved for monitoring domains that don't
+enable the CMT service. The RMID is opaque to xl/libxl and is only used in the
+hypervisor.
+
+### xl interfaces
+
+A domain is assigned an RMID implicitly by attaching it to the CMT service:
+
+xl psr-cmt-attach <domid>
+
+After that, cache usage for the domain can be shown by:
+
+xl psr-cmt-show cache_occupancy <domid>
+
+Once monitoring is no longer needed, the domain can be detached from the
+CMT service by:
+
+xl psr-cmt-detach <domid>
+
+Attaching may fail because no free RMID is available. In that case,
+unused RMID(s) can be freed by detaching the corresponding domains from the
+CMT service. Maximum COS number in the system can also be obtained by:
+
+xl psr-cmt-show
+
+## Memory Bandwidth Monitoring (MBM)
+
+Memory Bandwidth Monitoring (MBM) is a new hardware feature available on Intel
+Broadwell and later server platforms which builds on the CMT infrastructure to
+allow monitoring of system memory bandwidth. It introduces two new monitoring
+event types to monitor system total/local memory bandwidth. The same RMID can
+be used to monitor both cache usage and memory bandwidth at the same time.
+
+For detailed information, please refer to Intel SDM chapter 17.14.
+
+In Xen's implementation, MBM shares the same underlying monitoring service
+with CMT and can be used to monitor memory bandwidth on a per-domain basis.
+
+The xl/libxl interface is the same as that of CMT. The difference is that the
+monitor type is the corresponding memory monitoring type (local_mem_bandwidth/
+total_mem_bandwidth) rather than cache_occupancy.
+
+## Cache Allocation Technology (CAT)
+
+Cache Allocation Technology (CAT) is a new feature available on Intel
+Broadwell and later server platforms that allows an OS or Hypervisor/VMM to
+partition cache allocation (i.e. L3 cache) based on application priority or
+Class of Service (COS). Each COS is configured using capacity bitmasks (CBM)
+which represent cache capacity and indicate the degree of overlap and
+isolation between classes. The system cache resource is divided into a number
+of minimum portions, which are then combined into subsets for cache
+partitioning. Each portion corresponds to a bit in the CBM, and a set bit
+means that the corresponding cache portion is available.
+
+For detailed information, please refer to Intel SDM chapter 17.15.
+
+In Xen's implementation, the CBM can be set and queried with the libxl/xl
+interfaces, but COS is maintained in the hypervisor only. The cache partition
+granularity is per domain: each domain has COS=0 assigned by default, and the
+corresponding CBM is all-ones, so all of the cache can be used by default.
+
+### xl interfaces
+
+The simplest way to change a domain's CBM from its default is running:
+
+xl psr-cat-cbm-set [OPTIONS] <domid> <cbm>
+
+where cbm is a decimal/hexadecimal number that represents the cache
+subset that can be used.
+
+A cbm is valid only when:
+
+ * Set bits only exist in the range of [0, cbm_len), where cbm_len can be
+ obtained with 'xl psr-cat-show'.
+ * All the set bits are contiguous.
+ * It is not the same as the current cbm of the domain.
+
+In a multi-socket system, the same cbm will be set on each socket by default.
+A per-socket cbm can be specified with the '--socket SOCKET' option.
+
+Setting the cbm may fail because not enough COS are available. In that
+case unused COS(es) may be freed by setting the CBM of all related domains to
+the default value (all-ones).
+
+System CAT information (such as maximum COS and CBM length) and per-domain CBM
+settings can be shown by:
+
+xl psr-cat-show
+
+## Reference
+
+[1] Intel SDM
+(http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html).
--
1.9.1
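The first two CBM validity rules listed in xl-psr.markdown above (set bits within [0, cbm_len), and all set bits contiguous) can be sketched as a small check. This is only an illustration under stated assumptions: cbm_is_valid() is a hypothetical name, not a libxl or hypervisor function, and rejecting an all-zero mask is an assumption made here rather than something the document states. The third rule (the new cbm must differ from the domain's current one) depends on per-domain state and is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): check a CBM against the
 * cbm_len reported by 'xl psr-cat-show'.
 */
static bool cbm_is_valid(uint64_t cbm, unsigned int cbm_len)
{
    uint64_t lowest, sum;

    assert(cbm_len >= 1 && cbm_len <= 64);

    /* Assumption: an empty mask is treated as invalid. */
    if (cbm == 0)
        return false;

    /* Rule 1: no set bits at or above cbm_len. */
    if (cbm_len < 64 && (cbm >> cbm_len) != 0)
        return false;

    /*
     * Rule 2: set bits are contiguous. Adding the lowest set bit to a
     * single run of ones carries through the whole run, so the sum shares
     * no bits with the original mask (it wraps to 0 if the run ends at
     * bit 63). Any second run survives the addition and is detected.
     */
    lowest = cbm & (~cbm + 1);
    sum = cbm + lowest;
    return (sum & cbm) == 0;
}
```

For example, with cbm_len = 8 the masks 0x0f and 0xf0 pass, while 0x05 (a gap between set bits) and 0x100 (a bit outside the range) are rejected.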
^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [PATCH v4 11/12] tools: add tools support for Intel CAT
2015-04-09 9:18 ` [PATCH v4 11/12] tools: add tools support for Intel CAT Chao Peng
@ 2015-04-09 10:50 ` Wei Liu
2015-04-16 11:20 ` Ian Campbell
1 sibling, 0 replies; 37+ messages in thread
From: Wei Liu @ 2015-04-09 10:50 UTC (permalink / raw)
To: Chao Peng
Cc: keir, Ian.Campbell, stefano.stabellini, andrew.cooper3,
Ian.Jackson, xen-devel, will.auld, JBeulich, wei.liu2, dgdegra
On Thu, Apr 09, 2015 at 05:18:24PM +0800, Chao Peng wrote:
[...]
> +int libxl_psr_cat_get_l3_info(libxl_ctx *ctx, libxl_psr_cat_info **info,
> + uint32_t *nr)
> +{
> + GC_INIT(ctx);
> + int rc;
> + uint32_t i, nr_sockets;
> + libxl_psr_cat_info *ptr;
> +
> + rc = libxl__count_physical_sockets(gc, &nr_sockets);
> + if (rc) {
> + LOGE(ERROR, "failed to get system socket count");
> + rc = ERROR_FAIL;
> + goto out;
> + }
> +
> + ptr = malloc(nr_sockets * sizeof(libxl_psr_cat_info));
This should be libxl__malloc(NOGC, ...);
Wei.
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v4 12/12] docs: add xl-psr.markdown
2015-04-09 9:18 ` [PATCH v4 12/12] docs: add xl-psr.markdown Chao Peng
@ 2015-04-09 11:29 ` Andrew Cooper
2015-04-10 7:45 ` Chao Peng
2015-04-16 11:58 ` Ian Campbell
1 sibling, 1 reply; 37+ messages in thread
From: Andrew Cooper @ 2015-04-09 11:29 UTC (permalink / raw)
To: Chao Peng, xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, will.auld,
JBeulich, wei.liu2, dgdegra
On 09/04/15 10:18, Chao Peng wrote:
> Add a document introducing the basic concepts and terms of the PSR family of
> technologies and the xl/libxl interfaces.
Very nice! A few minor comments...
>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> ---
> docs/man/xl.pod.1 | 7 +++
> docs/misc/xl-psr.markdown | 111 ++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 118 insertions(+)
> create mode 100644 docs/misc/xl-psr.markdown
>
> diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
> index dfab921..b71d6e6 100644
> --- a/docs/man/xl.pod.1
> +++ b/docs/man/xl.pod.1
> @@ -1472,6 +1472,9 @@ occupancy monitoring share the same set of underlying monitoring service. Once
> a domain is attached to the monitoring service, monitoring data can be showed
> for any of these monitoring types.
>
> +See L<http://xenbits.xen.org/docs/unstable/misc/xl-psr.html> for more
> +information.
> +
> =over 4
>
> =item B<psr-cmt-attach> [I<domain-id>]
> @@ -1501,6 +1504,9 @@ applications. In Xen implementation, CAT is used to control cache allocation
> on VM basis. To enforce cache on a specific domain, just set capacity bitmasks
> (CBM) for the domain.
>
> +See L<http://xenbits.xen.org/docs/unstable/misc/xl-psr.html> for more
> +information.
> +
> =over 4
>
> =item B<psr-cat-cbm-set> [I<OPTIONS>] [I<domain-id>] [I<cbm>]
> @@ -1546,6 +1552,7 @@ And the following documents on the xen.org website:
> L<http://xenbits.xen.org/docs/unstable/misc/xl-network-configuration.html>
> L<http://xenbits.xen.org/docs/unstable/misc/xl-disk-configuration.txt>
> L<http://xenbits.xen.org/docs/unstable/misc/xsm-flask.txt>
> +L<http://xenbits.xen.org/docs/unstable/misc/xl-psr.html>
>
> For systems that don't automatically bring CPU online:
>
> diff --git a/docs/misc/xl-psr.markdown b/docs/misc/xl-psr.markdown
> new file mode 100644
> index 0000000..44f6f8c
> --- /dev/null
> +++ b/docs/misc/xl-psr.markdown
> @@ -0,0 +1,111 @@
> +# Intel Platform Shared Resource Monitoring/Control in xl/libxl
> +
> +This document introduces Intel Platform Shared Resource Monitoring/Control
> +technologies, their basic concepts and the xl/libxl interfaces.
> +
> +## Cache Monitoring Technology (CMT)
> +
> +Cache Monitoring Technology (CMT) is a new feature available on Intel Haswell
> +and later server platforms that allows an OS or Hypervisor/VMM to determine
> +the usage of cache(currently only L3 cache supported)
L3, or LLC? This appears to be used ambiguously, but does have a
material impact for systems with L4 caches.
> by applications running
> +on the platform. A Resource Monitoring ID (RMID) is the abstraction of the
> +application(s) that will be monitored for its cache usage. The CMT hardware
> +tracks cache utilization of memory accesses according to the RMID and reports
> +monitored data via a counter register.
> +
> +Detailed information please refer to Intel SDM chapter 17.14.
Please put the chapter title as well, as the numbering does alter slowly
over time.
> +
> +In Xen's implementation, each domain in the system can be assigned a RMID
> +independently, while RMID=0 is reserved for monitoring domains that doesn't
> +enable CMT service. RMID is opaque for xl/libxl and is only used in
> +hypervisor.
> +
> +### xl interfaces
> +
> +A domain is assigned a RMID implicitly by attaching it to CMT service:
> +
> +xl psr-cmt-attach domid
> +
> +After that, cache usage for the domain can be showed by:
> +
> +xl psr-cmt-show cache_occupancy <domid>
> +
> +Once monitoring is not needed any more, the domain can be detached from the
> +CMT service by:
> +
> +xl psr-cmt-detach domid
> +
> +The attaching may fail because of no free RMID available. In such case
> +unused RMID(s) can be freed by detaching corresponding domains from CMT
> +services. Maximum COS number in the system can also be obtained by:
You have not yet introduced COS as a term. Perhaps this bit is better
moving down to the CAT section?
> +
> +xl psr_cmt-show
"psr-cmt-show"
I am not sure how wise it is to dump information like max rmid/max cos
into cmt-show.
Is it perhaps worth having an `xl psr-hwinfo` (or equivalent) which will
dump the hardware capabilities, per-socket limits etc, as a concise way
to obtain all relevant information?
> +
> +## Memory Bandwidth Monitoring (MBM)
> +
> +Memory Bandwidth Monitoring(MBM) is a new hardware feature available on Intel
> +Broadwell and later server platforms which builds on the CMT infrastructure to
> +allow monitoring of system memory bandwidth. It introduces two new monitoring
> +event type to monitor system total/local memory bandwidth. The same RMID can
> +be used to monitor both cache usage and memory bandwidth at the same time.
> +
> +Detailed information please refer to Intel SDM chapter 17.14.
> +
> +In Xen's implementation, MBM shares the same set of underlying monitoring
> +service with CMT and can be used to monitor memory bandwidth on domain basis.
> +
> +The xl/libxl interface is the same with that of CMT. The difference is the
> +monitor type is corresponding memory monitoring type(local_mem_bandwidth/
> +total_mem_bandwidth) but not cache_occupancy.
> +
> +## Cache Allocation Technology (CAT)
> +
> +Cache Allocation Technology (CAT) is a new feature available on Intel
> +Broadwell and later server platforms that allows an OS or Hypervisor/VMM to
> +partition cache allocation(i.e. L3 cache) based on application priority or
> +Class of Service(COS). Each COS is configured using capacity bitmasks (CBM)
> +which represent cache capacity and indicate the degree of overlap and
> +isolation between classes. System cache resource is divided into numbers of
> +minimum portions which is then made up into subset for cache partition. Each
> +portion corresponds to a bit in CBM and the set bit represents the
> +corresponding cache portion is available.
> +
> +Detailed information please refer to Intel SDM chapter 17.15.
> +
> +In Xen's implementation, CBM can be set/get with libxl/xl interfaces but COS
Strictly speaking that should be "set/got" in English, but "configured"
would be a better alternative.
~Andrew
> +is maintained in hypervisor only. The cache partition granularity is per
> +domain, each domain has COS=0 assigned by default, the corresponding CBM is
> +all-ones, which means all the cache resource can be used by default.
> +
> +### xl interfaces
> +
> +The simplest way to change a domain's CBM from its default is running:
> +
> +xl psr-cat-cbm-set [OPTIONS] <domid> <cbm>
> +
> +where cbm is a decimal/hexadecimal number to represent the corresponding cache
> +subset can be used.
> +
> +A cbm is valid only when:
> +
> + * Set bits only exist in the range of [0, cbm_len), where cbm_len can be
> + obtained with 'xl psr-cat-show'.
> + * All the set bits is contiguous.
> + * Is not the same with the current cbm of the domain.
> +
> +In multi-sockets system, the same cbm will be set to each socket by default.
> +Per socket cbm can be specified with '--socket SOCKET' option.
> +
> +The cbm may be not set successfully because of no enough COS available. In such
> +case unused COS(es) may be freed by setting CBM of all related domains to its
> +default value(all-ones).
> +
> +System CAT information(such as maximum COS and CBM length) and per domain CBM
> +settings can be showed by:
> +
> +xl psr-cat-show
> +
> +## Reference
> +
> +[1] Intel SDM
> +(http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html).
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v4 01/12] x86: clean up psr boot parameter parsing
2015-04-09 9:18 ` [PATCH v4 01/12] x86: clean up psr boot parameter parsing Chao Peng
@ 2015-04-09 20:38 ` Andrew Cooper
0 siblings, 0 replies; 37+ messages in thread
From: Andrew Cooper @ 2015-04-09 20:38 UTC (permalink / raw)
To: Chao Peng, xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, will.auld,
JBeulich, wei.liu2, dgdegra
On 09/04/2015 10:18, Chao Peng wrote:
> Change type of opt_psr from bool to int so more psr features can fit.
>
> Introduce a new routine to parse bool parameter so that both cmt and
> future psr features like cat can use it.
>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v4 02/12] x86: improve psr scheduling code
2015-04-09 9:18 ` [PATCH v4 02/12] x86: improve psr scheduling code Chao Peng
@ 2015-04-09 21:01 ` Andrew Cooper
2015-04-10 7:24 ` Chao Peng
0 siblings, 1 reply; 37+ messages in thread
From: Andrew Cooper @ 2015-04-09 21:01 UTC (permalink / raw)
To: Chao Peng, xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, will.auld,
JBeulich, wei.liu2, dgdegra
On 09/04/2015 10:18, Chao Peng wrote:
> Switching RMID from the previous vcpu to the next vcpu only needs to write
> MSR_IA32_PSR_ASSOC once. Writing it with the value of the next vcpu is enough;
> there is no need to write '0' first. The idle domain has RMID set to 0 and,
> because the MSR is already updated lazily, it is simply switched as well.
>
> Also move the initialization of the per-CPU variable used for the lazy
> update from context switch to CPU starting.
>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> ---
> Changes in v4:
> * Move psr_assoc_reg_read/psr_assoc_reg_write into psr_ctxt_switch_to.
> * Use 0 instead of smp_processor_id() for boot cpu.
> * add cpu parameter to psr_assoc_init.
> Changes in v2:
> * Move initialization for psr_assoc from context switch to CPU_STARTING.
> ---
> xen/arch/x86/domain.c | 7 ++---
> xen/arch/x86/psr.c | 75 ++++++++++++++++++++++++++++++++++-------------
> xen/include/asm-x86/psr.h | 3 +-
> 3 files changed, 59 insertions(+), 26 deletions(-)
>
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index 04c1898..695a2eb 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1444,8 +1444,6 @@ static void __context_switch(void)
> {
> memcpy(&p->arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES);
> vcpu_save_fpu(p);
> - if ( psr_cmt_enabled() )
> - psr_assoc_rmid(0);
> p->arch.ctxt_switch_from(p);
> }
>
> @@ -1470,11 +1468,10 @@ static void __context_switch(void)
> }
> vcpu_restore_fpu_eager(n);
> n->arch.ctxt_switch_to(n);
> -
> - if ( psr_cmt_enabled() && n->domain->arch.psr_rmid > 0 )
> - psr_assoc_rmid(n->domain->arch.psr_rmid);
> }
>
> + psr_ctxt_switch_to(n->domain);
> +
> gdt = !is_pv_32on64_vcpu(n) ? per_cpu(gdt_table, cpu) :
> per_cpu(compat_gdt_table, cpu);
> if ( need_full_gdt(n) )
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index 344de3c..6119c6e 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -22,7 +22,6 @@
>
> struct psr_assoc {
> uint64_t val;
> - bool_t initialized;
> };
>
> struct psr_cmt *__read_mostly psr_cmt;
> @@ -122,14 +121,6 @@ static void __init init_psr_cmt(unsigned int rmid_max)
> printk(XENLOG_INFO "Cache Monitoring Technology enabled\n");
> }
>
> -static int __init init_psr(void)
> -{
> - if ( (opt_psr & PSR_CMT) && opt_rmid_max )
> - init_psr_cmt(opt_rmid_max);
> - return 0;
> -}
> -__initcall(init_psr);
> -
> /* Called with domain lock held, no psr specific lock needed */
> int psr_alloc_rmid(struct domain *d)
> {
> @@ -175,26 +166,70 @@ void psr_free_rmid(struct domain *d)
> d->arch.psr_rmid = 0;
> }
>
> -void psr_assoc_rmid(unsigned int rmid)
> +static inline void psr_assoc_init(unsigned int cpu)
> +{
> + struct psr_assoc *psra = &per_cpu(psr_assoc, cpu);
> +
> + if ( psr_cmt_enabled() )
> + rdmsrl(MSR_IA32_PSR_ASSOC, psra->val);
> +}
On further consideration, this would probably be better as a void
function which used this_cpu() rather than per_cpu().
Absolutely nothing good can come of calling it with cpu !=
smp_processor_id(), so we should avoid that situation arising in the
first place.
> +
> +static inline void psr_assoc_rmid(uint64_t *reg, unsigned int rmid)
> +{
> + *reg = (*reg & ~rmid_mask) | (rmid & rmid_mask);
> +}
> +
> +void psr_ctxt_switch_to(struct domain *d)
> {
> - uint64_t val;
> - uint64_t new_val;
> struct psr_assoc *psra = &this_cpu(psr_assoc);
> + uint64_t reg = psra->val;
> +
> + if ( psr_cmt_enabled() )
> + psr_assoc_rmid(&reg, d->arch.psr_rmid);
>
> - if ( !psra->initialized )
> + if ( reg != psra->val )
> {
> - rdmsrl(MSR_IA32_PSR_ASSOC, psra->val);
> - psra->initialized = 1;
> + wrmsrl(MSR_IA32_PSR_ASSOC, reg);
> + psra->val = reg;
> }
> - val = psra->val;
> +}
>
> - new_val = (val & ~rmid_mask) | (rmid & rmid_mask);
> - if ( val != new_val )
> +static void psr_cpu_init(unsigned int cpu)
> +{
> + psr_assoc_init(cpu);
> +}
This can also turn into a void helper.
Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
~Andrew
> +
> +static int cpu_callback(
> + struct notifier_block *nfb, unsigned long action, void *hcpu)
> +{
> + unsigned int cpu = (unsigned long)hcpu;
> +
> + switch ( action )
> {
> - wrmsrl(MSR_IA32_PSR_ASSOC, new_val);
> - psra->val = new_val;
> + case CPU_STARTING:
> + psr_cpu_init(cpu);
> + break;
> }
> +
> + return NOTIFY_DONE;
> +}
> +
> +static struct notifier_block cpu_nfb = {
> + .notifier_call = cpu_callback
> +};
> +
> +static int __init psr_presmp_init(void)
> +{
> + if ( (opt_psr & PSR_CMT) && opt_rmid_max )
> + init_psr_cmt(opt_rmid_max);
> +
> + psr_cpu_init(0);
> + if ( psr_cmt_enabled() )
> + register_cpu_notifier(&cpu_nfb);
> +
> + return 0;
> }
> +presmp_initcall(psr_presmp_init);
>
> /*
> * Local variables:
> diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
> index c6076e9..585350c 100644
> --- a/xen/include/asm-x86/psr.h
> +++ b/xen/include/asm-x86/psr.h
> @@ -46,7 +46,8 @@ static inline bool_t psr_cmt_enabled(void)
>
> int psr_alloc_rmid(struct domain *d);
> void psr_free_rmid(struct domain *d);
> -void psr_assoc_rmid(unsigned int rmid);
> +
> +void psr_ctxt_switch_to(struct domain *d);
>
> #endif /* __ASM_PSR_H__ */
>
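The lazy update in psr_ctxt_switch_to() quoted above can be modelled in user space to show why a single conditional write is enough. This is a sketch under assumptions, not the hypervisor code: the MSR and the per-CPU cache are replaced by plain variables, fake_wrmsrl() and ctxt_switch_to() are hypothetical names, and the 8-bit rmid_mask is chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for hardware and per-CPU state; all names are hypothetical. */
static uint64_t fake_msr;                 /* models MSR_IA32_PSR_ASSOC */
static uint64_t cached_val;               /* models this_cpu(psr_assoc).val */
static unsigned int msr_writes;           /* counts simulated MSR writes */
static const uint64_t rmid_mask = 0xff;   /* assumed width, for illustration */

static void fake_wrmsrl(uint64_t val)
{
    fake_msr = val;
    msr_writes++;
}

/* Switch to a domain with the given RMID, writing the MSR only on change. */
static void ctxt_switch_to(unsigned int rmid)
{
    uint64_t reg;

    /* The scheme relies on the cache always mirroring the register. */
    assert(cached_val == fake_msr);

    reg = (cached_val & ~rmid_mask) | (rmid & rmid_mask);
    if (reg != cached_val) {
        fake_wrmsrl(reg);
        cached_val = reg;
    }
}
```

Switching between two vcpus of the same domain, or from the idle domain (RMID 0) to an unmonitored domain (also RMID 0), costs no MSR write in this model; only a real change of RMID does.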
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v4 03/12] x86: detect and initialize Intel CAT feature
2015-04-09 9:18 ` [PATCH v4 03/12] x86: detect and initialize Intel CAT feature Chao Peng
@ 2015-04-09 21:30 ` Andrew Cooper
0 siblings, 0 replies; 37+ messages in thread
From: Andrew Cooper @ 2015-04-09 21:30 UTC (permalink / raw)
To: Chao Peng, xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, will.auld,
JBeulich, wei.liu2, dgdegra
On 09/04/2015 10:18, Chao Peng wrote:
> Detect Intel Cache Allocation Technology(CAT) feature and store the
> cpuid information for later use. Currently only L3 cache allocation is
> supported. The L3 CAT features may vary among sockets so per-socket
> feature information is stored. The initialization can happen either at
> boot time or when CPU(s) is hot plugged after booting.
>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> Changes in v4:
> * check X86_FEATURE_CAT available before doing initialization.
> Changes in v3:
> * Remove num_sockets boot option instead calculate it at boot time.
> * Name hardcoded CAT cpuid leaf as PSR_CPUID_LEVEL_CAT.
> Changes in v2:
> * socket_num => num_sockets and fix several documentation issues.
> * refactor boot line parameters parsing into standalone patch.
> * set opt_num_sockets = NR_CPUS when opt_num_sockets > NR_CPUS.
> * replace CPU_ONLINE with CPU_STARTING and integrate that into scheduling
> improvement patch.
> * reimplement get_max_socket() with cpu_to_socket();
> * cbm is still uint64 as there is a path forward for supporting long masks.
> ---
> docs/misc/xen-command-line.markdown | 13 +++++--
> xen/arch/x86/psr.c | 68 +++++++++++++++++++++++++++++++++++--
> xen/include/asm-x86/cpufeature.h | 1 +
> xen/include/asm-x86/psr.h | 3 ++
> 4 files changed, 81 insertions(+), 4 deletions(-)
>
> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> index 1dda1f0..9ad8801 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1122,9 +1122,9 @@ This option can be specified more than once (up to 8 times at present).
> > `= <integer>`
>
> ### psr (Intel)
> -> `= List of ( cmt:<boolean> | rmid_max:<integer> )`
> +> `= List of ( cmt:<boolean> | rmid_max:<integer> | cat:<boolean> )`
>
> -> Default: `psr=cmt:0,rmid_max:255`
> +> Default: `psr=cmt:0,rmid_max:255,cat:0`
>
> Platform Shared Resource(PSR) Services. Intel Haswell and later server
> platforms offer information about the sharing of resources.
> @@ -1134,6 +1134,11 @@ Monitoring ID(RMID) is used to bind the domain to corresponding shared
> resource. RMID is a hardware-provided layer of abstraction between software
> and logical processors.
>
> +To use the PSR cache allocation service for a certain domain, a capacity
> +bitmasks(CBM) is used to bind the domain to corresponding shared resource.
> +CBM represents cache capacity and indicates the degree of overlap and isolation
> +between domains.
> +
> The following resources are available:
>
> * Cache Monitoring Technology (Haswell and later). Information regarding the
> @@ -1144,6 +1149,10 @@ The following resources are available:
> total/local memory bandwidth. Follow the same options with Cache Monitoring
> Technology.
>
> +* Cache Allocation Technology (Broadwell and later). Information regarding
> + the cache allocation.
> + * `cat` instructs Xen to enable/disable Cache Allocation Technology.
> +
> ### reboot
> > `= t[riple] | k[bd] | a[cpi] | p[ci] | P[ower] | e[fi] | n[o] [, [w]arm | [c]old]`
>
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index 6119c6e..16c37dd 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -19,17 +19,36 @@
> #include <asm/psr.h>
>
> #define PSR_CMT (1<<0)
> +#define PSR_CAT (1<<1)
> +
> +struct psr_cat_socket_info {
> + bool_t initialized;
> + bool_t enabled;
> + unsigned int cbm_len;
> + unsigned int cos_max;
> +};
>
> struct psr_assoc {
> uint64_t val;
> };
>
> struct psr_cmt *__read_mostly psr_cmt;
> +static struct psr_cat_socket_info *__read_mostly cat_socket_info;
> +
> static unsigned int __initdata opt_psr;
> static unsigned int __initdata opt_rmid_max = 255;
> +static unsigned int __read_mostly nr_sockets;
> static uint64_t rmid_mask;
> static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
>
> +static unsigned int get_socket_count(void)
> +{
> + unsigned int cpus_per_socket = boot_cpu_data.x86_max_cores *
> + boot_cpu_data.x86_num_siblings;
> +
> + return DIV_ROUND_UP(nr_cpu_ids, cpus_per_socket);
> +}
> +
> static void __init parse_psr_bool(char *s, char *value, char *feature,
> unsigned int mask)
> {
> @@ -63,6 +82,7 @@ static void __init parse_psr_param(char *s)
> *val_str++ = '\0';
>
> parse_psr_bool(s, val_str, "cmt", PSR_CMT);
> + parse_psr_bool(s, val_str, "cat", PSR_CAT);
>
> if ( val_str && !strcmp(s, "rmid_max") )
> opt_rmid_max = simple_strtoul(val_str, NULL, 0);
> @@ -194,8 +214,49 @@ void psr_ctxt_switch_to(struct domain *d)
> }
> }
>
> +static void cat_cpu_init(unsigned int cpu)
> +{
> + unsigned int eax, ebx, ecx, edx;
> + struct psr_cat_socket_info *info;
> + unsigned int socket;
> + const struct cpuinfo_x86 *c = cpu_data + cpu;
> +
> + if ( !cpu_has(c, X86_FEATURE_CAT) )
> + return;
> +
> + socket = cpu_to_socket(cpu);
> + ASSERT(socket < nr_sockets);
> +
> + info = cat_socket_info + socket;
> +
> + /* Avoid initializing the same socket more than once. */
> + if ( test_and_set_bool(info->initialized) )
> + return;
> +
> + cpuid_count(PSR_CPUID_LEVEL_CAT, 0, &eax, &ebx, &ecx, &edx);
> + if ( ebx & PSR_RESOURCE_TYPE_L3 )
> + {
> + cpuid_count(PSR_CPUID_LEVEL_CAT, 1, &eax, &ebx, &ecx, &edx);
> + info->cbm_len = (eax & 0x1f) + 1;
> + info->cos_max = (edx & 0xffff);
> +
> + info->enabled = 1;
> + printk(XENLOG_INFO "CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
> + socket, info->cos_max, info->cbm_len);
> + }
> +}
> +
> +static void __init init_psr_cat(void)
> +{
> + nr_sockets = get_socket_count();
> + cat_socket_info = xzalloc_array(struct psr_cat_socket_info, nr_sockets);
> +}
> +
> static void psr_cpu_init(unsigned int cpu)
> {
> + if ( cat_socket_info )
> + cat_cpu_init(cpu);
> +
> psr_assoc_init(cpu);
> }
>
> @@ -223,9 +284,12 @@ static int __init psr_presmp_init(void)
> if ( (opt_psr & PSR_CMT) && opt_rmid_max )
> init_psr_cmt(opt_rmid_max);
>
> + if ( opt_psr & PSR_CAT )
> + init_psr_cat();
> +
> psr_cpu_init(0);
> - if ( psr_cmt_enabled() )
> - register_cpu_notifier(&cpu_nfb);
> + if ( psr_cmt_enabled() || cat_socket_info )
> + register_cpu_notifier(&cpu_nfb);
>
> return 0;
> }
> diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
> index 7963a3a..8c0f0a6 100644
> --- a/xen/include/asm-x86/cpufeature.h
> +++ b/xen/include/asm-x86/cpufeature.h
> @@ -149,6 +149,7 @@
> #define X86_FEATURE_CMT (7*32+12) /* Cache Monitoring Technology */
> #define X86_FEATURE_NO_FPU_SEL (7*32+13) /* FPU CS/DS stored as zero */
> #define X86_FEATURE_MPX (7*32+14) /* Memory Protection Extensions */
> +#define X86_FEATURE_CAT (7*32+15) /* Cache Allocation Technology */
> #define X86_FEATURE_RDSEED (7*32+18) /* RDSEED instruction */
> #define X86_FEATURE_ADX (7*32+19) /* ADCX, ADOX instructions */
> #define X86_FEATURE_SMAP (7*32+20) /* Supervisor Mode Access Prevention */
> diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
> index 585350c..3bc5496 100644
> --- a/xen/include/asm-x86/psr.h
> +++ b/xen/include/asm-x86/psr.h
> @@ -18,6 +18,9 @@
>
> #include <xen/types.h>
>
> +/* CAT cpuid level */
> +#define PSR_CPUID_LEVEL_CAT 0x10
> +
> /* Resource Type Enumeration */
> #define PSR_RESOURCE_TYPE_L3 0x2
>
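The socket count computed by get_socket_count() in the quoted patch is a rounded-up division of the CPU id space by the number of logical CPUs per socket. A standalone sketch of that arithmetic follows; socket_count() and its parameters are hypothetical stand-ins for nr_cpu_ids, x86_max_cores and x86_num_siblings, and DIV_ROUND_UP is redefined locally so the snippet compiles on its own.

```c
#include <assert.h>

/* Local definition so the sketch is self-contained. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical stand-in for get_socket_count(): number of sockets needed
 * to cover nr_cpu_ids logical CPUs, given cores per socket and threads
 * per core.
 */
static unsigned int socket_count(unsigned int nr_cpu_ids,
                                 unsigned int max_cores,
                                 unsigned int num_siblings)
{
    unsigned int cpus_per_socket = max_cores * num_siblings;

    assert(cpus_per_socket > 0);   /* avoid division by zero */
    return DIV_ROUND_UP(nr_cpu_ids, cpus_per_socket);
}
```

Rounding up matters when nr_cpu_ids is not a multiple of the per-socket CPU count: 65 CPU ids at 16 per socket need 5 entries, not 4.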
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v4 04/12] x86: maintain COS to CBM mapping for each socket
2015-04-09 9:18 ` [PATCH v4 04/12] x86: maintain COS to CBM mapping for each socket Chao Peng
@ 2015-04-09 21:35 ` Andrew Cooper
2015-04-10 7:26 ` Chao Peng
0 siblings, 1 reply; 37+ messages in thread
From: Andrew Cooper @ 2015-04-09 21:35 UTC (permalink / raw)
To: Chao Peng, xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, will.auld,
JBeulich, wei.liu2, dgdegra
On 09/04/2015 10:18, Chao Peng wrote:
> For each socket, a COS to CBM mapping structure is maintained for each
> COS. The mapping is indexed by COS and the value is the corresponding
> CBM. Different VMs may use the same CBM, a reference count is used to
> indicate if the CBM is available.
>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> ---
> xen/arch/x86/psr.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index 16c37dd..4aff5f6 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -21,11 +21,17 @@
> #define PSR_CMT (1<<0)
> #define PSR_CAT (1<<1)
>
> +struct psr_cat_cbm {
> + unsigned int ref;
> + uint64_t cbm;
> +};
> +
> struct psr_cat_socket_info {
> bool_t initialized;
> bool_t enabled;
> unsigned int cbm_len;
> unsigned int cos_max;
> + struct psr_cat_cbm *cos_cbm_map;
"cos_to_cmb" would be more in keeping with Xen style, and IMO easier to
read in code.
> };
>
> struct psr_assoc {
> @@ -240,6 +246,14 @@ static void cat_cpu_init(unsigned int cpu)
> info->cbm_len = (eax & 0x1f) + 1;
> info->cos_max = (edx & 0xffff);
Apologies for missing this in the previous patch, but cos_max should
have a command line parameter like rmid_max if a lower limit wants to be
enforced.
Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>
> + info->cos_cbm_map = xzalloc_array(struct psr_cat_cbm,
> + info->cos_max + 1UL);
> + if ( !info->cos_cbm_map )
> + return;
> +
> + /* cos=0 is reserved as default cbm(all ones). */
> + info->cos_cbm_map[0].cbm = (1ull << info->cbm_len) - 1;
> +
> info->enabled = 1;
> printk(XENLOG_INFO "CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
> socket, info->cos_max, info->cbm_len);
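As a sketch of the decoding quoted above: cat_cpu_init() derives cbm_len and cos_max from CPUID leaf 0x10 subleaf 1, and COS 0 is then given an all-ones mask of cbm_len bits. The standalone version below takes the raw EAX/EDX values as parameters instead of executing CPUID; struct cat_l3_info and decode_cat_l3() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the decoded values. */
struct cat_l3_info {
    unsigned int cbm_len;   /* number of bits in a capacity bitmask */
    unsigned int cos_max;   /* highest COS number supported */
    uint64_t default_cbm;   /* all-ones mask reserved for COS 0 */
};

/* Decode the EAX/EDX values as done in the quoted cat_cpu_init(). */
static struct cat_l3_info decode_cat_l3(uint32_t eax, uint32_t edx)
{
    struct cat_l3_info info;

    info.cbm_len = (eax & 0x1f) + 1;   /* 5-bit field holds length - 1 */
    info.cos_max = edx & 0xffff;

    /* cbm_len is at most 32 here, so the 64-bit shift cannot overflow. */
    assert(info.cbm_len >= 1 && info.cbm_len <= 32);
    info.default_cbm = (UINT64_C(1) << info.cbm_len) - 1;

    return info;
}
```

For instance, EAX = 19 and EDX = 15 decode to a 20-bit CBM, COS numbers 0 to 15, and a default mask of 0xfffff.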
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v4 05/12] x86: maintain socket CPU mask for CAT
2015-04-09 9:18 ` [PATCH v4 05/12] x86: maintain socket CPU mask for CAT Chao Peng
@ 2015-04-09 21:45 ` Andrew Cooper
2015-04-10 7:33 ` Chao Peng
0 siblings, 1 reply; 37+ messages in thread
From: Andrew Cooper @ 2015-04-09 21:45 UTC (permalink / raw)
To: Chao Peng, xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, will.auld,
JBeulich, wei.liu2, dgdegra
On 09/04/2015 10:18, Chao Peng wrote:
> Some CAT resources/registers exist at socket level and they must be
> accessed from a CPU of the corresponding socket. It's common to pick
> an arbitrary CPU from the socket. To make the picking easy, it's useful
> to maintain a reference to the cpu_core_mask which contains all the
> siblings of a CPU in the same socket. The reference needs to be
> synchronized with the CPU up/down.
>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> ---
> xen/arch/x86/psr.c | 24 ++++++++++++++++++++++++
> 1 file changed, 24 insertions(+)
>
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index 4aff5f6..7de2504 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -32,6 +32,7 @@ struct psr_cat_socket_info {
> unsigned int cbm_len;
> unsigned int cos_max;
> struct psr_cat_cbm *cos_cbm_map;
> + cpumask_t *socket_cpu_mask;
At a pinch, you could get away with just "cpus" as a variable name, as
it is part of a structure, and instanced in an array, with "socket" in
the name.
> };
>
> struct psr_assoc {
> @@ -234,6 +235,8 @@ static void cat_cpu_init(unsigned int cpu)
> ASSERT(socket < nr_sockets);
>
> info = cat_socket_info + socket;
> + if ( info->socket_cpu_mask == NULL )
> + info->socket_cpu_mask = per_cpu(cpu_core_mask, cpu);
Surely after the test_and_set_bool() ?
>
> /* Avoid initializing more than one times for the same socket. */
> if ( test_and_set_bool(info->initialized) )
> @@ -274,6 +277,24 @@ static void psr_cpu_init(unsigned int cpu)
> psr_assoc_init(cpu);
> }
>
> +static void psr_cpu_fini(unsigned int cpu)
cat_cpu_fini() to mirror cat_cpu_init() or perhaps both?
> +{
> + unsigned int socket, next;
> + cpumask_t *cpu_mask;
> +
> + if ( cat_socket_info )
> + {
> + socket = cpu_to_socket(cpu);
> + cpu_mask = cat_socket_info[socket].socket_cpu_mask;
> +
> + if ( (next = cpumask_cycle(cpu, cpu_mask)) == cpu )
> + cat_socket_info[socket].socket_cpu_mask = NULL;
> + else
> + cat_socket_info[socket].socket_cpu_mask =
> + per_cpu(cpu_core_mask, next);
Might it be easier to copy cpu_core_mask rather than playing these games
to avoid pointing into a stale per_cpu() area?
~Andrew
> + }
> +}
> +
> static int cpu_callback(
> struct notifier_block *nfb, unsigned long action, void *hcpu)
> {
> @@ -284,6 +305,9 @@ static int cpu_callback(
> case CPU_STARTING:
> psr_cpu_init(cpu);
> break;
> + case CPU_DYING:
> + psr_cpu_fini(cpu);
> + break;
> }
>
> return NOTIFY_DONE;
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v4 06/12] x86: add COS information for each domain
2015-04-09 9:18 ` [PATCH v4 06/12] x86: add COS information for each domain Chao Peng
@ 2015-04-09 21:54 ` Andrew Cooper
2015-04-10 7:35 ` Chao Peng
0 siblings, 1 reply; 37+ messages in thread
From: Andrew Cooper @ 2015-04-09 21:54 UTC (permalink / raw)
To: Chao Peng, xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, will.auld,
JBeulich, wei.liu2, dgdegra
On 09/04/2015 10:18, Chao Peng wrote:
> In Xen's implementation, the CAT enforcement granularity is per domain.
> Because the length of CBM and the number of COS may differ between sockets,
> each domain has a COS ID for each socket. The domain gets COS=0 by default
> and at runtime its COS is then allocated dynamically when the user specifies
> a CBM for the domain.
>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> ---
> xen/arch/x86/domain.c | 6 +++++-
> xen/arch/x86/psr.c | 42 ++++++++++++++++++++++++++++++++++++++++++
> xen/include/asm-x86/domain.h | 5 ++++-
> xen/include/asm-x86/psr.h | 3 +++
> 4 files changed, 54 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index 695a2eb..129d42f 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -616,6 +616,9 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
> /* 64-bit PV guest by default. */
> d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0;
>
> + if ( (rc = psr_domain_init(d)) != 0 )
> + goto fail;
> +
> /* initialize default tsc behavior in case tools don't */
> tsc_set_info(d, TSC_MODE_DEFAULT, 0UL, 0, 0);
> spin_lock_init(&d->arch.vtsc_lock);
> @@ -634,6 +637,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
> free_perdomain_mappings(d);
> if ( is_pv_domain(d) )
> free_xenheap_page(d->arch.pv_domain.gdt_ldt_l1tab);
> + psr_domain_free(d);
> return rc;
> }
>
> @@ -657,7 +661,7 @@ void arch_domain_destroy(struct domain *d)
> free_xenheap_page(d->shared_info);
> cleanup_domain_irq_mapping(d);
>
> - psr_free_rmid(d);
> + psr_domain_free(d);
> }
>
> void arch_domain_shutdown(struct domain *d)
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index 7de2504..51faa70 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -221,6 +221,48 @@ void psr_ctxt_switch_to(struct domain *d)
> }
> }
>
> +/* Called with domain lock held, no psr specific lock needed */
> +static void psr_free_cos(struct domain *d)
> +{
> + unsigned int socket;
> + unsigned int cos;
> +
> + if( !d->arch.psr_cos_ids )
> + return;
> +
> + for ( socket = 0; socket < nr_sockets; socket++ )
> + {
> + if ( !cat_socket_info[socket].enabled )
> + continue;
> +
> + if ( (cos = d->arch.psr_cos_ids[socket]) == 0 )
> + continue;
> +
> + cat_socket_info[socket].cos_cbm_map[cos].ref--;
> + }
> +
> + xfree(d->arch.psr_cos_ids);
> + d->arch.psr_cos_ids = NULL;
> +}
> +
> +int psr_domain_init(struct domain *d)
> +{
> + if ( cat_socket_info )
> + {
> + d->arch.psr_cos_ids = xzalloc_array(unsigned int, nr_sockets);
It is perhaps worth leaving a comment in patch 3 stating that nr_sockets
must never change after domains have been created.
That, or whoever implements CAT/socket hotplug support has to fix this
issue. (I think the code is fine to leave in its current state, but
nr_sockets changing under the feet of Xen will be a subtle bug for
someone to track down.)
Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> + if ( !d->arch.psr_cos_ids )
> + return -ENOMEM;
> + }
> +
> + return 0;
> +}
> +
> +void psr_domain_free(struct domain *d)
> +{
> + psr_free_rmid(d);
> + psr_free_cos(d);
> +}
> +
> static void cat_cpu_init(unsigned int cpu)
> {
> unsigned int eax, ebx, ecx, edx;
> diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
> index 9cdffa8..9c4d0e6 100644
> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -333,7 +333,10 @@ struct arch_domain
> struct e820entry *e820;
> unsigned int nr_e820;
>
> - unsigned int psr_rmid; /* RMID assigned to the domain for CMT */
> + /* RMID assigned to the domain for CMT */
> + unsigned int psr_rmid;
> + /* COS assigned to the domain for each socket */
> + unsigned int *psr_cos_ids;
>
> /* Shared page for notifying that explicit PIRQ EOI is required. */
> unsigned long *pirq_eoi_map;
> diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
> index 3bc5496..45392bf 100644
> --- a/xen/include/asm-x86/psr.h
> +++ b/xen/include/asm-x86/psr.h
> @@ -52,6 +52,9 @@ void psr_free_rmid(struct domain *d);
>
> void psr_ctxt_switch_to(struct domain *d);
>
> +int psr_domain_init(struct domain *d);
> +void psr_domain_free(struct domain *d);
> +
> #endif /* __ASM_PSR_H__ */
>
> /*
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v4 07/12] x86: expose CBM length and COS number information
2015-04-09 9:18 ` [PATCH v4 07/12] x86: expose CBM length and COS number information Chao Peng
@ 2015-04-09 21:54 ` Andrew Cooper
0 siblings, 0 replies; 37+ messages in thread
From: Andrew Cooper @ 2015-04-09 21:54 UTC (permalink / raw)
To: Chao Peng, xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, will.auld,
JBeulich, wei.liu2, dgdegra
On 09/04/2015 10:18, Chao Peng wrote:
> General CAT information such as maximum COS and CBM length is exposed to
> user space by a SYSCTL hypercall, to help user space construct the CBM.
>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> xen/arch/x86/psr.c | 31 +++++++++++++++++++++++++++++++
> xen/arch/x86/sysctl.c | 18 ++++++++++++++++++
> xen/include/asm-x86/psr.h | 3 +++
> xen/include/public/sysctl.h | 16 ++++++++++++++++
> 4 files changed, 68 insertions(+)
>
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index 51faa70..e390fd9 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -221,6 +221,37 @@ void psr_ctxt_switch_to(struct domain *d)
> }
> }
>
> +static int get_cat_socket_info(unsigned int socket,
> + struct psr_cat_socket_info **info)
> +{
> + if ( !cat_socket_info )
> + return -ENODEV;
> +
> + if ( socket >= nr_sockets )
> + return -EBADSLT;
> +
> + if ( !cat_socket_info[socket].enabled )
> + return -ENOENT;
> +
> + *info = cat_socket_info + socket;
> + return 0;
> +}
> +
> +int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
> + uint32_t *cos_max)
> +{
> + struct psr_cat_socket_info *info;
> + int ret = get_cat_socket_info(socket, &info);
> +
> + if ( ret )
> + return ret;
> +
> + *cbm_len = info->cbm_len;
> + *cos_max = info->cos_max;
> +
> + return 0;
> +}
> +
> /* Called with domain lock held, no psr specific lock needed */
> static void psr_free_cos(struct domain *d)
> {
> diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
> index 611a291..8a9e120 100644
> --- a/xen/arch/x86/sysctl.c
> +++ b/xen/arch/x86/sysctl.c
> @@ -171,6 +171,24 @@ long arch_do_sysctl(
>
> break;
>
> + case XEN_SYSCTL_psr_cat_op:
> + switch ( sysctl->u.psr_cat_op.cmd )
> + {
> + case XEN_SYSCTL_PSR_CAT_get_l3_info:
> + ret = psr_get_cat_l3_info(sysctl->u.psr_cat_op.target,
> + &sysctl->u.psr_cat_op.u.l3_info.cbm_len,
> + &sysctl->u.psr_cat_op.u.l3_info.cos_max);
> +
> + if ( !ret && __copy_to_guest(u_sysctl, sysctl, 1) )
> + ret = -EFAULT;
> +
> + break;
> + default:
> + ret = -EOPNOTSUPP;
> + break;
> + }
> + break;
> +
> default:
> ret = -ENOSYS;
> break;
> diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
> index 45392bf..3a8a406 100644
> --- a/xen/include/asm-x86/psr.h
> +++ b/xen/include/asm-x86/psr.h
> @@ -52,6 +52,9 @@ void psr_free_rmid(struct domain *d);
>
> void psr_ctxt_switch_to(struct domain *d);
>
> +int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
> + uint32_t *cos_max);
> +
> int psr_domain_init(struct domain *d);
> void psr_domain_free(struct domain *d);
>
> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
> index 8552dc6..91d90b8 100644
> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -656,6 +656,20 @@ struct xen_sysctl_psr_cmt_op {
> typedef struct xen_sysctl_psr_cmt_op xen_sysctl_psr_cmt_op_t;
> DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cmt_op_t);
>
> +#define XEN_SYSCTL_PSR_CAT_get_l3_info 0
> +struct xen_sysctl_psr_cat_op {
> + uint32_t cmd; /* IN: XEN_SYSCTL_PSR_CAT_* */
> + uint32_t target; /* IN: socket to be operated on */
> + union {
> + struct {
> + uint32_t cbm_len; /* OUT: CBM length */
> + uint32_t cos_max; /* OUT: Maximum COS */
> + } l3_info;
> + } u;
> +};
> +typedef struct xen_sysctl_psr_cat_op xen_sysctl_psr_cat_op_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cat_op_t);
> +
> struct xen_sysctl {
> uint32_t cmd;
> #define XEN_SYSCTL_readconsole 1
> @@ -678,6 +692,7 @@ struct xen_sysctl {
> #define XEN_SYSCTL_scheduler_op 19
> #define XEN_SYSCTL_coverage_op 20
> #define XEN_SYSCTL_psr_cmt_op 21
> +#define XEN_SYSCTL_psr_cat_op 22
> uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
> union {
> struct xen_sysctl_readconsole readconsole;
> @@ -700,6 +715,7 @@ struct xen_sysctl {
> struct xen_sysctl_scheduler_op scheduler_op;
> struct xen_sysctl_coverage_op coverage_op;
> struct xen_sysctl_psr_cmt_op psr_cmt_op;
> + struct xen_sysctl_psr_cat_op psr_cat_op;
> uint8_t pad[128];
> } u;
> };
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v4 08/12] x86: dynamically get/set CBM for a domain
2015-04-09 9:18 ` [PATCH v4 08/12] x86: dynamically get/set CBM for a domain Chao Peng
@ 2015-04-09 22:06 ` Andrew Cooper
2015-04-10 7:37 ` Chao Peng
0 siblings, 1 reply; 37+ messages in thread
From: Andrew Cooper @ 2015-04-09 22:06 UTC (permalink / raw)
To: Chao Peng, xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, will.auld,
JBeulich, wei.liu2, dgdegra
On 09/04/2015 10:18, Chao Peng wrote:
> For CAT, COS is maintained in hypervisor only while CBM is exposed to
> user space directly to allow getting/setting domain's cache capacity.
> For each specified CBM, hypervisor will either use an existing COS which
> has the same CBM or allocate a new one if the same CBM is not found. If
> the allocation fails because not enough COS are available then an error is
> returned. Getting and setting always operate on a specified socket.
> For multi-socket systems, the interface may be called several times.
>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> ---
> xen/arch/x86/domctl.c | 18 ++++++
> xen/arch/x86/psr.c | 126 ++++++++++++++++++++++++++++++++++++++++
> xen/include/asm-x86/msr-index.h | 1 +
> xen/include/asm-x86/psr.h | 2 +
> xen/include/public/domctl.h | 12 ++++
> 5 files changed, 159 insertions(+)
>
> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
> index d4f6ccf..89a6b33 100644
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -1326,6 +1326,24 @@ long arch_do_domctl(
> }
> break;
>
> + case XEN_DOMCTL_psr_cat_op:
> + switch ( domctl->u.psr_cat_op.cmd )
> + {
> + case XEN_DOMCTL_PSR_CAT_OP_SET_L3_CBM:
> + ret = psr_set_l3_cbm(d, domctl->u.psr_cat_op.target,
> + domctl->u.psr_cat_op.data);
> + break;
As I have just fixed up the style everywhere else in arch_do_domctl(),
could you put a newline in here, just to visually separate the case
statements slightly.
> + case XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM:
> + ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target,
> + &domctl->u.psr_cat_op.data);
> + copyback = 1;
> + break;
> + default:
> + ret = -EOPNOTSUPP;
> + break;
> + }
> + break;
> +
> default:
> ret = iommu_do_domctl(domctl, d, u_domctl);
> break;
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index e390fd9..5247bcd 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -56,6 +56,17 @@ static unsigned int get_socket_count(void)
> return DIV_ROUND_UP(nr_cpu_ids, cpus_per_socket);
> }
>
> +static unsigned int get_socket_cpu(unsigned int socket)
> +{
> + if ( socket < nr_sockets )
> + {
> + cpumask_t *cpu_mask = cat_socket_info[socket].socket_cpu_mask;
Blank line between variables and code please.
> + ASSERT(cpu_mask != NULL);
> + return cpumask_any(cpu_mask);
> + }
> + return nr_cpu_ids;
> +}
> +
> static void __init parse_psr_bool(char *s, char *value, char *feature,
> unsigned int mask)
> {
> @@ -252,6 +263,121 @@ int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
> return 0;
> }
>
> +int psr_get_l3_cbm(struct domain *d, unsigned int socket, uint64_t *cbm)
> +{
> + unsigned int cos;
> + struct psr_cat_socket_info *info;
> + int ret = get_cat_socket_info(socket, &info);
> +
> + if ( ret )
> + return ret;
> +
> + cos = d->arch.psr_cos_ids[socket];
> + *cbm = info->cos_cbm_map[cos].cbm;
> + return 0;
> +}
> +
> +static bool_t psr_check_cbm(unsigned int cbm_len, uint64_t cbm)
> +{
> + unsigned int first_bit, zero_bit;
> +
> + /* Set bits should only in the range of [0, cbm_len). */
> + if ( cbm & (~0ull << cbm_len) )
> + return 0;
> +
> + first_bit = find_first_bit(&cbm, cbm_len);
> + zero_bit = find_next_zero_bit(&cbm, cbm_len, first_bit);
> +
> + /* Set bits should be contiguous. */
> + if ( zero_bit < cbm_len &&
> + find_next_bit(&cbm, cbm_len, zero_bit) < cbm_len )
> + return 0;
> +
> + return 1;
> +}
> +
> +struct cos_cbm_info
> +{
> + unsigned int cos;
> + uint64_t cbm;
> +};
> +
> +static void do_write_l3_cbm(void *data)
> +{
> + struct cos_cbm_info *info = data;
And again here please.
> + wrmsrl(MSR_IA32_PSR_L3_MASK(info->cos), info->cbm);
> +}
> +
> +static int write_l3_cbm(unsigned int socket, unsigned int cos, uint64_t cbm)
> +{
> + struct cos_cbm_info info = { .cos = cos, .cbm = cbm };
> +
> + if ( socket == cpu_to_socket(smp_processor_id()) )
> + do_write_l3_cbm(&info);
> + else
> + {
> + unsigned int cpu = get_socket_cpu(socket);
> +
> + if ( cpu >= nr_cpu_ids )
> + return -EBADSLT;
> + on_selected_cpus(cpumask_of(cpu), do_write_l3_cbm, &info, 1);
> + }
> +
> + return 0;
> +}
> +
> +int psr_set_l3_cbm(struct domain *d, unsigned int socket, uint64_t cbm)
> +{
> + unsigned int old_cos, cos;
> + struct psr_cat_cbm *map, *find;
> + struct psr_cat_socket_info *info;
> + int ret = get_cat_socket_info(socket, &info);
> +
> + if ( ret )
> + return ret;
> +
> + if ( !psr_check_cbm(info->cbm_len, cbm) )
> + return -EINVAL;
> +
> + old_cos = d->arch.psr_cos_ids[socket];
> + map = info->cos_cbm_map;
> + find = NULL;
> +
> + for ( cos = 0; cos <= info->cos_max; cos++ )
> + {
> + /* If still not found, then keep unused one. */
> + if ( !find && cos != 0 && map[cos].ref == 0 )
> + find = map + cos;
> + else if ( map[cos].cbm == cbm )
> + {
> + if ( unlikely(cos == old_cos) )
> + return -EEXIST;
> + find = map + cos;
> + break;
> + }
> + }
> +
> + /* If old cos is referred only by the domain, then use it. */
> + if ( !find && map[old_cos].ref == 1 )
> + find = map + old_cos;
> +
> + if ( !find )
> + return -EUSERS;
> +
> + cos = find - map;
> + if ( find->cbm != cbm )
> + {
> + ret = write_l3_cbm(socket, cos, cbm);
> + if ( ret )
> + return ret;
> + find->cbm = cbm;
> + }
> + find->ref++;
> + map[old_cos].ref--;
This can race with psr_free_cos() leading to corruption. It is possible
for two different domains to be holding their own spinlock but using the
same cos index.
You definitely do need a psr/cos spinlock for safety.
~Andrew
> + d->arch.psr_cos_ids[socket] = cos;
> + return 0;
> +}
> +
> /* Called with domain lock held, no psr specific lock needed */
> static void psr_free_cos(struct domain *d)
> {
> diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
> index 83f2f70..5425f77 100644
> --- a/xen/include/asm-x86/msr-index.h
> +++ b/xen/include/asm-x86/msr-index.h
> @@ -327,6 +327,7 @@
> #define MSR_IA32_CMT_EVTSEL 0x00000c8d
> #define MSR_IA32_CMT_CTR 0x00000c8e
> #define MSR_IA32_PSR_ASSOC 0x00000c8f
> +#define MSR_IA32_PSR_L3_MASK(n) (0x00000c90 + (n))
>
> /* Intel Model 6 */
> #define MSR_P6_PERFCTR(n) (0x000000c1 + (n))
> diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
> index 3a8a406..fb474bb 100644
> --- a/xen/include/asm-x86/psr.h
> +++ b/xen/include/asm-x86/psr.h
> @@ -54,6 +54,8 @@ void psr_ctxt_switch_to(struct domain *d);
>
> int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
> uint32_t *cos_max);
> +int psr_get_l3_cbm(struct domain *d, unsigned int socket, uint64_t *cbm);
> +int psr_set_l3_cbm(struct domain *d, unsigned int socket, uint64_t cbm);
>
> int psr_domain_init(struct domain *d);
> void psr_domain_free(struct domain *d);
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index ca0e51e..9f04836 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -1005,6 +1005,16 @@ struct xen_domctl_psr_cmt_op {
> typedef struct xen_domctl_psr_cmt_op xen_domctl_psr_cmt_op_t;
> DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cmt_op_t);
>
> +struct xen_domctl_psr_cat_op {
> +#define XEN_DOMCTL_PSR_CAT_OP_SET_L3_CBM 0
> +#define XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM 1
> + uint32_t cmd; /* IN: XEN_DOMCTL_PSR_CAT_OP_* */
> + uint32_t target; /* IN: socket to be operated on */
> + uint64_t data; /* IN/OUT */
> +};
> +typedef struct xen_domctl_psr_cat_op xen_domctl_psr_cat_op_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cat_op_t);
> +
> struct xen_domctl {
> uint32_t cmd;
> #define XEN_DOMCTL_createdomain 1
> @@ -1080,6 +1090,7 @@ struct xen_domctl {
> #define XEN_DOMCTL_setvnumainfo 74
> #define XEN_DOMCTL_psr_cmt_op 75
> #define XEN_DOMCTL_arm_configure_domain 76
> +#define XEN_DOMCTL_psr_cat_op 77
> #define XEN_DOMCTL_gdbsx_guestmemio 1000
> #define XEN_DOMCTL_gdbsx_pausevcpu 1001
> #define XEN_DOMCTL_gdbsx_unpausevcpu 1002
> @@ -1145,6 +1156,7 @@ struct xen_domctl {
> struct xen_domctl_gdbsx_domstatus gdbsx_domstatus;
> struct xen_domctl_vnuma vnuma;
> struct xen_domctl_psr_cmt_op psr_cmt_op;
> + struct xen_domctl_psr_cat_op psr_cat_op;
> uint8_t pad[128];
> } u;
> };
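
As an aside on the CBM validation in this patch, the two rules it enforces
(no bits at or above cbm_len, and set bits contiguous) can be sketched in
standalone C as below. Note that this is not the patch's method: the quoted
psr_check_cbm() uses Xen's find_first_bit()/find_next_zero_bit() helpers,
whereas this sketch swaps in a plain arithmetic test. The name
cbm_is_valid() is hypothetical, and the sketch treats an all-zero mask as
trivially contiguous without settling whether such a mask ought to be
rejected.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if every set bit of cbm lies in [0, cbm_len) and the set
 * bits form a single contiguous run.
 */
static bool cbm_is_valid(unsigned int cbm_len, uint64_t cbm)
{
    /* Reject bits at or above cbm_len (guard avoids an undefined shift). */
    if ( cbm_len < 64 && (cbm >> cbm_len) != 0 )
        return false;

    /*
     * cbm | (cbm - 1) fills in every zero below the lowest set bit.
     * Adding 1 then carries past the lowest run of ones, so the result
     * shares a bit with cbm only if a second run exists higher up.
     */
    return (((cbm | (cbm - 1)) + 1) & cbm) == 0;
}
```

For example, with cbm_len = 20 the mask 0x00ff0 is accepted, while 0x0f0f0
(two separate runs) and 0x100000 (bit 20 is out of range) are rejected.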
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v4 09/12] x86: add scheduling support for Intel CAT
2015-04-09 9:18 ` [PATCH v4 09/12] x86: add scheduling support for Intel CAT Chao Peng
@ 2015-04-09 22:12 ` Andrew Cooper
2015-04-10 7:41 ` Chao Peng
0 siblings, 1 reply; 37+ messages in thread
From: Andrew Cooper @ 2015-04-09 22:12 UTC (permalink / raw)
To: Chao Peng, xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, will.auld,
JBeulich, wei.liu2, dgdegra
On 09/04/2015 10:18, Chao Peng wrote:
> On context switch, write the domain's Class of Service (COS) to MSR
> IA32_PQR_ASSOC, to notify hardware to use the new COS.
>
> For performance reasons, the socket number and COS mask for the current
> CPU are also cached in the local per-CPU variable.
>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> ---
> Changes in v2:
> * merge common scheduling changes into scheduling improvement patch.
> * use readable expr for psra->cos_mask.
> ---
> xen/arch/x86/psr.c | 33 ++++++++++++++++++++++++++++++++-
> 1 file changed, 32 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index 5247bcd..046229d 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -37,6 +37,8 @@ struct psr_cat_socket_info {
>
> struct psr_assoc {
> uint64_t val;
> + unsigned int socket;
psr_assoc is per-cpu. Why does it need to cache its own socket like this?
If it does, please reorder socket and cos_mask to avoid the padding space.
> + uint64_t cos_mask;
> };
>
> struct psr_cmt *__read_mostly psr_cmt;
> @@ -206,9 +208,22 @@ void psr_free_rmid(struct domain *d)
>
> static inline void psr_assoc_init(unsigned int cpu)
> {
> + unsigned int socket;
> + struct psr_cat_socket_info *info;
> struct psr_assoc *psra = &per_cpu(psr_assoc, cpu);
>
> - if ( psr_cmt_enabled() )
> + if ( cat_socket_info )
> + {
> + socket = cpu_to_socket(cpu);
> + psra->socket = socket;
> +
> + info = cat_socket_info + socket;
> + if ( info->enabled )
> + psra->cos_mask = ((1ull << get_count_order(info->cos_max)) - 1)
> + << 32;
> + }
> +
> + if ( psr_cmt_enabled() || psra->cos_mask )
> rdmsrl(MSR_IA32_PSR_ASSOC, psra->val);
> }
>
> @@ -217,6 +232,12 @@ static inline void psr_assoc_rmid(uint64_t *reg, unsigned int rmid)
> *reg = (*reg & ~rmid_mask) | (rmid & rmid_mask);
> }
>
> +static inline void psr_assoc_cos(uint64_t *reg, unsigned int cos,
> + uint64_t cos_mask)
> +{
> + *reg = (*reg & ~cos_mask) | (((uint64_t)cos << 32) & cos_mask);
> +}
> +
> void psr_ctxt_switch_to(struct domain *d)
> {
> struct psr_assoc *psra = &this_cpu(psr_assoc);
> @@ -225,11 +246,21 @@ void psr_ctxt_switch_to(struct domain *d)
> if ( psr_cmt_enabled() )
> psr_assoc_rmid(&reg, d->arch.psr_rmid);
>
> + if ( psra->cos_mask )
> + {
> + if ( d->arch.psr_cos_ids )
> + psr_assoc_cos(&reg, d->arch.psr_cos_ids[psra->socket],
> + psra->cos_mask);
> + else
> + psr_assoc_cos(&reg, 0, psra->cos_mask);
> + }
This can be collapsed somewhat to

    if ( psra->cos_mask )
        psr_assoc_cos(&reg, d->arch.psr_cos_ids ?
                      d->arch.psr_cos_ids[psra->socket] : 0, psra->cos_mask);
> +
> if ( reg != psra->val )
> {
> wrmsrl(MSR_IA32_PSR_ASSOC, reg);
> psra->val = reg;
> }
> +
Spurious whitespace change.
~Andrew
> }
>
> static int get_cat_socket_info(unsigned int socket,
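
To make the register layout in this patch concrete: the quoted code keeps
the COS in the upper half of the PSR_ASSOC value (shifted left by 32) and
updates it with a read-modify-write under a mask. A standalone sketch of
that bit manipulation follows. The names cos_field_mask() and
assoc_set_cos() are hypothetical, and sizing the mask so that it can hold
the value cos_max itself is an assumption of this sketch, not a restatement
of the patch's get_count_order() expression.

```c
#include <stdint.h>

#define COS_SHIFT 32    /* the quoted patch places COS at bit 32 and up */

/* Mask wide enough to hold any COS value in [0, cos_max]. */
static uint64_t cos_field_mask(unsigned int cos_max)
{
    unsigned int bits = 0;

    /* Count the bits needed to represent cos_max, capped at 32. */
    while ( bits < 32 && (cos_max >> bits) != 0 )
        bits++;

    return ((1ull << bits) - 1) << COS_SHIFT;
}

/* Replace only the COS field of reg, leaving the other bits untouched. */
static uint64_t assoc_set_cos(uint64_t reg, unsigned int cos,
                              uint64_t cos_mask)
{
    return (reg & ~cos_mask) | (((uint64_t)cos << COS_SHIFT) & cos_mask);
}
```

With cos_max = 15 the mask is 0xf00000000, and setting COS 5 on the value
0x0000000300000012 gives 0x0000000500000012: the low half (which carries
the RMID in the quoted code) is preserved.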
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v4 00/12] enable Cache Allocation Technology (CAT) for VMs
2015-04-09 9:18 [PATCH v4 00/12] enable Cache Allocation Technology (CAT) for VMs Chao Peng
` (11 preceding siblings ...)
2015-04-09 9:18 ` [PATCH v4 12/12] docs: add xl-psr.markdown Chao Peng
@ 2015-04-09 22:15 ` Andrew Cooper
12 siblings, 0 replies; 37+ messages in thread
From: Andrew Cooper @ 2015-04-09 22:15 UTC (permalink / raw)
To: Chao Peng, xen-devel
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, will.auld,
JBeulich, wei.liu2, dgdegra
On 09/04/2015 10:18, Chao Peng wrote:
> Changes in v4:
> * Address comments from Andrew and Ian(Detail in patch).
> * Split COS/CBM management patch into 4 small patches.
Thank you for this - it was substantially easier to review as a result.
I think it is looking in fairly good shape now, but I think you do need
some better synchronisation around the cbm refcounting.
~Andrew
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v4 02/12] x86: improve psr scheduling code
2015-04-09 21:01 ` Andrew Cooper
@ 2015-04-10 7:24 ` Chao Peng
2015-04-10 9:28 ` Andrew Cooper
0 siblings, 1 reply; 37+ messages in thread
From: Chao Peng @ 2015-04-10 7:24 UTC (permalink / raw)
To: Andrew Cooper
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, xen-devel,
will.auld, JBeulich, wei.liu2, dgdegra
On Thu, Apr 09, 2015 at 10:01:53PM +0100, Andrew Cooper wrote:
> On 09/04/2015 10:18, Chao Peng wrote:
> > +static inline void psr_assoc_init(unsigned int cpu)
> > +{
> > + struct psr_assoc *psra = &per_cpu(psr_assoc, cpu);
> > +
> > + if ( psr_cmt_enabled() )
> > + rdmsrl(MSR_IA32_PSR_ASSOC, psra->val);
> > +}
>
> On further consideration, this would probably be better as a void
> function which used this_cpu() rather than per_cpu().
>
> Absolutely nothing good can come of calling it with cpu !=
> smp_processor_id(), so we should avoid that situation arising in the
> first place.
Agreed.
> > +static void psr_cpu_init(unsigned int cpu)
> > +{
> > + psr_assoc_init(cpu);
> > +}
>
> This can also turn into a void helper.
This is, however, a little different. The next patch will add
cat_cpu_init() which will make use of this 'cpu' parameter. So do you
mean calling smp_processor_id() in cat_cpu_init() as well?
Thanks,
Chao
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v4 04/12] x86: maintain COS to CBM mapping for each socket
2015-04-09 21:35 ` Andrew Cooper
@ 2015-04-10 7:26 ` Chao Peng
0 siblings, 0 replies; 37+ messages in thread
From: Chao Peng @ 2015-04-10 7:26 UTC (permalink / raw)
To: Andrew Cooper
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, xen-devel,
will.auld, JBeulich, wei.liu2, dgdegra
On Thu, Apr 09, 2015 at 10:35:43PM +0100, Andrew Cooper wrote:
> On 09/04/2015 10:18, Chao Peng wrote:
> > struct psr_cat_socket_info {
> > bool_t initialized;
> > bool_t enabled;
> > unsigned int cbm_len;
> > unsigned int cos_max;
> > + struct psr_cat_cbm *cos_cbm_map;
>
> "cos_to_cmb" would be more in keeping with Xen style, and IMO easier to
> read in code.
Yeah, I like this name.
>
> > };
> >
> > struct psr_assoc {
> > @@ -240,6 +246,14 @@ static void cat_cpu_init(unsigned int cpu)
> > info->cbm_len = (eax & 0x1f) + 1;
> > info->cos_max = (edx & 0xffff);
>
> Apologies for missing this in the previous patch, but cos_max should
> have a command line parameter like rmid_max if a lower limit wants to be
> enforced.
OK, I will add it.
Thanks,
Chao
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v4 05/12] x86: maintain socket CPU mask for CAT
2015-04-09 21:45 ` Andrew Cooper
@ 2015-04-10 7:33 ` Chao Peng
2015-04-10 9:48 ` Andrew Cooper
0 siblings, 1 reply; 37+ messages in thread
From: Chao Peng @ 2015-04-10 7:33 UTC (permalink / raw)
To: Andrew Cooper
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, xen-devel,
will.auld, JBeulich, wei.liu2, dgdegra
On Thu, Apr 09, 2015 at 10:45:41PM +0100, Andrew Cooper wrote:
> On 09/04/2015 10:18, Chao Peng wrote:
> > Some CAT resources/registers exist at socket level and they must be
> > accessed from a CPU of the corresponding socket. It's common to pick
> > an arbitrary CPU from the socket. To make the picking easy, it's useful
> > to maintain a reference to the cpu_core_mask which contains all the
> > siblings of a CPU in the same socket. The reference needs to be
> > synchronized with the CPU up/down.
> >
> > Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> > ---
> > xen/arch/x86/psr.c | 24 ++++++++++++++++++++++++
> > 1 file changed, 24 insertions(+)
> >
> > diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> > index 4aff5f6..7de2504 100644
> > --- a/xen/arch/x86/psr.c
> > +++ b/xen/arch/x86/psr.c
> > @@ -32,6 +32,7 @@ struct psr_cat_socket_info {
> > unsigned int cbm_len;
> > unsigned int cos_max;
> > struct psr_cat_cbm *cos_cbm_map;
> > + cpumask_t *socket_cpu_mask;
>
> At a pinch, you could get away with just "cpus" as a variable name, as
> it is part of a structure, and instanced in an array, with "socket" in
> the name.
I like 'cpus' too.
>
> > };
> >
> > struct psr_assoc {
> > @@ -234,6 +235,8 @@ static void cat_cpu_init(unsigned int cpu)
> > ASSERT(socket < nr_sockets);
> >
> > info = cat_socket_info + socket;
> > + if ( info->socket_cpu_mask == NULL )
> > + info->socket_cpu_mask = per_cpu(cpu_core_mask, cpu);
>
> Surely after the test_and_set_bool() ?
OK.
>
> >
> > /* Avoid initializing more than one times for the same socket. */
> > if ( test_and_set_bool(info->initialized) )
> > @@ -274,6 +277,24 @@ static void psr_cpu_init(unsigned int cpu)
> > psr_assoc_init(cpu);
> > }
> >
> > +static void psr_cpu_fini(unsigned int cpu)
>
> cat_cpu_fini() to mirror cat_cpu_init() or perhaps both?
How about: Change this one to cat_cpu_fini() and add helper
psr_cpu_fini() to call cat_cpu_fini()?
>
> > +{
> > + unsigned int socket, next;
> > + cpumask_t *cpu_mask;
> > +
> > + if ( cat_socket_info )
> > + {
> > + socket = cpu_to_socket(cpu);
> > + cpu_mask = cat_socket_info[socket].socket_cpu_mask;
> > +
> > + if ( (next = cpumask_cycle(cpu, cpu_mask)) == cpu )
> > + cat_socket_info[socket].socket_cpu_mask = NULL;
> > + else
> > + cat_socket_info[socket].socket_cpu_mask =
> > + per_cpu(cpu_core_mask, next);
>
> Might it be easier to copy cpu_core_mask rather than playing these games
> to avoid pointing into a stale per_cpu() area?
I didn't quite follow you on this. Even with a copied cpu_core_mask, we
still need to sync it with CPU online/offline; otherwise its value may
not be correct. The key point here, I think, is that we should always be
able to trust per_cpu(cpu_core_mask, cpu) during a CPU's life cycle;
otherwise both copying it and pointing to it would be wrong.
Chao
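
One possible reading of the "copy the mask" idea discussed above is to keep
a per-socket set of online CPUs that is owned by the CAT code and updated
on CPU up/down, so that nothing points into another CPU's per-CPU area. The
standalone sketch below illustrates only that bookkeeping. Everything in it
is hypothetical: MAX_SOCKETS, MAX_CPUS, socket_cpus[] and the three
functions are invented for the example, and a 64-bit word stands in for a
cpumask_t.

```c
#include <stdint.h>

#define MAX_SOCKETS 4
#define MAX_CPUS    64

/* Per-socket copy: bit n set means CPU n of that socket is online. */
static uint64_t socket_cpus[MAX_SOCKETS];

/* Would be called from the CPU_STARTING path. */
static void socket_cpu_up(unsigned int socket, unsigned int cpu)
{
    if ( socket < MAX_SOCKETS && cpu < MAX_CPUS )
        socket_cpus[socket] |= 1ull << cpu;
}

/* Would be called from the CPU_DYING path. */
static void socket_cpu_down(unsigned int socket, unsigned int cpu)
{
    if ( socket < MAX_SOCKETS && cpu < MAX_CPUS )
        socket_cpus[socket] &= ~(1ull << cpu);
}

/* Pick the lowest online CPU of the socket; MAX_CPUS means "none". */
static unsigned int socket_any_cpu(unsigned int socket)
{
    unsigned int cpu;

    if ( socket >= MAX_SOCKETS )
        return MAX_CPUS;

    for ( cpu = 0; cpu < MAX_CPUS; cpu++ )
        if ( socket_cpus[socket] & (1ull << cpu) )
            return cpu;

    return MAX_CPUS;
}
```

The copy still has to be kept in sync with CPU online/offline, as noted
above; what changes is that taking the last CPU of a socket offline leaves
an empty set rather than a pointer that has to be redirected or cleared.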
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v4 06/12] x86: add COS information for each domain
2015-04-09 21:54 ` Andrew Cooper
@ 2015-04-10 7:35 ` Chao Peng
0 siblings, 0 replies; 37+ messages in thread
From: Chao Peng @ 2015-04-10 7:35 UTC (permalink / raw)
To: Andrew Cooper
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, xen-devel,
will.auld, JBeulich, wei.liu2, dgdegra
On Thu, Apr 09, 2015 at 10:54:00PM +0100, Andrew Cooper wrote:
> On 09/04/2015 10:18, Chao Peng wrote:
> > + {
> > + d->arch.psr_cos_ids = xzalloc_array(unsigned int, nr_sockets);
>
> It is perhaps worth leaving a comment in patch 3 stating that nr_sockets
> must never change after domains have been created.
>
> That, or whoever implements CAT/socket hotplug support has to fix this
> issue. (I think the code is fine to leave in its current state, but
> nr_sockets changing under the feet of Xen will be a subtle bug for
> someone to track down)
>
> Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
That's right, keeping nr_sockets constant at runtime is the basic
assumption.
Chao
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v4 08/12] x86: dynamically get/set CBM for a domain
2015-04-09 22:06 ` Andrew Cooper
@ 2015-04-10 7:37 ` Chao Peng
0 siblings, 0 replies; 37+ messages in thread
From: Chao Peng @ 2015-04-10 7:37 UTC (permalink / raw)
To: Andrew Cooper
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, xen-devel,
will.auld, JBeulich, wei.liu2, dgdegra
On Thu, Apr 09, 2015 at 11:06:45PM +0100, Andrew Cooper wrote:
> On 09/04/2015 10:18, Chao Peng wrote:
> > + find->cbm = cbm;
> > + }
> > + find->ref++;
> > + map[old_cos].ref--;
>
> This can race with psr_free_cos() leading to corruption. It is possible
> for two different domains to be holding their own spinlock but using the
> same cos index.
>
> You definitely do need a psr/cos spinlock for safety.
Indeed. Thanks.
Chao
* Re: [PATCH v4 09/12] x86: add scheduling support for Intel CAT
2015-04-09 22:12 ` Andrew Cooper
@ 2015-04-10 7:41 ` Chao Peng
0 siblings, 0 replies; 37+ messages in thread
From: Chao Peng @ 2015-04-10 7:41 UTC (permalink / raw)
To: Andrew Cooper
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, xen-devel,
will.auld, JBeulich, wei.liu2, dgdegra
On Thu, Apr 09, 2015 at 11:12:24PM +0100, Andrew Cooper wrote:
> On 09/04/2015 10:18, Chao Peng wrote:
> > On context switch, write the the domain's Class of Service(COS) to MSR
> > IA32_PQR_ASSOC, to notify hardware to use the new COS.
> >
> > For performance reason, the socket number and COS mask for current cpu
> > is also cached in the local per-CPU variable.
> >
> > Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> > ---
> > Changes in v2:
> > * merge common scheduling changes into scheduling improvement patch.
> > * use readable expr for psra->cos_mask.
> > ---
> > xen/arch/x86/psr.c | 33 ++++++++++++++++++++++++++++++++-
> > 1 file changed, 32 insertions(+), 1 deletion(-)
> >
> > diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> > index 5247bcd..046229d 100644
> > --- a/xen/arch/x86/psr.c
> > +++ b/xen/arch/x86/psr.c
> > @@ -37,6 +37,8 @@ struct psr_cat_socket_info {
> >
> > struct psr_assoc {
> > uint64_t val;
> > + unsigned int socket;
>
> psr_assoc is per-cpu. Why does it need to cache its own socket like this?
>
> If it does, please reorder socket and cos_mask to avoid the padding space.
I just wanted to eliminate the need to perform
'cpu_to_socket(smp_processor_id())' on each context switch.
Since it's cheap, I have no problem with not caching it here.
Chao
>
> > + uint64_t cos_mask;
> > };
* Re: [PATCH v4 12/12] docs: add xl-psr.markdown
2015-04-09 11:29 ` Andrew Cooper
@ 2015-04-10 7:45 ` Chao Peng
0 siblings, 0 replies; 37+ messages in thread
From: Chao Peng @ 2015-04-10 7:45 UTC (permalink / raw)
To: Andrew Cooper
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, xen-devel,
will.auld, JBeulich, wei.liu2, dgdegra
On Thu, Apr 09, 2015 at 12:29:38PM +0100, Andrew Cooper wrote:
> On 09/04/15 10:18, Chao Peng wrote:
> > Add document to introduce basic concepts and terms in PSR family
> > techonologies and the xl/libxl interfaces.
>
> Very nice! A few minor comments...
>
> >
> > Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> > ---
> > docs/man/xl.pod.1 | 7 +++
> > docs/misc/xl-psr.markdown | 111 ++++++++++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 118 insertions(+)
> > create mode 100644 docs/misc/xl-psr.markdown
> >
> > +## Cache Monitoring Technology (CMT)
> > +
> > +Cache Monitoring Technology (CMT) is a new feature available on Intel Haswell
> > +and later server platforms that allows an OS or Hypervisor/VMM to determine
> > +the usage of cache(currently only L3 cache supported)
>
> L3, or LLC? This appears to be used ambiguously, but does have a
> material impact for system with L4 caches.
As there is no working CMT+L4 combination at present, I'd keep it as L3 for now.
>
> > by applications running
> > +on the platform. A Resource Monitoring ID (RMID) is the abstraction of the
> > +application(s) that will be monitored for its cache usage. The CMT hardware
> > +tracks cache utilization of memory accesses according to the RMID and reports
> > +monitored data via a counter register.
> > +
> > +Detailed information please refer to Intel SDM chapter 17.14.
>
> Please put the chapter title as well, as the numbering does alter slowly
> over time.
Sure.
>
> > +
> > +In Xen's implementation, each domain in the system can be assigned a RMID
> > +independently, while RMID=0 is reserved for monitoring domains that doesn't
> > +enable CMT service. RMID is opaque for xl/libxl and is only used in
> > +hypervisor.
> > +
> > +### xl interfaces
> > +
> > +A domain is assigned a RMID implicitly by attaching it to CMT service:
> > +
> > +xl psr-cmt-attach domid
> > +
> > +After that, cache usage for the domain can be showed by:
> > +
> > +xl psr-cmt-show cache_occupancy <domid>
> > +
> > +Once monitoring is not needed any more, the domain can be detached from the
> > +CMT service by:
> > +
> > +xl psr-cmt-detach domid
> > +
> > +The attaching may fail because of no free RMID available. In such case
> > +unused RMID(s) can be freed by detaching corresponding domains from CMT
> > +services. Maximum COS number in the system can also be obtained by:
>
> You have not yet introduced COS as a term. Perhaps this bit is better
> moving down to the CAT section?
Good catch. Thanks.
>
> > +
> > +xl psr_cmt-show
>
> "psr-cmt-show"
>
> I am not sure how wise it is to dump information like max rmid/max cos
> into cmt-show.
>
> Is it perhaps worth having an `xl psr-hwinfo` (or equivalent) which will
> dump the hardware capabilities, per-socket limits etc, as a consise way
> to obtain all relevant information?
Sounds reasonable. I will do it.
>
> > +
> > +## Memory Bandwidth Monitoring (MBM)
> > +
> > +Memory Bandwidth Monitoring(MBM) is a new hardware feature available on Intel
> > +Broadwell and later server platforms which builds on the CMT infrastructure to
> > +allow monitoring of system memory bandwidth. It introduces two new monitoring
> > +event type to monitor system total/local memory bandwidth. The same RMID can
> > +be used to monitor both cache usage and memory bandwidth at the same time.
> > +
> > +Detailed information please refer to Intel SDM chapter 17.14.
> > +
> > +In Xen's implementation, MBM shares the same set of underlying monitoring
> > +service with CMT and can be used to monitor memory bandwidth on domain basis.
> > +
> > +The xl/libxl interface is the same with that of CMT. The difference is the
> > +monitor type is corresponding memory monitoring type(local_mem_bandwidth/
> > +total_mem_bandwidth) but not cache_occupancy.
> > +
> > +## Cache Allocation Technology (CAT)
> > +
> > +Cache Allocation Technology (CAT) is a new feature available on Intel
> > +Broadwell and later server platforms that allows an OS or Hypervisor/VMM to
> > +partition cache allocation(i.e. L3 cache) based on application priority or
> > +Class of Service(COS). Each COS is configured using capacity bitmasks (CBM)
> > +which represent cache capacity and indicate the degree of overlap and
> > +isolation between classes. System cache resource is divided into numbers of
> > +minimum portions which is then made up into subset for cache partition. Each
> > +portion corresponds to a bit in CBM and the set bit represents the
> > +corresponding cache portion is available.
> > +
> > +Detailed information please refer to Intel SDM chapter 17.15.
> > +
> > +In Xen's implementation, CBM can be set/get with libxl/xl interfaces but COS
>
> Strictly speaking that should be "set/got" in english, but "configured"
> would be a better alternative.
Exactly, thanks.
Chao
* Re: [PATCH v4 02/12] x86: improve psr scheduling code
2015-04-10 7:24 ` Chao Peng
@ 2015-04-10 9:28 ` Andrew Cooper
0 siblings, 0 replies; 37+ messages in thread
From: Andrew Cooper @ 2015-04-10 9:28 UTC (permalink / raw)
To: Chao Peng
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, xen-devel,
will.auld, JBeulich, wei.liu2, dgdegra
On 10/04/15 08:24, Chao Peng wrote:
> On Thu, Apr 09, 2015 at 10:01:53PM +0100, Andrew Cooper wrote:
>> On 09/04/2015 10:18, Chao Peng wrote:
>>> +static inline void psr_assoc_init(unsigned int cpu)
>>> +{
>>> + struct psr_assoc *psra = &per_cpu(psr_assoc, cpu);
>>> +
>>> + if ( psr_cmt_enabled() )
>>> + rdmsrl(MSR_IA32_PSR_ASSOC, psra->val);
>>> +}
>> On further consideration, this would probably be better as a void
>> function which used this_cpu() rather than per_cpu().
>>
>> Absolutely nothing good can come of calling it with cpu !=
>> smp_processor_id(), so we should avoid that situation arising in the
>> first place.
> Agreed.
>
>>> +static void psr_cpu_init(unsigned int cpu)
>>> +{
>>> + psr_assoc_init(cpu);
>>> +}
>> This can also turn into a void helper.
> This is, however, a little different. The next patch will add
> cat_cpu_init() which will make use of this 'cpu' parameter. So do you
> mean calling smp_processor_id() in cat_cpu_init() as well?
Yes - that is probably best. While cat_cpu_init() doesn't actually
interact with any MSRs, we still never want to be in a position to
execute it against the "wrong" cpu.
~Andrew
* Re: [PATCH v4 05/12] x86: maintain socket CPU mask for CAT
2015-04-10 7:33 ` Chao Peng
@ 2015-04-10 9:48 ` Andrew Cooper
0 siblings, 0 replies; 37+ messages in thread
From: Andrew Cooper @ 2015-04-10 9:48 UTC (permalink / raw)
To: Chao Peng
Cc: keir, Ian.Campbell, stefano.stabellini, Ian.Jackson, xen-devel,
will.auld, JBeulich, wei.liu2, dgdegra
On 10/04/15 08:33, Chao Peng wrote:
> On Thu, Apr 09, 2015 at 10:45:41PM +0100, Andrew Cooper wrote:
>> On 09/04/2015 10:18, Chao Peng wrote:
>>> Some CAT resource/registers exist in socket level and they must be
>>> accessed from the CPU of the corresponding socket. It's common to pick
>>> an arbitrary CPU from the socket. To make the picking easy, it's useful
>>> to maintain a reference to the cpu_core_mask which contains all the
>>> siblings of a CPU in the same socket. The reference needs to be
>>> synchronized with the CPU up/down.
>>>
>>> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
>>> ---
>>> xen/arch/x86/psr.c | 24 ++++++++++++++++++++++++
>>> 1 file changed, 24 insertions(+)
>>>
>>> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
>>> index 4aff5f6..7de2504 100644
>>> --- a/xen/arch/x86/psr.c
>>> +++ b/xen/arch/x86/psr.c
>>> @@ -32,6 +32,7 @@ struct psr_cat_socket_info {
>>> unsigned int cbm_len;
>>> unsigned int cos_max;
>>> struct psr_cat_cbm *cos_cbm_map;
>>> + cpumask_t *socket_cpu_mask;
>> At a pinch, you could get away with just "cpus" as a variable name, as
>> it is part of a structure, and instanced in an array, with "socket" in
>> the name.
> I like 'cpus' either.
>
>>> };
>>>
>>> struct psr_assoc {
>>> @@ -234,6 +235,8 @@ static void cat_cpu_init(unsigned int cpu)
>>> ASSERT(socket < nr_sockets);
>>>
>>> info = cat_socket_info + socket;
>>> + if ( info->socket_cpu_mask == NULL )
>>> + info->socket_cpu_mask = per_cpu(cpu_core_mask, cpu);
>> Surely after the test_and_set_bool() ?
> OK.
>
>>>
>>> /* Avoid initializing more than one times for the same socket. */
>>> if ( test_and_set_bool(info->initialized) )
>>> @@ -274,6 +277,24 @@ static void psr_cpu_init(unsigned int cpu)
>>> psr_assoc_init(cpu);
>>> }
>>>
>>> +static void psr_cpu_fini(unsigned int cpu)
>> cat_cpu_fini() to mirror cat_cpu_init() or perhaps both?
> How about: Change this one to cat_cpu_fini() and add helper
> psr_cpu_fini() to call cat_cpu_fini()?
Yes - looks best.
>
>>> +{
>>> + unsigned int socket, next;
>>> + cpumask_t *cpu_mask;
>>> +
>>> + if ( cat_socket_info )
>>> + {
>>> + socket = cpu_to_socket(cpu);
>>> + cpu_mask = cat_socket_info[socket].socket_cpu_mask;
>>> +
>>> + if ( (next = cpumask_cycle(cpu, cpu_mask)) == cpu )
>>> + cat_socket_info[socket].socket_cpu_mask = NULL;
>>> + else
>>> + cat_socket_info[socket].socket_cpu_mask =
>>> + per_cpu(cpu_core_mask, next);
>> Might it be easier to copy cpu_core_mask rather than playing these games
>> to avoid pointing into a stale per_cpu() area?
> I don't quite follow you on this. Even with a copied cpu_core_mask, we
> still need to sync it with cpu online/offline; otherwise its value may
> not be correct. The key point here, I think, is that we should always be
> able to trust per_cpu(cpu_core_mask, cpu) during a cpu's life cycle.
> If we can't, then both copying it and pointing to it would be wrong.
I suppose that this is all working around the fact that we don't
currently maintain "socket" state like we maintain node and core
information when cpus go offline/come online.
It would probably be best to fix that up properly, rather than to try
and retrofit part of it into the CAT infrastructure. I think you can get
away with simply introducing socket variants of "cpu_to_node" and
"node_to_cpumask". The underlying cpumasks themselves need not change.
~Andrew
* Re: [PATCH v4 11/12] tools: add tools support for Intel CAT
2015-04-09 9:18 ` [PATCH v4 11/12] tools: add tools support for Intel CAT Chao Peng
2015-04-09 10:50 ` Wei Liu
@ 2015-04-16 11:20 ` Ian Campbell
1 sibling, 0 replies; 37+ messages in thread
From: Ian Campbell @ 2015-04-16 11:20 UTC (permalink / raw)
To: Chao Peng
Cc: keir, stefano.stabellini, andrew.cooper3, Ian.Jackson, xen-devel,
will.auld, JBeulich, wei.liu2, dgdegra
On Thu, 2015-04-09 at 17:18 +0800, Chao Peng wrote:
> +=item B<psr-cat-cbm-set> [I<OPTIONS>] [I<domain-id>] [I<cbm>]
> +
> +Set cache capacity bitmasks(CBM) for a domain.
I can see from the example in the commit log that I<cbm> is a number,
but a) I can't tell that from these docs and b) I have no idea what that
number actually does based on these docs.
Also, are I<domain-id> and I<cbm> really optional to this command, as
implied by the surrounding []'s?
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 5eec092..3800738 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -718,6 +718,13 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src);
> * If this is defined, the Memory Bandwidth Monitoring feature is supported.
> */
> #define LIBXL_HAVE_PSR_MBM 1
> +
> +/*
> + * LIBXL_HAVE_PSR_CAT
> + *
> + * If this is defined, the Cache Allocation Technology feature is supported.
> + */
> +#define LIBXL_HAVE_PSR_CAT 1
> #endif
>
> typedef char **libxl_string_list;
> @@ -1513,6 +1520,25 @@ int libxl_psr_cmt_get_sample(libxl_ctx *ctx,
> uint64_t *tsc_r);
> #endif
>
> +#ifdef LIBXL_HAVE_PSR_CAT
> +
> +#define LIBXL_PSR_TARGET_ALL (~0U)
> +int libxl_psr_cat_set_cbm(libxl_ctx *ctx, uint32_t domid,
> + libxl_psr_cbm_type type, uint32_t target,
> + uint64_t cbm);
> +int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t domid,
> + libxl_psr_cbm_type type, uint32_t target,
> + uint64_t *cbm_r);
> +
> +/*
> + * On success, the function returns an array of elements in 'info',
> + * and the length in 'nr'. 'info' is from malloc so it must be freed
> + * by the caller.
This is true of all libxl interfaces and is documented generically near
the top, so no need to repeat it here.
It's also strictly speaking required to call libxl_psr_cat_info_dispose
on each list element. We usually provide a libxl_psr_cat_info_list_free
helper which does all of the disposes and then frees the containing
array. See libxl_vcpuinfo_list_free (and a bunch of others near it in
libxl_utils.c) for example.
> + */
> +int libxl_psr_cat_get_l3_info(libxl_ctx *ctx, libxl_psr_cat_info **info,
> + uint32_t *nr);
> @@ -247,6 +308,99 @@ out:
> return rc;
> }
>
> +int libxl_psr_cat_set_cbm(libxl_ctx *ctx, uint32_t domid,
> + libxl_psr_cbm_type type, uint32_t target,
> + uint64_t cbm)
> +{
> + GC_INIT(ctx);
> + int rc;
> + uint32_t i, nr_sockets;
> +
> + if (target != LIBXL_PSR_TARGET_ALL) {
> + rc = xc_psr_cat_set_domain_data(ctx->xch, domid, type, target, cbm);
Per CODING_STYLE a variable called rc must only ever contain a libxl
error code, which xc_* doesn't return. You should use a second variable
called r or some such.
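A minimal, standalone sketch of that convention follows. The names
xc_style_call() and set_cbm_sketch() are made up for illustration, and
the ERROR_FAIL value is only a stand-in for the real libxl definition:

```c
#include <errno.h>

#define ERROR_FAIL (-3)  /* stand-in; the real value comes from libxl */

/* Hypothetical stand-in for an xc_* call: returns -1 and sets errno on failure. */
static int xc_style_call(int fail)
{
    if (fail) {
        errno = EINVAL;
        return -1;
    }
    return 0;
}

/* 'r' takes the raw xc-style result; 'rc' only ever holds a libxl error code. */
static int set_cbm_sketch(int fail)
{
    int rc = 0;
    int r;

    r = xc_style_call(fail);
    if (r < 0) {
        rc = ERROR_FAIL;
        goto out;
    }

out:
    return rc;
}
```

Keeping the two variables apart means a raw -1 from the lower layer can
never leak out to the caller as if it were a libxl error code.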
> + if (rc < 0) {
> + libxl__psr_cat_log_err_msg(gc, errno);
> + rc = ERROR_FAIL;
> + }
> + } else {
> + rc = libxl__count_physical_sockets(gc, &nr_sockets);
> + if (rc) {
> + LOGE(ERROR, "failed to get system socket count");
> + rc = ERROR_FAIL;
In this case rc should already (correctly) contain a libxl error code
from the call to libxl__count_physical_sockets, which should be
propagated.
> + goto out;
> + }
> + for (i = 0; i < nr_sockets; i++) {
> + rc = xc_psr_cat_set_domain_data(ctx->xch, domid, type, i, cbm);
But here again you should use a variable other than rc.
> + if (rc < 0) {
> + libxl__psr_cat_log_err_msg(gc, errno);
> + rc = ERROR_FAIL;
> + goto out;
> + }
> + }
> + }
> +
> +out:
> + GC_FREE;
> + return rc;
> +}
> +
> +int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t domid,
> + libxl_psr_cbm_type type, uint32_t target,
> + uint64_t *cbm_r)
> +{
> + GC_INIT(ctx);
> + int rc;
> +
> + rc = xc_psr_cat_get_domain_data(ctx->xch, domid, type, target, cbm_r);
Don't use rc here.
> + if (rc < 0) {
> + libxl__psr_cat_log_err_msg(gc, errno);
> + rc = ERROR_FAIL;
> + }
> +
> + GC_FREE;
> + return rc;
> +}
> +
> +int libxl_psr_cat_get_l3_info(libxl_ctx *ctx, libxl_psr_cat_info **info,
> + uint32_t *nr)
> +{
> + GC_INIT(ctx);
> + int rc;
> + uint32_t i, nr_sockets;
> + libxl_psr_cat_info *ptr;
> +
> + rc = libxl__count_physical_sockets(gc, &nr_sockets);
> + if (rc) {
> + LOGE(ERROR, "failed to get system socket count");
> + rc = ERROR_FAIL;
Don't overwrite rc here.
> + goto out;
> + }
> +
> + ptr = malloc(nr_sockets * sizeof(libxl_psr_cat_info));
> + if (!ptr) {
As Wei says use libxl__malloc with NOGC and then you don't need to check
for failure.
> + LOGE(ERROR, "failed to allocate cat info");
> + rc = ERROR_FAIL;
> + goto out;
> + }
> +
> + for (i = 0; i < nr_sockets; i++) {
> + rc = xc_psr_cat_get_l3_info(ctx->xch, i, &ptr[i].cos_max,
> + &ptr[i].cbm_len);
Don't use rc here.
> + if (rc) {
> + libxl__psr_cat_log_err_msg(gc, errno);
> + rc = ERROR_FAIL;
> + free(ptr);
> + goto out;
> + }
> + }
> +
> + *info = ptr;
> + *nr = nr_sockets;
> +out:
> + GC_FREE;
> + return rc;
> +}
> +
> /*
> * Local variables:
> * mode: C
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index 04faf98..0a5f436 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> +
> + printf("\n");
> +}
> +
> +static int psr_cat_print_domain_cbm(uint32_t domid, uint32_t socket)
> +{
> + int i, nr_domains;
> + libxl_dominfo *list;
> +
> + if (domid != INVALID_DOMID) {
> + psr_cat_print_one_domain_cbm(domid, socket);
> + return 0;
> + }
> +
> + if (!(list = libxl_list_domain(ctx, &nr_domains))) {
> + fprintf(stderr, "Failed to get domain list for cbm display\n");
> + return -1;
> + }
> +
> + for (i = 0; i < nr_domains; i++)
> + psr_cat_print_one_domain_cbm(list[i].domid, socket);
> + libxl_dominfo_list_free(list, nr_domains);
> +
> + printf("\n");
Since psr_cat_print_one_domain_cbm ends with this too do you not end up
with an extra blank line?
> +static int psr_cat_show(uint32_t domid)
> +{
> + uint32_t socket, nr_sockets;
> + int rc;
> + libxl_psr_cat_info *info;
> +
> + rc = libxl_psr_cat_get_l3_info(ctx, &info, &nr_sockets);
> + if (rc) {
> + fprintf(stderr, "Failed to get cat info\n");
> + return rc;
> + }
> +
> + for (socket = 0; socket < nr_sockets; socket++) {
> + rc = psr_cat_print_socket(domid, socket, info + socket);
> + if (rc)
> + goto out;
> + }
> +
> +out:
> + free(info);
Should use (to be added) libxl_psr_cat_info_list_free.
> + if (strlen(ptr) > 2 && ptr[0] == '0' && ptr[1] == 'x')
> + cbm = strtoll(ptr, NULL , 16);
> + else
> + cbm = strtoll(ptr, NULL , 10);
Passing 0 as the base (third) parameter to strtoll does the right thing
wrt 0x prefixes etc. No need to replicate manually.
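As a rough standalone sketch of what I mean (parse_cbm is a hypothetical
helper name, not something in the patch, and I have used strtoull here
on the assumption that the result ends up in a uint64_t):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a CBM given as a decimal or 0x-prefixed
 * hexadecimal string. Returns 0 on success, or -1 if nothing parses,
 * trailing characters remain, or the value overflows.
 */
static int parse_cbm(const char *s, uint64_t *cbm)
{
    char *end;
    unsigned long long val;

    assert(s != NULL && cbm != NULL);

    errno = 0;
    /* Base 0: "0x..." is read as hex, a leading "0" as octal, otherwise decimal. */
    val = strtoull(s, &end, 0);
    if (errno != 0 || end == s || *end != '\0')
        return -1;

    *cbm = val;
    return 0;
}
```

One thing to be aware of is that base 0 also treats a leading "0" as
octal, so "010" parses as 8 rather than 10.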
A bunch of pretty minor comments but it is mostly looking good, thanks.
Ian.
* Re: [PATCH v4 12/12] docs: add xl-psr.markdown
2015-04-09 9:18 ` [PATCH v4 12/12] docs: add xl-psr.markdown Chao Peng
2015-04-09 11:29 ` Andrew Cooper
@ 2015-04-16 11:58 ` Ian Campbell
2015-04-17 14:39 ` Chao Peng
1 sibling, 1 reply; 37+ messages in thread
From: Ian Campbell @ 2015-04-16 11:58 UTC (permalink / raw)
To: Chao Peng
Cc: keir, stefano.stabellini, andrew.cooper3, Ian.Jackson, xen-devel,
will.auld, JBeulich, wei.liu2, dgdegra
On Thu, 2015-04-09 at 17:18 +0800, Chao Peng wrote:
> Add document to introduce basic concepts and terms in PSR family
> techonologies and the xl/libxl interfaces.
"technologies"
>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> ---
> docs/man/xl.pod.1 | 7 +++
> docs/misc/xl-psr.markdown | 111 ++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 118 insertions(+)
> create mode 100644 docs/misc/xl-psr.markdown
>
> diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
> index dfab921..b71d6e6 100644
> --- a/docs/man/xl.pod.1
> +++ b/docs/man/xl.pod.1
> @@ -1472,6 +1472,9 @@ occupancy monitoring share the same set of underlying monitoring service. Once
> a domain is attached to the monitoring service, monitoring data can be showed
> for any of these monitoring types.
>
> +See L<http://xenbits.xen.org/docs/unstable/misc/xl-psr.html> for more
> +informations.
> +
> =over 4
>
> =item B<psr-cmt-attach> [I<domain-id>]
> @@ -1501,6 +1504,9 @@ applications. In Xen implementation, CAT is used to control cache allocation
> on VM basis. To enforce cache on a specific domain, just set capacity bitmasks
> (CBM) for the domain.
>
> +See L<http://xenbits.xen.org/docs/unstable/misc/xl-psr.html> for more
> +informations.
Aha, this is what I was missing in the previous patch, thanks.
I think it would still be useful to either briefly explain what the
bitmap contains here or to explicitly refer people to this new document
for that bit of information.
information should be singular not plural both here and above.
> +
> =over 4
>
> =item B<psr-cat-cbm-set> [I<OPTIONS>] [I<domain-id>] [I<cbm>]
> @@ -1546,6 +1552,7 @@ And the following documents on the xen.org website:
> L<http://xenbits.xen.org/docs/unstable/misc/xl-network-configuration.html>
> L<http://xenbits.xen.org/docs/unstable/misc/xl-disk-configuration.txt>
> L<http://xenbits.xen.org/docs/unstable/misc/xsm-flask.txt>
> +L<http://xenbits.xen.org/docs/unstable/misc/xl-psr.html>
>
> For systems that don't automatically bring CPU online:
>
> diff --git a/docs/misc/xl-psr.markdown b/docs/misc/xl-psr.markdown
> new file mode 100644
> index 0000000..44f6f8c
> --- /dev/null
> +++ b/docs/misc/xl-psr.markdown
> @@ -0,0 +1,111 @@
> +# Intel Platform Shared Resource Monitoring/Control in xl/libxl
> +
> +This document introduces Intel Platform Shared Resource Monitoring/Control
> +technologies, their basic concepts and the xl/libxl interfaces.
AFAICT this document only covers xl and not libxl, which I think is fine
(this doc is for users, not developers of other toolstacks), but I think
you should omit mention of libxl.
Whether or not other users of libxl (i.e. the libvirt developers) would
appreciate a dev focused doc covering the libxl interface I'm not sure.
Given the libxl.h prototypes and this doc I suppose it will be pretty
straightforward.
BTW, do you know if someone is planning to work on libvirt integration
for this stuff?
> +
> +## Cache Monitoring Technology (CMT)
> +
> +Cache Monitoring Technology (CMT) is a new feature available on Intel Haswell
> +and later server platforms that allows an OS or Hypervisor/VMM to determine
> +the usage of cache(currently only L3 cache supported) by applications running
> +on the platform. A Resource Monitoring ID (RMID) is the abstraction of the
> +application(s) that will be monitored for its cache usage. The CMT hardware
> +tracks cache utilization of memory accesses according to the RMID and reports
> +monitored data via a counter register.
> +
> +Detailed information please refer to Intel SDM chapter 17.14.
> +
> +In Xen's implementation, each domain in the system can be assigned a RMID
> +independently, while RMID=0 is reserved for monitoring domains that doesn't
> +enable CMT service. RMID is opaque for xl/libxl and is only used in
> +hypervisor.
> +
> +### xl interfaces
> +
> +A domain is assigned a RMID implicitly by attaching it to CMT service:
> +
> +xl psr-cmt-attach domid
<domid> to match the syntax used in the next example?
If you wrap the command line examples in backticks (like `xl
do-a-thing`)
> +
> +After that, cache usage for the domain can be showed by:
> +
> +xl psr-cmt-show cache_occupancy <domid>
(Aside: "cache-occupancy" would be more in keeping with the interfaces,
oh well, you can fix if you feel like it, or not bother if you like).
> +Once monitoring is not needed any more, the domain can be detached from the
> +CMT service by:
> +
> +xl psr-cmt-detach domid
<domid> again?
> +
> +The attaching may fail because of no free RMID available. In such case
I think "An attach may fail..." here.
> +unused RMID(s) can be freed by detaching corresponding domains from CMT
> +services. Maximum COS number in the system can also be obtained by:
I think COS hasn't been defined at this point.
> +
> +xl psr_cmt-show
psr-cmt-show? (Hopefully, if the interface is actually inconsistent we
should fix it)
> +
> +## Memory Bandwidth Monitoring (MBM)
> +
> +Memory Bandwidth Monitoring(MBM) is a new hardware feature available on Intel
> +Broadwell and later server platforms which builds on the CMT infrastructure to
> +allow monitoring of system memory bandwidth. It introduces two new monitoring
> +event type to monitor system total/local memory bandwidth. The same RMID can
> +be used to monitor both cache usage and memory bandwidth at the same time.
> +
> +Detailed information please refer to Intel SDM chapter 17.14.
> +
> +In Xen's implementation, MBM shares the same set of underlying monitoring
> +service with CMT and can be used to monitor memory bandwidth on domain basis.
"...on a per domain basis"
> +
> +The xl/libxl interface is the same with that of CMT. The difference is the
> +monitor type is corresponding memory monitoring type(local_mem_bandwidth/
^ missing space
> +total_mem_bandwidth) but not cache_occupancy.
I think add: e.g. after an `xl psr-cmt-attach`:
`xl psr-cmt-show local_mem_bandwidth <domid>`
`xl psr-cmt-show total_mem_bandwidth <domid>`
to make it clear that the paragraph refers to the argument to
psr-cmt-show and that psr-cmt-attach is needed.
> +
> +## Cache Allocation Technology (CAT)
> +
> +Cache Allocation Technology (CAT) is a new feature available on Intel
> +Broadwell and later server platforms that allows an OS or Hypervisor/VMM to
> +partition cache allocation(i.e. L3 cache) based on application priority or
^ missing space
> +Class of Service(COS).
^ missing space
Here is the definition of COS I was looking for.
I'm not sure if the previous mention of "xl psr_cmt-show" getting the
maximum COS is relevant in the context of CMT where it is now, if not
then maybe move that here?
If it is then perhaps some reordering of the sections would allow COS to
be defined first?
> Each COS is configured using capacity bitmasks (CBM)
> +which represent cache capacity and indicate the degree of overlap and
> +isolation between classes. System cache resource is divided into numbers of
> +minimum portions which is then made up into subset for cache partition. Each
> +portion corresponds to a bit in CBM and the set bit represents the
> +corresponding cache portion is available.
> +
> +Detailed information please refer to Intel SDM chapter 17.15.
Perhaps a few simple examples, would make the basics clearer without
having to hit the SDM for the full gory detail e.g.
For example, assuming a system with 8 portions and 3 domains:
A CBM of 0xff for every domain means each domain can access the
whole cache. This is the default.
Giving one domain a CBM of 0x0F and the other two domains 0xF0
means that the first domain gets exclusive access to half of the
cache (half of the portions) and the other two will share the
other half.
Giving one domain a CBM of 0x0F, one 0x30 and the last 0xc0
would give the first domain exclusive access to half the cache,
and the other two exclusive access to one quarter each.
Then have the reference to the SDM for the more detailed stuff.
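The arithmetic behind those examples can be sketched in a few lines of
C. This is only an illustration; cbm_portions() and cbm_overlaps() are
names I made up, and nothing like them is in the series:

```c
#include <stdbool.h>
#include <stdint.h>

/* Count the cache portions (set bits) that a CBM grants. */
static unsigned int cbm_portions(uint64_t cbm)
{
    unsigned int n = 0;

    /* Each iteration clears the lowest set bit. */
    for (; cbm != 0; cbm &= cbm - 1)
        n++;
    return n;
}

/* Two CBMs share cache only if they have at least one portion in common. */
static bool cbm_overlaps(uint64_t a, uint64_t b)
{
    return (a & b) != 0;
}
```

With 8 portions, 0x0F and 0xF0 each cover 4 portions and do not
overlap, while two domains that both have 0xF0 overlap completely and
so share that half.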
> +
> +In Xen's implementation, CBM can be set/get with libxl/xl interfaces but COS
> +is maintained in hypervisor only. The cache partition granularity is per
> +domain, each domain has COS=0 assigned by default, the corresponding CBM is
> +all-ones, which means all the cache resource can be used by default.
> +
> +### xl interfaces
> +
> +The simplest way to change a domain's CBM from its default is running:
> +
> +xl psr-cat-cbm-set [OPTIONS] <domid> <cbm>
> +
> +where cbm is a decimal/hexadecimal number to represent the corresponding cache
> +subset can be used.
> +
> +A cbm is valid only when:
> +
> + * Set bits only exist in the range of [0, cbm_len), where cbm_len can be
> + obtained with 'xl psr-cat-show'.
> + * All the set bits is contiguous.
"are contiguous".
> + * Is not the same with the current cbm of the domain.
Can't we just implement this as a NOP and avoid this restriction?
If not then the text should read "Is not the same as the current..."
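For what it's worth, the first two rules above amount to a check along
these lines. This is a sketch only: cbm_is_valid() is a hypothetical
name, rejecting an empty mask is my assumption rather than something the
document states, and the "not the same as the current cbm" rule is
deliberately left out:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical checker: true if cbm is non-zero, has bits set only in
 * [0, cbm_len), and those bits form a single contiguous run.
 */
static bool cbm_is_valid(uint64_t cbm, unsigned int cbm_len)
{
    uint64_t low;

    /* Assumption: an empty mask is not a usable allocation. */
    if (cbm == 0 || cbm_len == 0 || cbm_len > 64)
        return false;

    /* Reject any bit at or above cbm_len. */
    if (cbm_len < 64 && (cbm >> cbm_len) != 0)
        return false;

    /* Adding the lowest set bit to a contiguous run carries out of the whole run. */
    low = cbm & (~cbm + 1);
    return ((cbm + low) & cbm) == 0;
}
```

With cbm_len = 8, the masks 0xff, 0x0F, 0x30 and 0xc0 all pass, while
0x05 (not contiguous) and 0x100 (out of range) are rejected.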
> +In multi-sockets system, the same cbm will be set to each socket by default.
"In a multi-socket system, the same cbm will be set on each socket...".
> +Per socket cbm can be specified with '--socket SOCKET' option.
"specified with the `--socket SOCKET` option".
> +
> +The cbm may be not set successfully because of no enough COS available.
Setting the CBM may not be successful if insufficient COS are available.
> In such
> +case unused COS(es) may be freed by setting CBM of all related domains to its
> +default value(all-ones).
> +
> +System CAT information(such as maximum COS and CBM length) and per domain CBM
^space
> +settings can be showed by:
"shown"
> +
> +xl psr-cat-show
> +
> +## Reference
> +
> +[1] Intel SDM
> +(http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html).
* Re: [PATCH v4 12/12] docs: add xl-psr.markdown
2015-04-16 11:58 ` Ian Campbell
@ 2015-04-17 14:39 ` Chao Peng
0 siblings, 0 replies; 37+ messages in thread
From: Chao Peng @ 2015-04-17 14:39 UTC (permalink / raw)
To: Ian Campbell
Cc: keir, stefano.stabellini, andrew.cooper3, Ian.Jackson, xen-devel,
will.auld, JBeulich, wei.liu2, dgdegra
On Thu, Apr 16, 2015 at 12:58:16PM +0100, Ian Campbell wrote:
> On Thu, 2015-04-09 at 17:18 +0800, Chao Peng wrote:
>
> BTW, do you know if someone is planning to work on libvirt integration
> for this stuff?
As far as I know, people from Intel will take care of this.
> (Aside: "cache-occupancy" would be more in keeping with the interfaces,
> oh well, you can fix if you feel like it, or not bother if you like).
It's fixed in v5.
> > +
> > +Detailed information please refer to Intel SDM chapter 17.15.
>
> Perhaps a few simple examples, would make the basics clearer without
> having to hit the SDM for the full gory detail e.g.
>
> For example, assuming a system with 8 portions and 3 domains:
>
> A CBM of 0xff for every domain means each domain can access the
> whole cache. This is the default.
>
> Giving one domain a CBM of 0x0F and the other two domains 0xF0
> means that the first domain gets exclusive access to half of the
> cache (half of the portions) and the other two will share the
> other half.
>
> Giving one domain a CBM of 0x0F, one 0x30 and the last 0xc0
> would give the first domain exclusive access to half the cache,
> and the other two exclusive access to one quarter each.
>
> Then have the reference to the SDM for the more detailed stuff.
Thank you, for typing so much ...
Chao
Thread overview: 37+ messages
2015-04-09 9:18 [PATCH v4 00/12] enable Cache Allocation Technology (CAT) for VMs Chao Peng
2015-04-09 9:18 ` [PATCH v4 01/12] x86: clean up psr boot parameter parsing Chao Peng
2015-04-09 20:38 ` Andrew Cooper
2015-04-09 9:18 ` [PATCH v4 02/12] x86: improve psr scheduling code Chao Peng
2015-04-09 21:01 ` Andrew Cooper
2015-04-10 7:24 ` Chao Peng
2015-04-10 9:28 ` Andrew Cooper
2015-04-09 9:18 ` [PATCH v4 03/12] x86: detect and initialize Intel CAT feature Chao Peng
2015-04-09 21:30 ` Andrew Cooper
2015-04-09 9:18 ` [PATCH v4 04/12] x86: maintain COS to CBM mapping for each socket Chao Peng
2015-04-09 21:35 ` Andrew Cooper
2015-04-10 7:26 ` Chao Peng
2015-04-09 9:18 ` [PATCH v4 05/12] x86: maintain socket CPU mask for CAT Chao Peng
2015-04-09 21:45 ` Andrew Cooper
2015-04-10 7:33 ` Chao Peng
2015-04-10 9:48 ` Andrew Cooper
2015-04-09 9:18 ` [PATCH v4 06/12] x86: add COS information for each domain Chao Peng
2015-04-09 21:54 ` Andrew Cooper
2015-04-10 7:35 ` Chao Peng
2015-04-09 9:18 ` [PATCH v4 07/12] x86: expose CBM length and COS number information Chao Peng
2015-04-09 21:54 ` Andrew Cooper
2015-04-09 9:18 ` [PATCH v4 08/12] x86: dynamically get/set CBM for a domain Chao Peng
2015-04-09 22:06 ` Andrew Cooper
2015-04-10 7:37 ` Chao Peng
2015-04-09 9:18 ` [PATCH v4 09/12] x86: add scheduling support for Intel CAT Chao Peng
2015-04-09 22:12 ` Andrew Cooper
2015-04-10 7:41 ` Chao Peng
2015-04-09 9:18 ` [PATCH v4 10/12] xsm: add CAT related xsm policies Chao Peng
2015-04-09 9:18 ` [PATCH v4 11/12] tools: add tools support for Intel CAT Chao Peng
2015-04-09 10:50 ` Wei Liu
2015-04-16 11:20 ` Ian Campbell
2015-04-09 9:18 ` [PATCH v4 12/12] docs: add xl-psr.markdown Chao Peng
2015-04-09 11:29 ` Andrew Cooper
2015-04-10 7:45 ` Chao Peng
2015-04-16 11:58 ` Ian Campbell
2015-04-17 14:39 ` Chao Peng
2015-04-09 22:15 ` [PATCH v4 00/12] enable Cache Allocation Technology (CAT) for VMs Andrew Cooper