xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Yi Sun <yi.y.sun@linux.intel.com>
To: xen-devel@lists.xenproject.org
Cc: kevin.tian@intel.com, wei.liu2@citrix.com,
	andrew.cooper3@citrix.com, dario.faggioli@citrix.com,
	ian.jackson@eu.citrix.com, Yi Sun <yi.y.sun@linux.intel.com>,
	julien.grall@arm.com, mengxu@cis.upenn.edu, jbeulich@suse.com,
	chao.p.peng@linux.intel.com, roger.pau@citrix.com
Subject: [PATCH v1 01/13] docs: create Memory Bandwidth Allocation (MBA) feature document
Date: Wed,  9 Aug 2017 15:41:40 +0800	[thread overview]
Message-ID: <1502264512-4648-2-git-send-email-yi.y.sun@linux.intel.com> (raw)
In-Reply-To: <1502264512-4648-1-git-send-email-yi.y.sun@linux.intel.com>

This patch creates MBA feature document in doc/features/. It describes
key points to implement MBA which is described in details in Intel SDM
"Introduction to Memory Bandwidth Allocation".

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v1:
    - remove a special character to avoid the error when building pandoc.
---
 docs/features/intel_psr_mba.pandoc | 247 +++++++++++++++++++++++++++++++++++++
 1 file changed, 247 insertions(+)
 create mode 100644 docs/features/intel_psr_mba.pandoc

diff --git a/docs/features/intel_psr_mba.pandoc b/docs/features/intel_psr_mba.pandoc
new file mode 100644
index 0000000..7a42edf
--- /dev/null
+++ b/docs/features/intel_psr_mba.pandoc
@@ -0,0 +1,247 @@
+% Intel Memory Bandwidth Allocation (MBA) Feature
+% Revision 1.2
+
+\clearpage
+
+# Basics
+
+---------------- ----------------------------------------------------
+         Status: **Tech Preview**
+
+Architecture(s): Intel x86
+
+   Component(s): Hypervisor, toolstack
+
+       Hardware: MBA is supported on Skylake Server and beyond
+---------------- ----------------------------------------------------
+
+# Terminology
+
+* CAT         Cache Allocation Technology
+* CBM         Capacity BitMasks
+* CDP         Code and Data Prioritization
+* COS/CLOS    Class of Service
+* MBA         Memory Bandwidth Allocation
+* MSRs        Machine Specific Registers
+* PSR         Intel Platform Shared Resource
+* THRTL       Throttle value or delay value
+
+# Overview
+
+The Memory Bandwidth Allocation (MBA) feature provides indirect and approximate
+control over memory bandwidth available per-core. This feature provides OS/
+hypervisor the ability to slow misbehaving apps/domains or create advanced
+closed-loop control system via exposing control over a credit-based throttling
+mechanism.
+
+# User details
+
+* Feature Enabling:
+
+  Add "psr=mba" to boot line parameter to enable MBA feature.
+
+* xl interfaces:
+
+  1. `psr-mba-show [domain-id]`:
+
+     Show memory bandwidth throttling for domain.
+
+  2. `psr-mba-set [OPTIONS] domain-id throttling`:
+
+     Set memory bandwidth throttling for domain.
+
+     Options:
+     '-s': Specify the socket to process, otherwise all sockets are processed.
+
+     Throttling value set in register implies memory bandwidth blocked, i.e.
+     higher throttling value results in lower bandwidth. The max throttling
+     value can be got through CPUID.
+
+     The response of the throttling value could be linear mode or non-linear
+     mode.
+
+     Linear mode: the input precision is defined as 100-(MBA_MAX). For instance,
+     if the MBA_MAX value is 90, the input precision is 10%. Values not an even
+     multiple of the precision (e.g., 12%) will be rounded down (e.g., to 10%
+     delay applied) by HW automatically.
+
+     Non-linear mode: input delay values are powers-of-two from zero to the
+     MBA_MAX value from CPUID. In this case any values not a power of two will
+     be rounded down the next nearest power of two by HW automatically.
+
+# Technical details
+
+MBA is a member of Intel PSR features, it shares the base PSR infrastructure
+in Xen.
+
+## Hardware perspective
+
+  MBA defines a range of MSRs to support specifying a delay value (Thrtl) per
+  COS, with details below.
+
+  ```
+   +----------------------------+----------------+
+   | MSR (per socket)           |    Address     |
+   +----------------------------+----------------+
+   | IA32_L2_QOS_Ext_BW_Thrtl_0 |     0xD50      |
+   +----------------------------+----------------+
+   | ...                        |  ...           |
+   +----------------------------+----------------+
+   | IA32_L2_QOS_Ext_BW_Thrtl_n | 0xD50+n (n<64) |
+   +----------------------------+----------------+
+  ```
+
+  When context switch happens, the COS ID of VCPU is written to per-thread MSR
+  `IA32_PQR_ASSOC`, and then hardware enforces bandwidth allocation according
+  to the throttling value stored in the COS register.
+
+## The relationship between MBA and CAT/CDP
+
+  Generally speaking, MBA is completely independent of CAT/CDP, and any
+  combination may be applied at any time, e.g. enabling MBA with CAT
+  disabled.
+
+  But it needs to be noticed that MBA shares COS infrastructure with CAT,
+  although MBA is enumerated by different CPUID leaf from CAT (which
+  indicates that the max COS of MBA may be different from CAT). In some
+  cases, a domain is permitted to have a COS that is beyond one (or more)
+  of PSR features but within the others. For instance, let's assume the max
+  COS of MBA is 8 but the max COS of L3 CAT is 16, when a domain is assigned
+  9 as COS, the L3 CAT CBM associated to COS 9 would be enforced, but for MBA,
+  the HW works as default value is set since COS 9 is beyond the max COS (8)
+  of MBA.
+
+## Design Overview
+
+* Core COS/Thrtl association
+
+  When enforcing Memory Bandwidth Allocation, all cores of domains have
+  the same default COS (COS0) which stores the same Thrtl (0). The default
+  COS is used only in hypervisor and is transparent to tool stack and user.
+
+  System administrator can change PSR allocation policy at runtime by
+  tool stack. Since MBA shares COS with CAT/CDP, a COS corresponds to a
+  2-tuple, like [CBM, Thrtl] with only-CAT enalbed, when CDP is enabled,
+  the COS corresponds to a 3-tuple, like [Code_CBM, Data_CBM, Thrtl]. If
+  neither CAT nor CDP is enabled, things would be easier, one COS
+  corresponds to one Thrtl.
+
+* VCPU schedule
+
+  This part reuses CAT COS infrastructure.
+
+* Multi-sockets
+
+  Different sockets may have different MBA ability (like max COS)
+  although it is consistent on the same socket. So the capability
+  of per-socket MBA is specified.
+
+  This part reuses CAT COS infrastructure.
+
+## Implementation Description
+
+* Hypervisor interfaces:
+
+  1. Boot line param: "psr=mba" to enable the feature.
+
+  2. SYSCTL:
+          - XEN_SYSCTL_PSR_MBA_get_info: Get system MBA information.
+
+  3. DOMCTL:
+          - XEN_DOMCTL_PSR_MBA_OP_GET_THRTL: Get throttling for a domain.
+          - XEN_DOMCTL_PSR_MBA_OP_SET_THRTL: Set throttling for a domain.
+
+* xl interfaces:
+
+  1. psr-mba-show [domain-id]
+          Show system/domain runtime MBA throttling value.
+          => XEN_SYSCTL_PSR_MBA_get_info/XEN_DOMCTL_PSR_MBA_OP_GET_THRTL
+
+  2. psr-mba-set [OPTIONS] domain-id throttling
+          Set bandwidth throttling for a domain.
+          => XEN_DOMCTL_PSR_MBA_OP_SET_THRTL
+
+  3. psr-hwinfo
+          Show PSR HW information, including L3 CAT/CDP/L2 CAT/MBA.
+          => XEN_SYSCTL_PSR_MBA_get_info
+
+* Key data structure:
+
+  1. Feature HW info
+
+     ```
+     struct {
+         unsigned int thrtl_max;
+         unsigned int linear;
+     } mba_info;
+
+     - Member `thrtl_max`
+
+       `thrtl_max` is the max throttling value to be set.
+
+     - Member `linear`
+
+       `linear` means the response of delay value is linear or not.
+
+     As mentioned above, MBA is a member of Intel PSR features, it would
+     share the base PSR infrastructure in Xen. For example, the 'cos_max'
+     is a common HW property for all features. So, for other data structure
+     details, please refer 'intel_psr_cat_cdp.pandoc'.
+
+# Limitations
+
+MBA can only work on HW which enables it (check by CPUID).
+
+# Testing
+
+We can execute these commands to verify MBA on different HWs supporting them.
+
+For example:
+    root@:~$ xl psr-hwinfo --mba
+    Memory Bandwidth Allocation (MBA):
+    Socket ID       : 0
+    Linear Mode     : Enabled
+    Maximum COS     : 7
+    Maximum Throttling Value: 90
+    Default Throttling Value: 0
+
+    root@:~$ xl psr-mba-set 1 0xa
+
+    root@:~$ xl psr-mba-show 1
+    Socket ID       : 0
+    Default THRTL   : 0
+       ID                     NAME            THRTL
+        1                 ubuntu14             0xa
+
+# Areas for improvement
+
+A hexadecimal number is used to show THRTL for a domain now. It may not be user-
+friendly.
+
+To improve this, the libxl interfaces can be wrapped in libvirt to provide more
+usr-friendly interfaces to user, e.g. a percentage number to show for linear
+mode.
+
+# Known issues
+
+N/A
+
+# References
+
+"INTEL RESOURCE DIRECTOR TECHNOLOGY (INTEL RDT) ALLOCATION FEATURES" [Intel 64 and IA-32 Architectures Software Developer Manuals, vol3](http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html)
+
+# History
+
+------------------------------------------------------------------------
+Date       Revision Version  Notes
+---------- -------- -------- -------------------------------------------
+2017-01-10 1.0      Xen 4.9  Design document written
+2017-07-10 1.1      Xen 4.10 Changes:
+                             1. Modify data structure according to latest
+                                codes;
+                             2. Add content for 'Areas for improvement';
+                             3. Other minor changes.
+2017-08-09 1.2      Xen 4.10 Changes:
+                             1. Remove a special character to avoid error when
+                                building pandoc.
+---------- -------- -------- -------------------------------------------
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

  reply	other threads:[~2017-08-09  7:57 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-09  7:41 [PATCH v1 00/13] Enable Memory Bandwidth Allocation in Xen Yi Sun
2017-08-09  7:41 ` Yi Sun [this message]
2017-08-14  7:35   ` [PATCH v1 01/13] docs: create Memory Bandwidth Allocation (MBA) feature document Chao Peng
2017-08-14  8:23     ` Yi Sun
2017-08-14  9:36       ` Chao Peng
2017-08-15 10:08   ` Wei Liu
2017-08-16  2:51     ` Yi Sun
2017-08-09  7:41 ` [PATCH v1 02/13] Rename PSR sysctl/domctl interfaces and xsm policy to make them be general Yi Sun
2017-08-15 10:12   ` Wei Liu
2017-08-16  2:48     ` Yi Sun
2017-08-15 14:03   ` Daniel De Graaf
2017-08-09  7:41 ` [PATCH v1 03/13] x86: rename 'cbm_type' to 'psr_val_type' to make it general Yi Sun
2017-08-15 10:13   ` Wei Liu
2017-08-16  2:17   ` Chao Peng
2017-08-09  7:41 ` [PATCH v1 04/13] x86: implement data structure and CPU init flow for MBA Yi Sun
2017-08-15 10:50   ` Wei Liu
2017-08-16  7:18     ` Yi Sun
2017-08-17  9:49       ` Wei Liu
2017-08-16  3:14   ` Chao Peng
2017-08-09  7:41 ` [PATCH v1 05/13] x86: implement get hw info " Yi Sun
2017-08-16  3:23   ` Chao Peng
2017-08-09  7:41 ` [PATCH v1 06/13] x86: implement get value interface " Yi Sun
2017-08-16  6:38   ` Chao Peng
2017-08-16  6:43     ` Yi Sun
2017-08-17  7:51       ` Chao Peng
2017-08-09  7:41 ` [PATCH v1 07/13] x86: implement set value flow " Yi Sun
2017-08-18  3:32   ` Chao Peng
2017-08-18  9:25     ` Yi Sun
2017-08-21  7:54       ` Chao Peng
2017-08-09  7:41 ` [PATCH v1 08/13] tools: create general interfaces to support psr allocation features Yi Sun
2017-08-21 10:12   ` Chao Peng
2017-08-22  2:38     ` Yi Sun
2017-08-22  6:42       ` Chao Peng
2017-08-09  7:41 ` [PATCH v1 09/13] tools: implement the new get hw info interface suitable to all " Yi Sun
2017-08-15 11:14   ` Wei Liu
2017-08-21 10:13   ` Chao Peng
2017-08-22  2:38     ` Yi Sun
2017-08-09  7:41 ` [PATCH v1 10/13] tools: rename 'xc_psr_cat_type' to 'xc_psr_val_type' Yi Sun
2017-08-15 11:15   ` Wei Liu
2017-08-21 10:13   ` Chao Peng
2017-08-09  7:41 ` [PATCH v1 11/13] tools: implemet new get value interface suitable for all psr allocation features Yi Sun
2017-08-15 11:24   ` Wei Liu
2017-08-21 10:14   ` Chao Peng
2017-08-22  2:24     ` Yi Sun
2017-08-22  6:44       ` Chao Peng
2017-08-09  7:41 ` [PATCH v1 12/13] tools: implemet new set " Yi Sun
2017-08-15 11:25   ` Wei Liu
2017-08-21 10:15     ` Chao Peng
2017-08-09  7:41 ` [PATCH v1 13/13] docs: add MBA description in docs Yi Sun
2017-08-15 11:26   ` Wei Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1502264512-4648-2-git-send-email-yi.y.sun@linux.intel.com \
    --to=yi.y.sun@linux.intel.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=chao.p.peng@linux.intel.com \
    --cc=dario.faggioli@citrix.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=jbeulich@suse.com \
    --cc=julien.grall@arm.com \
    --cc=kevin.tian@intel.com \
    --cc=mengxu@cis.upenn.edu \
    --cc=roger.pau@citrix.com \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).