From: Vikas Shivappa <vikas.shivappa@linux.intel.com>
To: vikas.shivappa@intel.com
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, hpa@zytor.com,
tglx@linutronix.de, mingo@kernel.org, tj@kernel.org,
peterz@infradead.org, matt.fleming@intel.com,
will.auld@intel.com, peter.zijlstra@intel.com,
h.peter.anvin@intel.com, kanaka.d.juvva@intel.com,
mtosatti@redhat.com, vikas.shivappa@linux.intel.com
Subject: [PATCH 7/7] x86/intel_rdt: Add Cache Allocation documentation and usage guide
Date: Mon, 11 May 2015 12:02:56 -0700
Message-ID: <1431370976-31115-8-git-send-email-vikas.shivappa@linux.intel.com>
In-Reply-To: <1431370976-31115-1-git-send-email-vikas.shivappa@linux.intel.com>
Add a description of Cache Allocation Technology, an overview of the
kernel implementation, and usage of the Cache Allocation cgroup interface.
Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
---
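Note (not part of the patch): the 'cache_mask' rules described in
section 1.4 of rdt.txt below can be sketched in C roughly as follows.
This is only an illustration under stated assumptions: a maximum CBM
length of 16 bits and a minimum of 2 set bits (as on hsw SKUs). The
names CBM_MAX_BITS, CBM_MIN_BITS, cbm_weight(), cbm_is_contiguous(),
cbm_is_valid() and cbm_size_kb() are hypothetical and are not taken
from the kernel implementation in this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; real limits come from the hardware. */
#define CBM_MAX_BITS 16u	/* example maximum CBM length */
#define CBM_MIN_BITS 2u		/* minimum number of set bits (hsw SKUs) */

/* Count the set bits in the mask. */
static unsigned int cbm_weight(uint32_t cbm)
{
	unsigned int n = 0;

	for (; cbm; cbm &= cbm - 1)
		n++;
	return n;
}

/* True if the set bits form one unbroken run, e.g. 0x0ff0 but not 0x0f0f. */
static bool cbm_is_contiguous(uint32_t cbm)
{
	uint32_t low;

	if (!cbm)
		return false;
	low = cbm & (~cbm + 1);		/* isolate the lowest set bit */
	cbm += low;			/* a single run carries into one bit (or zero) */
	return (cbm & (cbm - 1)) == 0;
}

/* Check a requested mask against the rules and the parent's mask. */
static bool cbm_is_valid(uint32_t cbm, uint32_t parent_cbm)
{
	if (cbm >> CBM_MAX_BITS)		/* bits beyond the maximum length */
		return false;
	if (cbm_weight(cbm) < CBM_MIN_BITS)	/* too few bits set */
		return false;
	if (!cbm_is_contiguous(cbm))
		return false;
	return (cbm & ~parent_cbm) == 0;	/* must be a subset of the parent */
}

/* Cache size in KB covered by the mask, given the total cache size in KB. */
static unsigned int cbm_size_kb(uint32_t cbm, unsigned int cache_kb)
{
	return cache_kb / CBM_MAX_BITS * cbm_weight(cbm);
}
```

Under these assumptions, cbm_is_valid(0xfff, 0xff) is false, which
corresponds to the group21 example in section 2, and
cbm_size_kb(0xf, 2048) evaluates to 512 for the 2MB cache example.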
Documentation/cgroups/rdt.txt | 206 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 206 insertions(+)
create mode 100644 Documentation/cgroups/rdt.txt
diff --git a/Documentation/cgroups/rdt.txt b/Documentation/cgroups/rdt.txt
new file mode 100644
index 0000000..1af77d5
--- /dev/null
+++ b/Documentation/cgroups/rdt.txt
@@ -0,0 +1,206 @@
+ RDT
+ ---
+
+Copyright (C) 2014 Intel Corporation
+Written by vikas.shivappa@linux.intel.com
+(based on contents and format from cpusets.txt)
+
+CONTENTS:
+=========
+
+1. Cache Allocation Technology
+ 1.1 What is RDT and Cache allocation?
+ 1.2 Why is Cache allocation needed?
+ 1.3 Cache allocation implementation overview
+ 1.4 Assignment of CBM and CLOS
+ 1.5 Scheduling and Context Switch
+2. Usage Examples and Syntax
+
+1. Cache Allocation Technology (Cache allocation)
+=================================================
+
+1.1 What is RDT and Cache allocation
+------------------------------------
+
+Cache allocation is part of Resource Director Technology (RDT), also
+called Platform Shared Resource Control, which provides support for
+controlling platform shared resources such as the L3 cache. Currently,
+cache is the only resource supported in RDT.
+More information can be found in the Intel SDM, Volume 3, section 17.15.
+
+Cache Allocation Technology provides a way for software (OS/VMM)
+to restrict cache allocation to a defined 'subset' of the cache, which
+may overlap with other 'subsets'. The restriction applies when
+allocating a line in the cache, i.e. when pulling new data into the cache.
+The hardware is programmed by writing MSRs.
+
+The different cache subsets are identified by a CLOS (class of
+service) identifier, and each CLOS has a CBM (cache bit mask). The CBM
+is a contiguous set of bits that defines the amount of cache resource
+available to each 'subset'.
+
+1.2 Why is Cache allocation needed
+----------------------------------
+
+The number of cores in modern processors keeps increasing,
+especially in large-scale usage models where VMs are used, such as
+webservers and datacenters. More cores increase the number
+of threads or workloads that can run simultaneously. When
+multi-threaded applications, VMs and workloads run concurrently, they
+compete for shared resources, including the L3 cache.
+
+Cache allocation enables more cache resources to be made available
+to higher priority applications, based on guidance from the execution
+environment.
+
+The architecture also allows these subsets to be changed dynamically at
+runtime to further optimize the performance of the higher priority
+application with minimal degradation to the low priority application.
+Additionally, resources can be rebalanced for system throughput
+benefit. (Refer to Section 17.15 in the Intel SDM.)
+
+This technique may be useful in managing large computer systems with
+large L3 caches, for example large servers running instances of
+webservers or database servers. In such complex systems, these subsets
+can be used for more careful placement of the available cache
+resources.
+
+1.3 Cache allocation implementation Overview
+--------------------------------------------
+
+The kernel implements a cgroup subsystem to support cache allocation.
+
+Each cgroup has a CLOSid <-> CBM (cache bit mask) mapping.
+A CLOS (class of service) is represented by a CLOSid. The CLOSid is
+internal to the kernel and is not exposed to the user. Each cgroup has
+one CBM and represents exactly one cache 'subset'.
+
+The cgroup follows the cgroup hierarchy; mkdir and adding tasks to the
+cgroup never fail. When a child cgroup is created, it inherits the
+CLOSid and the CBM from its parent. When a user changes the default
+CBM for a cgroup, a new CLOSid may be allocated if the CBM was not
+used before. Changing 'cache_mask' may fail with -ENOSPC once
+the kernel runs out of the CLOSids it can support.
+Users can create as many cgroups as they want, but the number of
+different CBMs in use at the same time is limited by the maximum number
+of CLOSids (multiple cgroups can have the same CBM).
+The kernel maintains a CLOSid <-> CBM mapping, which keeps a reference
+count of the cgroups using each CLOSid.
+
+The tasks in the cgroup get to fill the part of the L3 cache represented
+by the cgroup's 'cache_mask' file.
+
+The root directory has all available bits set in its 'cache_mask' file
+by default.
+
+1.4 Assignment of CBM and CLOS
+------------------------------
+
+The 'cache_mask' needs to be a subset of the parent node's
+'cache_mask'. Any contiguous subset of these bits (with a minimum of 2
+bits on hsw SKUs) may be set to indicate the desired cache mapping. The
+'cache_mask' of 2 directories can overlap. The 'cache_mask'
+represents the cache 'subset' of the Cache allocation cgroup. For example,
+on a system with a maximum of 16 CBM bits, if the directory has the least
+significant 4 bits set in its 'cache_mask' file (meaning the 'cache_mask'
+is just 0xf), it is allocated the right quarter of the last level
+cache, which means the tasks belonging to this Cache allocation cgroup
+can use the right quarter of the cache to fill. If it
+has the most significant 8 bits set, it is allocated the left
+half of the cache (8 bits out of 16 represents 50%).
+
+The cache portion defined in the 'cache_mask' file is available to all
+tasks within the cgroup to fill, and these tasks are not allowed to
+allocate space in other parts of the cache.
+
+1.5 Scheduling and Context Switch
+---------------------------------
+
+During a context switch, the kernel enforces the allocation by writing
+the CLOSid (internally maintained by the kernel) of the cgroup to which
+the task belongs to the CPU's IA32_PQR_ASSOC MSR. The MSR is only written
+when the CLOSid for the CPU changes, in order to minimize
+the latency incurred during context switch.
+
+The following measures are taken for the PQR MSR write so that it
+has minimal impact on the scheduling hot path:
+- This path does not exist on any non-Intel platforms.
+- On Intel platforms, it does not exist by default unless CGROUP_RDT
+is enabled.
+- It remains a no-op when CGROUP_RDT is enabled and the Intel hardware
+does not support the feature.
+- When the feature is available, it still remains a no-op until the user
+manually creates a cgroup *and* assigns a new cache mask. Since the
+child node inherits the parent's cache mask, cgroup creation alone has
+no scheduling hot path impact.
+- Per-CPU PQR values are cached and the MSR write is only done when
+a task with a different PQR is scheduled on the CPU. Typically, if
+the task groups are bound to be scheduled on a set of CPUs, the number
+of MSR writes is greatly reduced.
+
+2. Usage examples and syntax
+============================
+
+To check whether Cache allocation is enabled on your system:
+
+ dmesg | grep -i intel_rdt
+should output: intel_rdt: Max bitmask length: xx,Max ClosIds: xx
+The bitmask length and the number of CLOSids depend on the system you use.
+
+The following mounts the cache allocation cgroup subsystem and creates
+2 directories. Please refer to Documentation/cgroups/cgroups.txt for
+details about how to use cgroups.
+
+ cd /sys/fs/cgroup
+ mkdir rdt
+ mount -t cgroup -ointel_rdt intel_rdt /sys/fs/cgroup/rdt
+ cd rdt
+
+Create 2 rdt cgroups
+
+ mkdir group1
+ mkdir group2
+
+The following are some of the files in the directory
+
+ ls
+ rdt.cache_mask
+ tasks
+
+Assume the cache is 2MB and the CBM supports 16 bits, so that each bit
+represents 128KB of cache.
+
+Edit the CBM for group2 to set the least significant 4 bits. This
+allocates the 'right quarter' (512KB) of the cache to group2.
+
+ cd group2
+ /bin/echo 0xf > rdt.cache_mask
+
+
+Edit the CBM for group2 to set the least significant 8 bits. This
+allocates the right half of the cache to 'group2'. (The shell is
+already in the group2 directory.)
+
+ /bin/echo 0xff > rdt.cache_mask
+
+Assign tasks to group2
+
+ /bin/echo PID1 > tasks
+ /bin/echo PID2 > tasks
+
+ The threads PID1 and PID2 now
+ get to fill the 'right half' of
+ the cache, as they belong to cgroup group2.
+
+Create a group under group2 (from within the group2 directory)
+
+ mkdir group21
+ cd group21
+ cat rdt.cache_mask
+ 0xff - inherits parent's mask.
+
+ /bin/echo 0xfff > rdt.cache_mask - throws error as the mask has to be a subset of the parent's mask
+
+In order to restrict RDT cgroups to a specific set of CPUs, rdt can be
+co-mounted with cpusets.
+
--
1.9.1