public inbox for linux-kernel@vger.kernel.org
* [PATCH V7 00/41] Intel(R) Resource Director Technology Cache Pseudo-Locking enabling
@ 2018-06-22 22:41 Reinette Chatre
  2018-06-22 22:41 ` [PATCH V7 01/41] x86/intel_rdt: Provide pseudo-locking hooks within rdt_mount Reinette Chatre
                   ` (42 more replies)
  0 siblings, 43 replies; 98+ messages in thread
From: Reinette Chatre @ 2018-06-22 22:41 UTC
  To: tglx, fenghua.yu, tony.luck, vikas.shivappa
  Cc: gavin.hindman, jithu.joseph, dave.hansen, mingo, hpa, x86,
	linux-kernel, Reinette Chatre, David Howells

Dear Maintainers,

The Cache Pseudo-Locking enabling series that was recently merged to the
x86/cache branch of tip was found to conflict with the new kernfs support
for mounting with fs_context.

In preparation for a conflict-free merge between the two repos some no-op
hooks are created within the RDT mount function being changed by
the two features. The goal is for this commit to be placed on a minimal
no-rebase branch to be consumed by both features.

In an attempt to simplify this series it is named v7 and is a rebase
of x86/cache on top of the new no-op hooks. The patches were taken from
x86/cache and contain all signatures (Signed-off-by:, Acked-by:, Cc:,
Link:, etc.) as they were merged. Only one of these, "[PATCH V7 36/41]
x86/intel_rdt: Create character device exposing pseudo-locked region",
needed to be amended and the link within
the commit description thus does not point to the actual content of the
patch. You will find that the current content of the x86/cache branch
is exactly the same as what will be produced by this patch series.

Please let me know if there is a way in which I can modify the series to be
easier to manage.

The intention is for the first patch from this series to be placed on a
separate, minimal, never-rebased branch. x86/cache can include that branch
and, together with the rest of the series, obtain Cache Pseudo-Locking
support. The mount series can also include the minimal branch with the
single commit to complete the fs_context support for kernfs.

Cc: David Howells <dhowells@redhat.com>

No changes below. It is verbatim from previous submission (except for
diffstat at the end that reflects v7).

Changes since v4:
- All occurrences of seq_puts(f, one_char_string) replaced with
  seq_putc(f, one_char).
- Do not use unlikely() after debugfs_file_get().
- No error checking of debugfs directory and file creation return values.
  Specifically, do not let failures in debugging affect core
  functionality.
- Always include debug functionality. CONFIG_INTEL_RDT_DEBUGFS has been
  removed in this series.

Changes since v3:
- Rebase series on top of tip::x86/cache with HEAD:
  commit de73f38f768021610bd305cf74ef3702fcf6a1eb (tip/x86/cache)
  Author: Vikas Shivappa <vikas.shivappa@linux.intel.com>
  Date:   Fri Apr 20 15:36:21 2018 -0700

    x86/intel_rdt/mba_sc: Feedback loop to dynamically update mem bandwidth
- The final patch from the v3 submission is not included in this series.
  The large contiguous allocation work it depends on is actively discussed
  and patches now at v2:
  http://lkml.kernel.org/r/20180503232935.22539-1-mike.kravetz@oracle.com
  At this time it seems that the large contiguous allocation API may change
  in future versions so I plan to resubmit the final patch when that is
  finalized. Until then we are limited to Cache Pseudo-Locked regions of 4MB.
- rdtgroup_cbm_overlaps() now returns bool instead of int.
- rdtgroup_mode_test_exclusive() now returns bool instead of int.
- Respect the tabular formatting in rdt_cbm_parse_data struct declaration.
- In rdt_bit_usage_show() use test to exit loop earlier and thus spare an
  indentation level of code that follows.
- Follow recommendations of recent additions to checkpatch.pl:
  -- Prefer 'help' over '---help---' for new Kconfig help texts.
  -- Include SPDX-License-Identifier tag in new files.


The last patch of this series depends on the series:
"[RFC PATCH 0/3] Interface for higher order contiguous allocations"
submitted at:
http://lkml.kernel.org/r/20180212222056.9735-1-mike.kravetz@oracle.com
A new version of this was submitted recently and is currently being discussed
at:
http://lkml.kernel.org/r/20180417020915.11786-1-mike.kravetz@oracle.com
Without this upstream MM work (and patch 39/39 of this series) it would
just not be possible to create pseudo-locked regions larger than 4MB. To
simplify this work we could temporarily drop the last patch of this
series until the upstream MM work is complete.

Changes since v2:
- Introduce resource group "modes" and a new resctrl file "mode" associated
  with each resource group that exposes the associated resource group's mode.
  A resource group's mode is used by the system administrator to enable or
  disable resource sharing between resource groups. A resource group in
  "shareable" mode allows its allocations to be shared with other resource
  groups. This is the default mode and reflects existing behavior. A resource
  group in "exclusive" mode does not allow any sharing of its allocated
  resources. When a schemata is written to any resource group it is not
  allowed to overlap with allocations of any resource group that is in
  "exclusive" mode. A resource group's allocations are not allowed to overlap
  at the time it is set to be "exclusive".  Cache pseudo-locking builds on
  "exclusive" mode and is supported using two new modes: "pseudo-locksetup"
  lets the user indicate that this resource group will be used by a
  pseudo-locked region. A subsequent write of a schemata to the "schemata"
  file will create the corresponding pseudo-locked region and the mode will
  then automatically change to "pseudo-locked".
- A resource group's mode can only be changed to "pseudo-locksetup" if the
  platform has been verified to support cache pseudo-locking and the
  resource group is unused. Unused means that no monitoring is in progress
  and no tasks or cpus are assigned to the resource group. Once a resource
  group enters "pseudo-locksetup" it becomes "locked down" such that no
  new tasks or cpus can be assigned to it. Neither can any new monitoring
  be started.
- Each resource group obtains a new "size" file that mirrors the schemata
  file to display the size in bytes of each allocation. There is a difference
  in the implementation from the review feedback. In the review feedback an
  example of output was:
     L2:0=128K;1=256K;
     L3:0=1M;1=2M;
  Within the kernel I could find many examples of support for user _input_ with
  mem suffixes. This is broadly supported with lib/cmdline.c:memparse().
  I was not able to find as clear support or usage of such flexible
  _output_ of size. My conclusion was that the output of size tends to always
  be using the same unit. I also found that printing the size in one unit, in
  this case bytes, does simplify validation.
- A new "bit_usage" file within the info/<resource> sub-directories contains
  annotated bitmaps of how the resources are used.
- Cache pseudo-locked regions are now associated 1:1 with a resource group.
- Do not make any changes to capacity bitmask (CBM) associated with the
  default class-of-service (CLOS). If a pseudo-locked region is requested its
  cache region has to be unused at the time of request.
- Second mutex removed.
- Tabular fashion respected when making struct changes.
- Lifetime of pseudo-locked region (by extension the resource group it
  belongs to) connected to mmap region.
- Do not call preempt_disable() and local_irq_save(). Only local_irq_disable().
- Improve comments in pseudo-locking loop to explain why prefetcher needs
  disabling.
- Ensure that possibility of pseudo-locked region success takes into
  account all levels of cache in the hierarchy, not just the level at which
  it is requested.
- Preloading of code was suggested in review to improve pseudo-locking
  success. We have since been able to connect a hardware debugger to our
  target platform and with current locking flow we are able to lock 100%
  of kernel memory into the cache of an Intel(R) Celeron(R) Processor J3455.
- Above testing with hardware debugger revealed that speculative execution
  of the loop loads data beyond the end of the buffer. Add a read barrier
  to the locking loops to prevent this speculation.
- The name of the debugfs file used to trigger measurements was changed
  from "measure_trigger" to "pseudo_lock_measure".

Changes since v1:
- Enable allocation of contiguous regions larger than what SLAB allocators
  can support. This removes the 4MB Cache Pseudo-Locking limitation
  documented in v1 submission.
  This depends on "mm: drop hotplug lock from lru_add_drain_all",
  now in v4.16-rc1 as 9852a7212324fd25f896932f4f4607ce47b0a22f.
- Convert to debugfs_file_get() and -put() from the now obsolete
  debugfs_use_file_start() and debugfs_use_file_finish() calls.
- Rebase on top of, and take into account, recent L2 CDP enabling.
- Simplify tracing output to print cache hit and miss counts on the same line.


Dear Maintainers,

Cache Allocation Technology (CAT), part of Intel(R) Resource Director
Technology (Intel(R) RDT), enables a user to specify the amount of cache
space into which an application can fill. Cache pseudo-locking builds on
the fact that a CPU can still read and write data pre-allocated outside
its current allocated area on cache hit. With cache pseudo-locking data
can be preloaded into a reserved portion of cache that no application can
fill, and from that point on will only serve cache hits. The cache
pseudo-locked memory is made accessible to user space where an application
can map it into its virtual address space and thus have a region of
memory with reduced average read latency.

The cache pseudo-locking approach relies on generation-specific behavior
of processors. It may provide benefits on certain processor generations,
but is not guaranteed to be supported in the future. It is not a guarantee
that data will remain in the cache. It is not a guarantee that data will
remain in certain levels or certain regions of the cache. Rather, cache
pseudo-locking increases the probability that data will remain in a certain
level of the cache via carefully configuring the CAT feature and carefully
controlling application behavior.

Known limitations:
Instructions like INVD, WBINVD, CLFLUSH, etc. can still evict pseudo-locked
memory from the cache. Power management C-states may still shrink or power
off cache causing eviction of cache pseudo-locked memory. We utilize
PM QoS to prevent entering deeper C-states on cores associated with cache
pseudo-locked regions at the time they (the pseudo-locked regions) are
created.

Known software limitation (FIXED IN V2):
Cache pseudo-locked regions are currently limited to 4MB, even on
platforms that support larger cache sizes. Work is in progress to
support larger regions.

Graphs visualizing the benefits of cache pseudo-locking on an Intel(R)
NUC NUC6CAYS (it has an Intel(R) Celeron(R) Processor J3455) with the
default 2GB DDR3L-1600 memory are available. In these tests the patches
from this series were applied on the x86/cache branch of tip.git at the
time the HEAD was:

commit 87943db7dfb0c5ee5aa74a9ac06346fadd9695c8 (tip/x86/cache)
Author: Reinette Chatre <reinette.chatre@intel.com>
Date:   Fri Oct 20 02:16:59 2017 -0700
    x86/intel_rdt: Fix potential deadlock during resctrl mount

DISCLAIMER: Tests document performance of components on a particular test,
in specific systems. Differences in hardware, software, or configuration
will affect actual performance. Performance varies depending on system
configuration.

- https://github.com/rchatre/data/blob/master/cache_pseudo_locking/rfc_v1/perfcount.png
Above shows the few L2 cache misses possible with cache pseudo-locking
on the Intel(R) NUC with default configuration. Each test, which is
repeated 100 times, pseudo-locks the schemata shown and then measures from
the kernel via precision counters the number of cache misses when
accessing the memory afterwards. This test is run on an idle system as
well as a system with significant noise (using stress-ng) from a
neighboring core associated with the same cache. This plot shows us that:
(1) the number of cache misses remains consistent irrespective of the size
of region being pseudo-locked, and (2) the number of cache misses for a
pseudo-locked region remains low when traversing memory regions ranging
in size from 256KB (4096 cache lines) to 896KB (14336 cache lines).

- https://github.com/rchatre/data/blob/master/cache_pseudo_locking/rfc_v1/userspace_malloc_with_load.png
Above shows the read latency experienced by an application running with
default CAT CLOS after it allocated 256KB memory with malloc() (and using
mlockall()). In this example the application reads randomly (to not trigger
hardware prefetcher) from its entire allocated region at 2 second intervals
while there is a noisy neighbor present. Each individual access is 32 bytes
in size and the latency of each access is measured using the rdtsc
instruction. In this visualization we can observe two groupings of data,
the group with lower latency indicating cache hits, and the group with
higher latency indicating cache misses. We can see a significant portion
of memory reads experience larger latencies.

- https://github.com/rchatre/data/blob/master/cache_pseudo_locking/rfc_v1/userspace_psl_with_load.png
Above plots a similar test as the previous, but instead of the application
reading from a 256KB malloc() region it reads from a 256KB pseudo-locked
region that was mmap()'ed into its address space. When comparing these
latencies to that of regular malloc() latencies we do see a significant
improvement in latencies experienced.

- https://github.com/rchatre/data/blob/master/cache_pseudo_locking/rfc_v1/userspace_malloc_and_cat_with_load_clos0_fixed.png
Applications that are sensitive to latencies may use existing CAT
technology to isolate the sensitive application. In this plot we show an
application running with a dedicated CAT CLOS double the size (512KB) of
the memory being tested (256KB). A dedicated CLOS with CBM 0x0f is created and
the default CLOS changed to CBM 0xf0. We see in this plot that even though
the application runs within a dedicated portion of cache it still
experiences significant latency accessing its memory (when compared to
pseudo-locking).

Your feedback about this proposal for enabling of Cache Pseudo-Locking
will be greatly appreciated.

Regards,

Reinette




Ingo Molnar (1):
  x86/intel_rdt: Simplify index type

Reinette Chatre (40):
  x86/intel_rdt: Provide pseudo-locking hooks within rdt_mount
  x86/intel_rdt: Document new mode, size, and bit_usage
  x86/intel_rdt: Introduce RDT resource group mode
  x86/intel_rdt: Associate mode with each RDT resource group
  x86/intel_rdt: Introduce resource group's mode resctrl file
  x86/intel_rdt: Introduce test to determine if closid is in use
  x86/intel_rdt: Make useful functions available internally
  x86/intel_rdt: Initialize new resource group with sane defaults
  x86/intel_rdt: Introduce new "exclusive" mode
  x86/intel_rdt: Enable setting of exclusive mode
  x86/intel_rdt: Making CBM name and type more explicit
  x86/intel_rdt: Support flexible data to parsing callbacks
  x86/intel_rdt: Ensure requested schemata respects mode
  x86/intel_rdt: Introduce "bit_usage" to display cache allocations
    details
  x86/intel_rdt: Display resource groups' allocations' size in bytes
  x86/intel_rdt: Documentation for Cache Pseudo-Locking
  x86/intel_rdt: Introduce the Cache Pseudo-Locking modes
  x86/intel_rdt: Respect read and write access
  x86/intel_rdt: Add utility to test if tasks assigned to resource group
  x86/intel_rdt: Add utility to restrict/restore access to resctrl files
  x86/intel_rdt: Protect against resource group changes during locking
  x86/intel_rdt: Utilities to restrict/restore access to specific files
  x86/intel_rdt: Add check to determine if monitoring in progress
  x86/intel_rdt: Introduce pseudo-locked region
  x86/intel_rdt: Support enter/exit of locksetup mode
  x86/intel_rdt: Enable entering of pseudo-locksetup mode
  x86/intel_rdt: Split resource group removal in two
  x86/intel_rdt: Add utilities to test pseudo-locked region possibility
  x86/intel_rdt: Discover supported platforms via prefetch disable bits
  x86/intel_rdt: Pseudo-lock region creation/removal core
  x86/intel_rdt: Support creation/removal of pseudo-locked region
  x86/intel_rdt: Resctrl files reflect pseudo-locked information
  x86/intel_rdt: Ensure RDT cleanup on exit
  x86/intel_rdt: Create resctrl debug area
  x86/intel_rdt: Create debugfs files for pseudo-locking testing
  x86/intel_rdt: Create character device exposing pseudo-locked region
  x86/intel_rdt: More precise L2 hit/miss measurements
  x86/intel_rdt: Support L3 cache performance event of Broadwell
  x86/intel_rdt: Limit C-states dynamically when pseudo-locking active
  x86/intel_rdt: Fix passing of value to 32-bit register

 Documentation/x86/intel_rdt_ui.txt            |  377 ++++-
 arch/x86/kernel/cpu/Makefile                  |    4 +-
 arch/x86/kernel/cpu/intel_rdt.c               |   11 +
 arch/x86/kernel/cpu/intel_rdt.h               |  142 +-
 arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c   |  129 +-
 arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c   | 1505 +++++++++++++++++
 .../kernel/cpu/intel_rdt_pseudo_lock_event.h  |   43 +
 arch/x86/kernel/cpu/intel_rdt_rdtgroup.c      |  798 ++++++++-
 8 files changed, 2936 insertions(+), 73 deletions(-)
 create mode 100644 arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
 create mode 100644 arch/x86/kernel/cpu/intel_rdt_pseudo_lock_event.h

-- 
2.17.0


* [PATCH V5 15/38] x86/intel_rdt: Documentation for Cache Pseudo-Locking
@ 2018-05-29 12:57 Reinette Chatre
  2018-06-20  0:20 ` [tip:x86/cache] " tip-bot for Reinette Chatre
  0 siblings, 1 reply; 98+ messages in thread
From: Reinette Chatre @ 2018-05-29 12:57 UTC
  To: tglx, fenghua.yu, tony.luck, vikas.shivappa
  Cc: gavin.hindman, jithu.joseph, dave.hansen, mingo, hpa, x86,
	linux-kernel, Reinette Chatre

Add description of Cache Pseudo-Locking feature, its interface,
as well as an example of its usage.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 Documentation/x86/intel_rdt_ui.txt | 280 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 278 insertions(+), 2 deletions(-)

diff --git a/Documentation/x86/intel_rdt_ui.txt b/Documentation/x86/intel_rdt_ui.txt
index de913e00e922..bcd0a6d2fcf8 100644
--- a/Documentation/x86/intel_rdt_ui.txt
+++ b/Documentation/x86/intel_rdt_ui.txt
@@ -29,7 +29,11 @@ mount options are:
 L2 and L3 CDP are controlled seperately.
 
 RDT features are orthogonal. A particular system may support only
-monitoring, only control, or both monitoring and control.
+monitoring, only control, or both monitoring and control.  Cache
+pseudo-locking is a unique way of using cache control to "pin" or
+"lock" data in the cache. Details can be found in
+"Cache Pseudo-Locking".
+
 
 The mount succeeds if either of allocation or monitoring is present, but
 only those files and directories supported by the system will be created.
@@ -86,6 +90,8 @@ related to allocation:
 			      and available for sharing.
 			"E" - Corresponding region is used exclusively by
 			      one resource group. No sharing allowed.
+			"P" - Corresponding region is pseudo-locked. No
+			      sharing allowed.
 
 Memory bandwitdh(MB) subdirectory contains the following files
 with respect to allocation:
@@ -192,7 +198,12 @@ When control is enabled all CTRL_MON groups will also contain:
 "mode":
 	The "mode" of the resource group dictates the sharing of its
 	allocations. A "shareable" resource group allows sharing of its
-	allocations while an "exclusive" resource group does not.
+	allocations while an "exclusive" resource group does not. A
+	cache pseudo-locked region is created by first writing
+	"pseudo-locksetup" to the "mode" file before writing the cache
+	pseudo-locked region's schemata to the resource group's "schemata"
+	file. On successful pseudo-locked region creation the mode will
+	automatically change to "pseudo-locked".
 
 When monitoring is enabled all MON groups will also contain:
 
@@ -410,6 +421,170 @@ L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
 L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
 L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
 
+Cache Pseudo-Locking
+--------------------
+CAT enables a user to specify the amount of cache space that an
+application can fill. Cache pseudo-locking builds on the fact that a
+CPU can still read and write data pre-allocated outside its current
+allocated area on a cache hit. With cache pseudo-locking, data can be
+preloaded into a reserved portion of cache that no application can
+fill, and from that point on will only serve cache hits. The cache
+pseudo-locked memory is made accessible to user space where an
+application can map it into its virtual address space and thus have
+a region of memory with reduced average read latency.
+
+The creation of a cache pseudo-locked region is triggered by a request
+from the user to do so that is accompanied by a schemata of the region
+to be pseudo-locked. The cache pseudo-locked region is created as follows:
+- Create a CAT allocation CLOSNEW with a CBM matching the schemata
+  from the user of the cache region that will contain the pseudo-locked
+  memory. This region must not overlap with any current CAT allocation/CLOS
+  on the system and no future overlap with this cache region is allowed
+  while the pseudo-locked region exists.
+- Create a contiguous region of memory of the same size as the cache
+  region.
+- Flush the cache, disable hardware prefetchers, disable preemption.
+- Make CLOSNEW the active CLOS and touch the allocated memory to load
+  it into the cache.
+- Set the previous CLOS as active.
+- At this point the closid CLOSNEW can be released - the cache
+  pseudo-locked region is protected as long as its CBM does not appear in
+  any CAT allocation. Even though the cache pseudo-locked region will from
+  this point on not appear in any CBM of any CLOS an application running with
+  any CLOS will be able to access the memory in the pseudo-locked region since
+  the region continues to serve cache hits.
+- The contiguous region of memory loaded into the cache is exposed to
+  user-space as a character device.
+
+Cache pseudo-locking increases the probability that data will remain
+in the cache via carefully configuring the CAT feature and controlling
+application behavior. There is no guarantee that data is placed in
+cache. Instructions like INVD, WBINVD, CLFLUSH, etc. can still evict
+“locked” data from cache. Power management C-states may shrink or
+power off cache. It is thus recommended to limit the processor maximum
+C-state, for example, by setting the processor.max_cstate kernel parameter.
+
+It is required that an application using a pseudo-locked region runs
+with affinity to the cores (or a subset of the cores) associated
+with the cache on which the pseudo-locked region resides. A sanity check
+within the code will not allow an application to map pseudo-locked memory
+unless it runs with affinity to cores associated with the cache on which the
+pseudo-locked region resides. The sanity check is only done during the
+initial mmap() handling; there is no enforcement afterwards and the
+application itself needs to ensure it remains affine to the correct cores.
+
+Pseudo-locking is accomplished in two stages:
+1) During the first stage the system administrator allocates a portion
+   of cache that should be dedicated to pseudo-locking. At this time an
+   equivalent portion of memory is allocated, loaded into allocated
+   cache portion, and exposed as a character device.
+2) During the second stage a user-space application maps (mmap()) the
+   pseudo-locked memory into its address space.
+
+Cache Pseudo-Locking Interface
+------------------------------
+A pseudo-locked region is created using the resctrl interface as follows:
+
+1) Create a new resource group by creating a new directory in /sys/fs/resctrl.
+2) Change the new resource group's mode to "pseudo-locksetup" by writing
+   "pseudo-locksetup" to the "mode" file.
+3) Write the schemata of the pseudo-locked region to the "schemata" file. All
+   bits within the schemata should be "unused" according to the "bit_usage"
+   file.
+
+On successful pseudo-locked region creation the "mode" file will contain
+"pseudo-locked" and a new character device with the same name as the resource
+group will exist in /dev/pseudo_lock. This character device can be mmap()'ed
+by user space in order to obtain access to the pseudo-locked memory region.
+
+An example of cache pseudo-locked region creation and usage can be found below.
+
+Cache Pseudo-Locking Debugging Interface
+---------------------------------------
+The pseudo-locking debugging interface is enabled by default (if
+CONFIG_DEBUG_FS is enabled) and can be found in /sys/kernel/debug/resctrl.
+
+There is no explicit way for the kernel to test if a provided memory
+location is present in the cache. The pseudo-locking debugging interface uses
+the tracing infrastructure to provide two ways to measure cache residency of
+the pseudo-locked region:
+1) Memory access latency using the pseudo_lock_mem_latency tracepoint. Data
+   from these measurements are best visualized using a hist trigger (see
+   example below). In this test the pseudo-locked region is traversed at
+   a stride of 32 bytes while hardware prefetchers and preemption
+   are disabled. This also provides a substitute visualization of cache
+   hits and misses.
+2) Cache hit and miss measurements using model specific precision counters if
+   available. Depending on the levels of cache on the system the pseudo_lock_l2
+   and pseudo_lock_l3 tracepoints are available.
+   WARNING: triggering this measurement uses from two (for just L2
+   measurements) to four (for L2 and L3 measurements) precision counters on
+   the system. If any other measurements are in progress the counters and
+   their corresponding event registers will be clobbered.
+
+When a pseudo-locked region is created a new debugfs directory is created for
+it in debugfs as /sys/kernel/debug/resctrl/<newdir>. A single
+write-only file, pseudo_lock_measure, is present in this directory. The
+measurement on the pseudo-locked region depends on the number, 1 or 2,
+written to this debugfs file. Since the measurements are recorded with the
+tracing infrastructure the relevant tracepoints need to be enabled before the
+measurement is triggered.
+
+Example of latency debugging interface:
+In this example a pseudo-locked region named "newlock" was created. Here is
+how we can measure the latency in cycles of reading from this region and
+visualize this data with a histogram that is available if CONFIG_HIST_TRIGGERS
+is set:
+# :> /sys/kernel/debug/tracing/trace
+# echo 'hist:keys=latency' > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/trigger
+# echo 1 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/enable
+# echo 1 > /sys/kernel/debug/resctrl/newlock/pseudo_lock_measure
+# echo 0 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/enable
+# cat /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/hist
+
+# event histogram
+#
+# trigger info: hist:keys=latency:vals=hitcount:sort=hitcount:size=2048 [active]
+#
+
+{ latency:        456 } hitcount:          1
+{ latency:         50 } hitcount:         83
+{ latency:         36 } hitcount:         96
+{ latency:         44 } hitcount:        174
+{ latency:         48 } hitcount:        195
+{ latency:         46 } hitcount:        262
+{ latency:         42 } hitcount:        693
+{ latency:         40 } hitcount:       3204
+{ latency:         38 } hitcount:       3484
+
+Totals:
+    Hits: 8192
+    Entries: 9
+   Dropped: 0
+
+Example of cache hits/misses debugging:
+In this example a pseudo-locked region named "newlock" was created on the L2
+cache of a platform. Here is how we can obtain details of the cache hits
+and misses using the platform's precision counters.
+
+# :> /sys/kernel/debug/tracing/trace
+# echo 1 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_l2/enable
+# echo 2 > /sys/kernel/debug/resctrl/newlock/pseudo_lock_measure
+# echo 0 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_l2/enable
+# cat /sys/kernel/debug/tracing/trace
+
+# tracer: nop
+#
+#                              _-----=> irqs-off
+#                             / _----=> need-resched
+#                            | / _---=> hardirq/softirq
+#                            || / _--=> preempt-depth
+#                            ||| /     delay
+#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
+#              | |       |   ||||       |         |
+ pseudo_lock_mea-1672  [002] ....  3132.860500: pseudo_lock_l2: hits=4097 miss=0
+
+
 Examples for RDT allocation usage:
 
 Example 1
@@ -596,6 +771,107 @@ A resource group cannot be forced to overlap with an exclusive resource group:
 # cat info/last_cmd_status
 overlaps with exclusive group
 
+Example of Cache Pseudo-Locking
+-------------------------------
+Lock a portion of the L2 cache with cache id 1 using CBM 0x3. The
+pseudo-locked region is exposed at /dev/pseudo_lock/newlock, which an
+application can open and pass to mmap().
+
+# mount -t resctrl resctrl /sys/fs/resctrl/
+# cd /sys/fs/resctrl
+
+Ensure that there are bits available that can be pseudo-locked. Since only
+unused bits can be pseudo-locked, the bits to be pseudo-locked need to be
+removed from the default resource group's schemata:
+# cat info/L2/bit_usage
+0=SSSSSSSS;1=SSSSSSSS
+# echo 'L2:1=0xfc' > schemata
+# cat info/L2/bit_usage
+0=SSSSSSSS;1=SSSSSS00
+
+Create a new resource group that will be associated with the pseudo-locked
+region, indicate that it will be used for a pseudo-locked region, and
+configure the requested pseudo-locked region capacity bitmask:
+
+# mkdir newlock
+# echo pseudo-locksetup > newlock/mode
+# echo 'L2:1=0x3' > newlock/schemata
+
+On success the resource group's mode will change to pseudo-locked, the
+bit_usage will reflect the pseudo-locked region, and the character device
+exposing the pseudo-locked region will exist:
+
+# cat newlock/mode
+pseudo-locked
+# cat info/L2/bit_usage
+0=SSSSSSSS;1=SSSSSSPP
+# ls -l /dev/pseudo_lock/newlock
+crw------- 1 root root 243, 0 Apr  3 05:01 /dev/pseudo_lock/newlock
+
+/*
+ * Example code to access one page of pseudo-locked cache region
+ * from user space.
+ */
+#define _GNU_SOURCE
+#include <fcntl.h>
+#include <sched.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <sys/mman.h>
+
+/*
+ * It is required that the application runs with affinity to only
+ * cores associated with the pseudo-locked region. Here the cpu
+ * is hardcoded for convenience of example.
+ */
+static int cpuid = 2;
+
+int main(int argc, char *argv[])
+{
+	cpu_set_t cpuset;
+	long page_size;
+	void *mapping;
+	int dev_fd;
+	int ret;
+
+	page_size = sysconf(_SC_PAGESIZE);
+
+	CPU_ZERO(&cpuset);
+	CPU_SET(cpuid, &cpuset);
+	ret = sched_setaffinity(0, sizeof(cpuset), &cpuset);
+	if (ret < 0) {
+		perror("sched_setaffinity");
+		exit(EXIT_FAILURE);
+	}
+
+	dev_fd = open("/dev/pseudo_lock/newlock", O_RDWR);
+	if (dev_fd < 0) {
+		perror("open");
+		exit(EXIT_FAILURE);
+	}
+
+	mapping = mmap(0, page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
+		       dev_fd, 0);
+	if (mapping == MAP_FAILED) {
+		perror("mmap");
+		close(dev_fd);
+		exit(EXIT_FAILURE);
+	}
+
+	/* Application interacts with pseudo-locked memory @mapping */
+
+	ret = munmap(mapping, page_size);
+	if (ret < 0) {
+		perror("munmap");
+		close(dev_fd);
+		exit(EXIT_FAILURE);
+	}
+
+	close(dev_fd);
+	exit(EXIT_SUCCESS);
+}
+
 Locking between applications
 ----------------------------
 
-- 
2.13.6
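
Note on the documentation added above: it states that on success the
resource group's mode reads "pseudo-locked". A minimal sketch of how a user
space program could check this before opening the character device is shown
below. The helper names is_pseudo_locked_mode() and group_is_pseudo_locked()
are hypothetical and not part of the patch, and the path to the mode file
(for example /sys/fs/resctrl/newlock/mode) assumes resctrl is mounted at
/sys/fs/resctrl, which the patch text itself does not state.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): return 1 if @text, the
 * contents of a resource group's mode file, names the pseudo-locked mode.
 * Only the first line is examined; a trailing newline is ignored.
 */
int is_pseudo_locked_mode(const char *text)
{
	static const char locked[] = "pseudo-locked";
	size_t len;

	assert(text != NULL);
	len = strcspn(text, "\n");

	return len == strlen(locked) && strncmp(text, locked, len) == 0;
}

/*
 * Hypothetical helper: read the mode file at @mode_path.
 * Returns 1 if the group is pseudo-locked, 0 if it is in another mode,
 * and -1 if the file cannot be opened or read.
 */
int group_is_pseudo_locked(const char *mode_path)
{
	char buf[64];
	FILE *f;

	assert(mode_path != NULL);
	f = fopen(mode_path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	return is_pseudo_locked_mode(buf);
}
```

A caller could invoke group_is_pseudo_locked() with the mode file of the
newlock group and only proceed to open /dev/pseudo_lock/newlock when it
returns 1. The comparison is exact, so "pseudo-locksetup" is not mistaken
for "pseudo-locked".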


