From: tip-bot for Reinette Chatre <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: tglx@linutronix.de, hpa@zytor.com, mingo@kernel.org,
linux-kernel@vger.kernel.org, reinette.chatre@intel.com
Subject: [tip:x86/cache] x86/intel_rdt: More precise L2 hit/miss measurements
Date: Sun, 24 Jun 2018 06:40:21 -0700 [thread overview]
Message-ID: <tip-8a2fc0e1bc0cd856101927188884d7c370b62188@git.kernel.org> (raw)
In-Reply-To: <06b1456da65b543479dac8d9493e41f92f175d6c.1529706536.git.reinette.chatre@intel.com>
Commit-ID: 8a2fc0e1bc0cd856101927188884d7c370b62188
Gitweb: https://git.kernel.org/tip/8a2fc0e1bc0cd856101927188884d7c370b62188
Author: Reinette Chatre <reinette.chatre@intel.com>
AuthorDate: Fri, 22 Jun 2018 15:42:28 -0700
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sun, 24 Jun 2018 15:35:48 +0200
x86/intel_rdt: More precise L2 hit/miss measurements
Intel Goldmont processors supports non-architectural precise events that
can be used to give us more insight into the success of L2 cache
pseudo-locking on these platforms.
Introduce a new measurement trigger that will enable two precise events,
MEM_LOAD_UOPS_RETIRED.L2_HIT and MEM_LOAD_UOPS_RETIRED.L2_MISS, while
accessing pseudo-locked data. A new tracepoint, pseudo_lock_l2, is
created to make these results visible to the user.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/06b1456da65b543479dac8d9493e41f92f175d6c.1529706536.git.reinette.chatre@intel.com
---
arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c | 145 ++++++++++++++++++++--
arch/x86/kernel/cpu/intel_rdt_pseudo_lock_event.h | 10 ++
2 files changed, 146 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
index 1c5e7f28da5b..9e808c07a7f0 100644
--- a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
+++ b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
@@ -23,6 +23,7 @@
#include <asm/cacheflush.h>
#include <asm/intel-family.h>
#include <asm/intel_rdt_sched.h>
+#include <asm/perf_event.h>
#include "intel_rdt.h"
@@ -63,6 +64,9 @@ static struct class *pseudo_lock_class;
* hardware prefetch disable bits are included here as they are documented
* in the SDM.
*
+ * When adding a platform here also add support for its cache events to
+ * measure_cycles_perf_fn()
+ *
* Return:
* If platform is supported, the bits to disable hardware prefetchers, 0
* if platform is not supported.
@@ -101,6 +105,16 @@ static u64 get_prefetch_disable_bits(void)
return 0;
}
+/*
+ * Helper to write 64bit value to MSR without tracing. Used when
+ * use of the cache should be restricted and use of registers used
+ * for local variables avoided.
+ */
+static inline void pseudo_wrmsrl_notrace(unsigned int msr, u64 val)
+{
+ __wrmsr(msr, (u32)(val & 0xffffffffULL), (u32)(val >> 32));
+}
+
/**
* pseudo_lock_minor_get - Obtain available minor number
* @minor: Pointer to where new minor number will be stored
@@ -834,6 +848,107 @@ static int measure_cycles_lat_fn(void *_plr)
return 0;
}
+static int measure_cycles_perf_fn(void *_plr)
+{
+ struct pseudo_lock_region *plr = _plr;
+ unsigned long long l2_hits, l2_miss;
+ u64 l2_hit_bits, l2_miss_bits;
+ unsigned long i;
+#ifdef CONFIG_KASAN
+ /*
+ * The registers used for local register variables are also used
+ * when KASAN is active. When KASAN is active we use regular variables
+ * at the cost of including cache access latency to these variables
+ * in the measurements.
+ */
+ unsigned int line_size;
+ unsigned int size;
+ void *mem_r;
+#else
+ register unsigned int line_size asm("esi");
+ register unsigned int size asm("edi");
+#ifdef CONFIG_X86_64
+ register void *mem_r asm("rbx");
+#else
+ register void *mem_r asm("ebx");
+#endif /* CONFIG_X86_64 */
+#endif /* CONFIG_KASAN */
+
+ /*
+ * Non-architectural event for the Goldmont Microarchitecture
+ * from Intel x86 Architecture Software Developer Manual (SDM):
+ * MEM_LOAD_UOPS_RETIRED D1H (event number)
+ * Umask values:
+ * L1_HIT 01H
+ * L2_HIT 02H
+ * L1_MISS 08H
+ * L2_MISS 10H
+ */
+
+ /*
+ * Start by setting flags for IA32_PERFEVTSELx:
+ * OS (Operating system mode) 0x2
+ * INT (APIC interrupt enable) 0x10
+ * EN (Enable counter) 0x40
+ *
+ * Then add the Umask value and event number to select performance
+ * event.
+ */
+
+ switch (boot_cpu_data.x86_model) {
+ case INTEL_FAM6_ATOM_GOLDMONT:
+ case INTEL_FAM6_ATOM_GEMINI_LAKE:
+ l2_hit_bits = (0x52ULL << 16) | (0x2 << 8) | 0xd1;
+ l2_miss_bits = (0x52ULL << 16) | (0x10 << 8) | 0xd1;
+ break;
+ default:
+ goto out;
+ }
+
+ local_irq_disable();
+ /*
+ * Call wrmsr direcly to avoid the local register variables from
+ * being overwritten due to reordering of their assignment with
+ * the wrmsr calls.
+ */
+ __wrmsr(MSR_MISC_FEATURE_CONTROL, prefetch_disable_bits, 0x0);
+ /* Disable events and reset counters */
+ pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0, 0x0);
+ pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 1, 0x0);
+ pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_PERFCTR0, 0x0);
+ pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_PERFCTR0 + 1, 0x0);
+ /* Set and enable the L2 counters */
+ pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0, l2_hit_bits);
+ pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 1, l2_miss_bits);
+ mem_r = plr->kmem;
+ size = plr->size;
+ line_size = plr->line_size;
+ for (i = 0; i < size; i += line_size) {
+ asm volatile("mov (%0,%1,1), %%eax\n\t"
+ :
+ : "r" (mem_r), "r" (i)
+ : "%eax", "memory");
+ }
+ /*
+ * Call wrmsr directly (no tracing) to not influence
+ * the cache access counters as they are disabled.
+ */
+ pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0,
+ l2_hit_bits & ~(0x40ULL << 16));
+ pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 1,
+ l2_miss_bits & ~(0x40ULL << 16));
+ l2_hits = native_read_pmc(0);
+ l2_miss = native_read_pmc(1);
+ wrmsr(MSR_MISC_FEATURE_CONTROL, 0x0, 0x0);
+ local_irq_enable();
+ trace_pseudo_lock_l2(l2_hits, l2_miss);
+
+out:
+ plr->thread_done = 1;
+ wake_up_interruptible(&plr->lock_thread_wq);
+ return 0;
+}
+
/**
* pseudo_lock_measure_cycles - Trigger latency measure to pseudo-locked region
*
@@ -844,12 +959,12 @@ static int measure_cycles_lat_fn(void *_plr)
*
* Return: 0 on success, <0 on failure
*/
-static int pseudo_lock_measure_cycles(struct rdtgroup *rdtgrp)
+static int pseudo_lock_measure_cycles(struct rdtgroup *rdtgrp, int sel)
{
struct pseudo_lock_region *plr = rdtgrp->plr;
struct task_struct *thread;
unsigned int cpu;
- int ret;
+ int ret = -1;
cpus_read_lock();
mutex_lock(&rdtgroup_mutex);
@@ -866,9 +981,19 @@ static int pseudo_lock_measure_cycles(struct rdtgroup *rdtgrp)
goto out;
}
- thread = kthread_create_on_node(measure_cycles_lat_fn, plr,
- cpu_to_node(cpu),
- "pseudo_lock_measure/%u", cpu);
+ if (sel == 1)
+ thread = kthread_create_on_node(measure_cycles_lat_fn, plr,
+ cpu_to_node(cpu),
+ "pseudo_lock_measure/%u",
+ cpu);
+ else if (sel == 2)
+ thread = kthread_create_on_node(measure_cycles_perf_fn, plr,
+ cpu_to_node(cpu),
+ "pseudo_lock_measure/%u",
+ cpu);
+ else
+ goto out;
+
if (IS_ERR(thread)) {
ret = PTR_ERR(thread);
goto out;
@@ -897,19 +1022,21 @@ static ssize_t pseudo_lock_measure_trigger(struct file *file,
size_t buf_size;
char buf[32];
int ret;
- bool bv;
+ int sel;
buf_size = min(count, (sizeof(buf) - 1));
if (copy_from_user(buf, user_buf, buf_size))
return -EFAULT;
buf[buf_size] = '\0';
- ret = strtobool(buf, &bv);
- if (ret == 0 && bv) {
+ ret = kstrtoint(buf, 10, &sel);
+ if (ret == 0) {
+ if (sel != 1 && sel != 2)
+ return -EINVAL;
ret = debugfs_file_get(file->f_path.dentry);
if (ret)
return ret;
- ret = pseudo_lock_measure_cycles(rdtgrp);
+ ret = pseudo_lock_measure_cycles(rdtgrp, sel);
if (ret == 0)
ret = count;
debugfs_file_put(file->f_path.dentry);
diff --git a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock_event.h b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock_event.h
index 3cd0fa27d5fe..efad50d2ee2f 100644
--- a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock_event.h
+++ b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock_event.h
@@ -15,6 +15,16 @@ TRACE_EVENT(pseudo_lock_mem_latency,
TP_printk("latency=%u", __entry->latency)
);
+TRACE_EVENT(pseudo_lock_l2,
+ TP_PROTO(u64 l2_hits, u64 l2_miss),
+ TP_ARGS(l2_hits, l2_miss),
+ TP_STRUCT__entry(__field(u64, l2_hits)
+ __field(u64, l2_miss)),
+ TP_fast_assign(__entry->l2_hits = l2_hits;
+ __entry->l2_miss = l2_miss;),
+ TP_printk("hits=%llu miss=%llu",
+ __entry->l2_hits, __entry->l2_miss));
+
#endif /* _TRACE_PSEUDO_LOCK_H */
#undef TRACE_INCLUDE_PATH
next prev parent reply other threads:[~2018-06-24 13:40 UTC|newest]
Thread overview: 98+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-22 22:41 [PATCH V7 00/41] Intel(R) Resource Director Technology Cache Pseudo-Locking enabling Reinette Chatre
2018-06-22 22:41 ` [PATCH V7 01/41] x86/intel_rdt: Provide pseudo-locking hooks within rdt_mount Reinette Chatre
2018-06-23 12:07 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:41 ` [PATCH V7 02/41] x86/intel_rdt: Document new mode, size, and bit_usage Reinette Chatre
2018-06-23 12:07 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:41 ` [PATCH V7 03/41] x86/intel_rdt: Introduce RDT resource group mode Reinette Chatre
2018-06-23 12:08 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:41 ` [PATCH V7 04/41] x86/intel_rdt: Associate mode with each RDT resource group Reinette Chatre
2018-06-23 12:08 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:41 ` [PATCH V7 05/41] x86/intel_rdt: Introduce resource group's mode resctrl file Reinette Chatre
2018-06-23 12:09 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:41 ` [PATCH V7 06/41] x86/intel_rdt: Introduce test to determine if closid is in use Reinette Chatre
2018-06-23 12:09 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:41 ` [PATCH V7 07/41] x86/intel_rdt: Make useful functions available internally Reinette Chatre
2018-06-23 12:10 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:41 ` [PATCH V7 08/41] x86/intel_rdt: Initialize new resource group with sane defaults Reinette Chatre
2018-06-23 12:10 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 09/41] x86/intel_rdt: Introduce new "exclusive" mode Reinette Chatre
2018-06-23 12:11 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 10/41] x86/intel_rdt: Enable setting of exclusive mode Reinette Chatre
2018-06-23 12:11 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 11/41] x86/intel_rdt: Making CBM name and type more explicit Reinette Chatre
2018-06-23 12:12 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 12/41] x86/intel_rdt: Support flexible data to parsing callbacks Reinette Chatre
2018-06-23 12:13 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 13/41] x86/intel_rdt: Ensure requested schemata respects mode Reinette Chatre
2018-06-23 12:13 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 14/41] x86/intel_rdt: Introduce "bit_usage" to display cache allocations details Reinette Chatre
2018-06-23 12:14 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 15/41] x86/intel_rdt: Display resource groups' allocations' size in bytes Reinette Chatre
2018-06-23 12:14 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 16/41] x86/intel_rdt: Documentation for Cache Pseudo-Locking Reinette Chatre
2018-06-23 12:15 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 17/41] x86/intel_rdt: Introduce the Cache Pseudo-Locking modes Reinette Chatre
2018-06-23 12:15 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 18/41] x86/intel_rdt: Respect read and write access Reinette Chatre
2018-06-23 12:16 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 19/41] x86/intel_rdt: Add utility to test if tasks assigned to resource group Reinette Chatre
2018-06-23 12:16 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 20/41] x86/intel_rdt: Add utility to restrict/restore access to resctrl files Reinette Chatre
2018-06-23 12:17 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 21/41] x86/intel_rdt: Protect against resource group changes during locking Reinette Chatre
2018-06-23 12:17 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 22/41] x86/intel_rdt: Utilities to restrict/restore access to specific files Reinette Chatre
2018-06-23 12:18 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 23/41] x86/intel_rdt: Add check to determine if monitoring in progress Reinette Chatre
2018-06-23 12:18 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 24/41] x86/intel_rdt: Introduce pseudo-locked region Reinette Chatre
2018-06-23 12:19 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 25/41] x86/intel_rdt: Support enter/exit of locksetup mode Reinette Chatre
2018-06-23 12:20 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 26/41] x86/intel_rdt: Enable entering of pseudo-locksetup mode Reinette Chatre
2018-06-23 12:20 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 27/41] x86/intel_rdt: Split resource group removal in two Reinette Chatre
2018-06-23 12:21 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 28/41] x86/intel_rdt: Add utilities to test pseudo-locked region possibility Reinette Chatre
2018-06-23 12:21 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 29/41] x86/intel_rdt: Discover supported platforms via prefetch disable bits Reinette Chatre
2018-06-23 12:22 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 30/41] x86/intel_rdt: Pseudo-lock region creation/removal core Reinette Chatre
2018-06-23 12:22 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 31/41] x86/intel_rdt: Support creation/removal of pseudo-locked region Reinette Chatre
2018-06-23 12:23 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 32/41] x86/intel_rdt: Resctrl files reflect pseudo-locked information Reinette Chatre
2018-06-23 12:23 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 33/41] x86/intel_rdt: Ensure RDT cleanup on exit Reinette Chatre
2018-06-23 12:24 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 34/41] x86/intel_rdt: Create resctrl debug area Reinette Chatre
2018-06-23 12:24 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 35/41] x86/intel_rdt: Create debugfs files for pseudo-locking testing Reinette Chatre
2018-06-23 12:25 ` [tip:x86/cache] " tip-bot for Reinette Chatre
[not found] ` <201806232005.zVl35hAb%fengguang.wu@intel.com>
2018-06-24 9:09 ` [PATCH V7 35/41] " Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 36/41] x86/intel_rdt: Create character device exposing pseudo-locked region Reinette Chatre
2018-06-23 12:25 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-24 13:39 ` tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 37/41] x86/intel_rdt: More precise L2 hit/miss measurements Reinette Chatre
2018-06-23 12:26 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-24 13:40 ` tip-bot for Reinette Chatre [this message]
2018-06-22 22:42 ` [PATCH V7 38/41] x86/intel_rdt: Support L3 cache performance event of Broadwell Reinette Chatre
2018-06-23 12:27 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-24 13:40 ` tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 39/41] x86/intel_rdt: Limit C-states dynamically when pseudo-locking active Reinette Chatre
2018-06-23 12:27 ` [tip:x86/cache] " tip-bot for Reinette Chatre
2018-06-24 13:41 ` tip-bot for Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 40/41] x86/intel_rdt: Fix passing of value to 32-bit register Reinette Chatre
2018-06-22 22:42 ` [PATCH V7 41/41] x86/intel_rdt: Simplify index type Reinette Chatre
2018-06-22 23:45 ` [PATCH V7 00/41] Intel(R) Resource Director Technology Cache Pseudo-Locking enabling David Howells
2018-06-23 0:28 ` Reinette Chatre
2018-06-23 12:16 ` Thomas Gleixner
2018-06-23 12:38 ` Thomas Gleixner
2018-06-23 22:54 ` David Howells
2018-06-24 0:30 ` Thomas Gleixner
2018-06-23 23:14 ` David Howells
2018-06-24 0:28 ` Thomas Gleixner
2018-06-24 9:20 ` Reinette Chatre
2018-06-24 9:45 ` Thomas Gleixner
2018-06-25 22:08 ` Reinette Chatre
-- strict thread matches above, loose matches on Subject: below --
2018-05-29 12:58 [PATCH V5 36/38] x86/intel_rdt: More precise L2 hit/miss measurements Reinette Chatre
2018-06-20 0:32 ` [tip:x86/cache] " tip-bot for Reinette Chatre
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=tip-8a2fc0e1bc0cd856101927188884d7c370b62188@git.kernel.org \
--to=tipbot@zytor.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-tip-commits@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=reinette.chatre@intel.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox