From: David Carrillo-Cisneros <davidcc@google.com>
To: Peter Zijlstra <peterz@infradead.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Ingo Molnar <mingo@redhat.com>
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>,
Matt Fleming <matt@codeblueprint.co.uk>,
Tony Luck <tony.luck@intel.com>,
Stephane Eranian <eranian@google.com>,
Paul Turner <pjt@google.com>,
David Carrillo-Cisneros <davidcc@google.com>,
x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 25/32] perf/x86/intel/cqm: integrate CQM cgroups with scheduler
Date: Wed, 11 May 2016 16:02:25 -0700 [thread overview]
Message-ID: <1463007752-116802-26-git-send-email-davidcc@google.com> (raw)
In-Reply-To: <1463007752-116802-1-git-send-email-davidcc@google.com>
Allow monitored cgroups to update the PQR MSR during task switch even
without an associated perf_event. When a CQM perf_event exists for next
task, event's RMID takes precedence over the cgroup RMID, otherwise, the
cgroup's RMID is used. To discern the two cases, pqr_update_rmid is called
in one of two modes:
- PQR_RMID_MODE_NOEVENT: A RMID that do not correspond to an event.
e.g. the RMID of the root pmonr when no event is scheduled.
This is set by pmu:del in perf_event.
- PQR_RMID_MODE_EVENT: A RMID used by an event. Set during pmu::add
unset on pmu::del. This mode prevents from using a non-event
cgroup RMID.
The cgroup sched in code is called when the last RMID has mode
PQR_RMID_MODE_NOEVENT.
Reviewed-by: Stephane Eranian <eranian@google.com>
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
---
arch/x86/events/intel/cqm.c | 49 ++++++++++++++++++++++++++++++++++++---
arch/x86/include/asm/pqr_common.h | 47 +++++++++++++++++++++++++++++++------
arch/x86/include/asm/processor.h | 3 +++
arch/x86/kernel/cpu/pqr_common.c | 16 +++++++++++++
4 files changed, 105 insertions(+), 10 deletions(-)
diff --git a/arch/x86/events/intel/cqm.c b/arch/x86/events/intel/cqm.c
index 9038112..5928bdb 100644
--- a/arch/x86/events/intel/cqm.c
+++ b/arch/x86/events/intel/cqm.c
@@ -2514,9 +2514,8 @@ static inline void __intel_cqm_event_start(
{
if (!(event->hw.state & PERF_HES_STOPPED))
return;
-
event->hw.state &= ~PERF_HES_STOPPED;
- pqr_update_rmid(summary.sched_rmid);
+ pqr_update_rmid(summary.sched_rmid, PQR_RMID_MODE_EVENT);
}
static void intel_cqm_event_start(struct perf_event *event, int mode)
@@ -2546,7 +2545,7 @@ static void intel_cqm_event_stop(struct perf_event *event, int mode)
/* Occupancy of CQM events is obtained at read. No need to read
* when event is stopped since read on inactive cpus succeed.
*/
- pqr_update_rmid(summary.sched_rmid);
+ pqr_update_rmid(summary.sched_rmid, PQR_RMID_MODE_NOEVENT);
}
static int intel_cqm_event_add(struct perf_event *event, int mode)
@@ -2963,6 +2962,7 @@ static void intel_cqm_cpu_starting(unsigned int cpu)
u16 pkg_id = topology_physical_package_id(cpu);
state->rmid = 0;
+ state->rmid_mode = PQR_RMID_MODE_NOEVENT;
state->closid = 0;
/* XXX: lock */
@@ -3145,6 +3145,12 @@ static int __init intel_cqm_init(void)
pr_info("Intel CQM monitoring enabled with at least %u rmids per package.\n",
min_max_rmid + 1);
+ /* Make sure pqr_common_enable_key is enabled after
+ * cqm_initialized_key.
+ */
+ barrier();
+
+ static_branch_enable(&pqr_common_enable_key);
return ret;
error_init_mutex:
@@ -3156,4 +3162,41 @@ error:
return ret;
}
+/* Schedule task without a CQM perf_event. */
+inline void __intel_cqm_no_event_sched_in(void)
+{
+#ifdef CONFIG_CGROUP_PERF
+ struct monr *monr;
+ struct pmonr *pmonr;
+ union prmid_summary summary;
+ u16 pkg_id = topology_physical_package_id(smp_processor_id());
+ struct pmonr *root_pmonr = monr_hrchy_root->pmonrs[pkg_id];
+
+ /* Assume CQM enabled is likely given that PQR is enabled. */
+ if (!static_branch_likely(&cqm_initialized_key))
+ return;
+
+ /* Safe to call from_task since we are in scheduler lock. */
+ monr = monr_from_perf_cgroup(perf_cgroup_from_task(current, NULL));
+ pmonr = monr->pmonrs[pkg_id];
+
+ /* Utilize most up to date pmonr summary. */
+ monr_hrchy_get_next_prmid_summary(pmonr);
+ summary.value = atomic64_read(&pmonr->prmid_summary_atomic);
+
+ if (!prmid_summary__is_mon_active(summary))
+ goto no_rmid;
+
+ if (WARN_ON_ONCE(!__valid_rmid(pkg_id, summary.sched_rmid)))
+ goto no_rmid;
+
+ pqr_update_rmid(summary.sched_rmid, PQR_RMID_MODE_NOEVENT);
+ return;
+
+no_rmid:
+ summary.value = atomic64_read(&root_pmonr->prmid_summary_atomic);
+ pqr_update_rmid(summary.sched_rmid, PQR_RMID_MODE_NOEVENT);
+#endif
+}
+
device_initcall(intel_cqm_init);
diff --git a/arch/x86/include/asm/pqr_common.h b/arch/x86/include/asm/pqr_common.h
index 854febe..0af04d2 100644
--- a/arch/x86/include/asm/pqr_common.h
+++ b/arch/x86/include/asm/pqr_common.h
@@ -3,6 +3,7 @@
#if defined(CONFIG_INTEL_RDT)
+#include <linux/jump_label.h>
#include <linux/types.h>
#include <asm/percpu.h>
#include <asm/msr.h>
@@ -10,35 +11,67 @@
#define MSR_IA32_PQR_ASSOC 0x0c8f
#define INVALID_RMID (-1)
+#define INVALID_CLOSID (-1)
+
+
+extern struct static_key_false pqr_common_enable_key;
+
+enum intel_pqr_rmid_mode {
+ /* RMID has no perf_event associated. */
+ PQR_RMID_MODE_NOEVENT = 0,
+ /* RMID has a perf_event associated. */
+ PQR_RMID_MODE_EVENT
+};
/**
* struct intel_pqr_state - State cache for the PQR MSR
- * @rmid: The cached Resource Monitoring ID
- * @closid: The cached Class Of Service ID
+ * @rmid: Last RMID written to hw.
+ * @rmid_mode: Next RMID's mode.
+ * @closid: The current Class Of Service ID
*
* The upper 32 bits of MSR_IA32_PQR_ASSOC contain closid and the
* lower 10 bits rmid. The update to MSR_IA32_PQR_ASSOC always
* contains both parts, so we need to cache them.
*
- * The cache also helps to avoid pointless updates if the value does
- * not change.
+ * The cache also helps to avoid pointless updates if the value does not
+ * change. It also keeps track of the type of RMID set (event vs no event)
+ * used to determine when a cgroup RMID is required.
*/
struct intel_pqr_state {
- u32 rmid;
- u32 closid;
+ u32 rmid;
+ enum intel_pqr_rmid_mode rmid_mode;
+ u32 closid;
};
DECLARE_PER_CPU(struct intel_pqr_state, pqr_state);
-static inline void pqr_update_rmid(u32 rmid)
+static inline void pqr_update_rmid(u32 rmid, enum intel_pqr_rmid_mode mode)
{
struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
+ state->rmid_mode = mode;
+
if (state->rmid == rmid)
return;
state->rmid = rmid;
wrmsr(MSR_IA32_PQR_ASSOC, rmid, state->closid);
}
+void __pqr_ctx_switch(void);
+
+inline void __intel_cqm_no_event_sched_in(void);
+
+static inline void pqr_ctx_switch(void)
+{
+ if (static_branch_unlikely(&pqr_common_enable_key))
+ __pqr_ctx_switch();
+}
+
+#else
+
+static inline void pqr_ctx_switch(void)
+{
+}
+
#endif
#endif
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index c85fd82..da1d56f 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -842,4 +842,7 @@ bool xen_set_default_idle(void);
void stop_this_cpu(void *dummy);
void df_debug(struct pt_regs *regs, long error_code);
+
+#define finish_arch_pre_lock_switch pqr_ctx_switch
+
#endif /* _ASM_X86_PROCESSOR_H */
diff --git a/arch/x86/kernel/cpu/pqr_common.c b/arch/x86/kernel/cpu/pqr_common.c
index dc6debc..e36702f 100644
--- a/arch/x86/kernel/cpu/pqr_common.c
+++ b/arch/x86/kernel/cpu/pqr_common.c
@@ -6,3 +6,19 @@
* must ensure interruptions are properly handled.
*/
DEFINE_PER_CPU(struct intel_pqr_state, pqr_state);
+
+DEFINE_STATIC_KEY_FALSE(pqr_common_enable_key);
+
+/* Update hw's RMID using cgroup's if perf_event did not.
+ * Sync pqr cache with MSR.
+ */
+inline void __pqr_ctx_switch(void)
+{
+ struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
+
+ /* If perf_event did set rmid that is used, do not try
+ * to obtain another one from current task.
+ */
+ if (state->rmid_mode == PQR_RMID_MODE_NOEVENT)
+ __intel_cqm_no_event_sched_in();
+}
--
2.8.0.rc3.226.g39d4020
next prev parent reply other threads:[~2016-05-11 23:05 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-05-11 23:02 [PATCH v2 00/32] 2nd Iteration of Cache QoS Monitoring support David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 01/32] perf/x86/intel/cqm: remove previous version of CQM and MBM David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 02/32] perf/x86/intel/cqm: software cache for MSR_IA32_PQR_ASSOC David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 03/32] x86/intel,cqm: add CONFIG_INTEL_RDT configuration flag David Carrillo-Cisneros
2016-05-18 17:30 ` Thomas Gleixner
2016-05-11 23:02 ` [PATCH v2 04/32] perf/x86/intel/cqm: add constants for CQM David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 05/32] perf/x86/intel/cqm: encapsulate per-package RMIDs David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 06/32] perf/x86/intel/cqm: add per-package RMIDs, data and locks David Carrillo-Cisneros
2016-05-18 16:08 ` Thomas Gleixner
2016-05-11 23:02 ` [PATCH v2 07/32] perf/x86/intel/cqm: add helpers for per-package locking David Carrillo-Cisneros
2016-05-18 17:35 ` Thomas Gleixner
2016-05-18 19:09 ` Thomas Gleixner
2016-05-11 23:02 ` [PATCH v2 08/32] perf/x86/intel/cqm: add pmu sysfs attribute David Carrillo-Cisneros
2016-05-18 17:38 ` Thomas Gleixner
2016-05-11 23:02 ` [PATCH v2 09/32] perf/x86/intel/cqm: basic RMID hierarchy with per package RMIDs David Carrillo-Cisneros
2016-05-18 19:51 ` Thomas Gleixner
2016-05-11 23:02 ` [PATCH v2 10/32] perf/x86/intel/cqm: introduce (I)state and limbo prmids David Carrillo-Cisneros
2016-05-18 20:36 ` Thomas Gleixner
2016-05-25 0:52 ` David Carrillo-Cisneros
2016-05-25 8:51 ` Thomas Gleixner
2016-05-11 23:02 ` [PATCH v2 11/32] perf/x86/intel/cqm: add per-package RMID rotation David Carrillo-Cisneros
2016-05-18 21:37 ` Thomas Gleixner
2016-05-24 21:01 ` David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 12/32] perf/x86/intel/cqm: schedule work for rotation task David Carrillo-Cisneros
2016-05-18 20:41 ` Thomas Gleixner
2016-05-11 23:02 ` [PATCH v2 13/32] perf/x86/intel/cqm: add polled update of RMID's llc_occupancy David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 14/32] perf/x86/intel/cqm: add preallocation of anodes David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 15/32] perf/core: add hooks to expose architecture specific features in perf_cgroup David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 16/32] perf/x86/intel/cqm: add cgroup support David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 17/32] perf/core,perf/x86/intel/cqm: add pmu::event_terminate David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 18/32] perf/core: introduce PMU event flag PERF_CGROUP_NO_RECURSION David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 19/32] x86/intel/cqm: use PERF_CGROUP_NO_RECURSION in CQM David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 20/32] perf/x86/intel/cqm: handle inherit event and inherit_stat flag David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 21/32] perf/x86/intel/cqm: introduce read_subtree David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 22/32] perf/core: introduce PERF_INACTIVE_*_READ_* flags David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 23/32] perf/x86/intel/cqm: use PERF_INACTIVE_*_READ_* flags in CQM David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 24/32] sched: introduce the finish_arch_pre_lock_switch() scheduler hook David Carrillo-Cisneros
2016-05-11 23:02 ` David Carrillo-Cisneros [this message]
2016-05-11 23:02 ` [PATCH v2 26/32] perf/x86/intel/cqm: make one write of PQR_ASSOC per ctx switch David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 27/32] perf/core: add perf_event cgroup hooks for subsystem attributes David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 28/32] perf/x86/intel/cqm: add CQM attributes to perf_event cgroup David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 29/32] perf,perf/x86,perf/powerpc,perf/arm,perf/*: add int error return to pmu::read David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 30/32] perf,perf/x86: add hook perf_event_arch_exec David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 31/32] perf/stat: fix bug in handling events in error state David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 32/32] perf/stat: revamp read error handling, snapshot and per_pkg events David Carrillo-Cisneros
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1463007752-116802-26-git-send-email-davidcc@google.com \
--to=davidcc@google.com \
--cc=acme@kernel.org \
--cc=alexander.shishkin@linux.intel.com \
--cc=eranian@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=matt@codeblueprint.co.uk \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=pjt@google.com \
--cc=tony.luck@intel.com \
--cc=vikas.shivappa@linux.intel.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.