From: David Carrillo-Cisneros <davidcc@google.com>
To: Peter Zijlstra <peterz@infradead.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Ingo Molnar <mingo@redhat.com>
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>,
Matt Fleming <matt.fleming@intel.com>,
Tony Luck <tony.luck@intel.com>,
Stephane Eranian <eranian@google.com>,
Paul Turner <pjt@google.com>,
David Carrillo-Cisneros <davidcc@google.com>,
x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 00/32] 2nd Iteration of Cache QoS Monitoring support.
Date: Thu, 28 Apr 2016 21:43:06 -0700
Message-ID: <1461905018-86355-1-git-send-email-davidcc@google.com>
This series introduces the next iteration of kernel support for the
Cache QoS Monitoring (CQM) technology available in Intel Xeon processors.
One of the main limitations of the previous version is the inability
to simultaneously monitor:
1) a CPU event and any other event on that CPU.
2) cgroup events for cgroups in the same descendancy line.
3) cgroup events and any thread event of a cgroup in the same
descendancy line.
Another limitation is that monitoring for a cgroup was enabled/disabled by
the existence of a perf event for that cgroup. Since the event
llc_occupancy measures changes in occupancy rather than total occupancy,
in order to read meaningful llc_occupancy values, an event should be
enabled for a long enough period of time. The overhead in context switches
caused by the perf events is undesired in some sensitive scenarios.
This series of patches addresses the shortcomings mentioned above and
adds some other improvements. The main changes are:
- No more potential conflicts between different events. The new
version builds a hierarchy of RMIDs that captures the dependency
between monitored cgroups. llc_occupancy for a cgroup is the sum of
the llc_occupancies for that cgroup's RMID and all other RMIDs in the
cgroup's subtree (both monitored cgroups and threads).
- A cgroup integration that allows monitoring a cgroup without
creating a perf event, decreasing the context switch overhead.
Monitoring is controlled by a boolean cgroup subsystem attribute
in each perf cgroup, that is:
echo 1 > cgroup_path/perf_event.cqm_cont_monitoring
starts CQM monitoring whether or not there is a perf_event
attached to the cgroup. Setting the attribute to 0 makes
monitoring dependent on the existence of a perf_event.
A perf_event is always required in order to read llc_occupancy.
This cgroup integration uses Intel's PQR code and is intended to
be used by upcoming versions of Intel's CAT.
- A more stable rotation algorithm: The new algorithm uses SLOs
(service-level objectives) that guarantee:
- A minimum enabled time for monitored cgroups and
threads.
- A maximum time disabled before error is introduced by
reusing dirty RMIDs.
- A minimum rate at which RMID recycling must progress.
- Reduced impact of stealing/rotation of RMIDs: The new algorithm
accounts the residual occupancy held by a limbo RMID towards the
former owner of that RMID, decreasing the error introduced
by RMID rotation.
It also allows a limbo RMID to be reused by its former owner when
appropriate, decreasing the potential error of reusing dirty RMIDs
and allowing rotation to make progress even if most limbo RMIDs do
not drop occupancy fast enough.
- Elimination of pmu::count: perf generic's perf_event_count()
performs a quick add of atomic types. The introduction of
pmu::count in the previous CQM series to read occupancy for thread
events changed the behavior of perf_event_count() by performing a
potentially slow IPI and a write/read to an MSR. It also made
pmu::read behave differently depending on whether the event was a
cpu/cgroup event or a thread event. This patch series removes the
custom pmu::count from CQM and provides consistent behavior for all
calls of perf_event_read.
- Added error return for pmu::read: Reads to CQM events may fail
due to stealing of RMIDs, even after successfully adding an event
to a PMU. This patch series expands pmu::read with an int return
value and propagates the error to callers that can fail
(i.e. perf_read).
The ability of pmu::read to fail is consistent with the recent
changes that allow perf_event_read to fail for transactional
reading of event groups.
- Introduces the field pmu_event_flags, which contains flags set by
the PMU to signal variations on the default behavior to perf's
generic code. In this series, three flags are introduced:
- PERF_CGROUP_NO_RECURSION: Signals generic code not to add
events of the cgroup ancestors of a cgroup.
- PERF_INACTIVE_CPU_READ_PKG: Signals generic code that
this CPU event can be read on any CPU in its event::cpu's
package, even if the event is not active.
- PERF_INACTIVE_EV_READ_ANY_CPU: Signals generic code that
this event can be read in any CPU in any package in the
system even if the event is not active.
Using the above flags takes advantage of the CQM hardware's ability
to read llc_occupancy even when the associated perf event is not
running on a CPU.
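To illustrate the hierarchy-based accounting described in the first item
of the list above, here is a minimal user-space sketch. It is not code
from this series: struct mon_node, subtree_occupancy() and
example_total() are hypothetical names chosen for the example, and the
real implementation keeps per-package state that is omitted here. The
sketch only shows the idea that a cgroup's llc_occupancy is the sum of
the occupancy of its own RMID and of every RMID below it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node in the monitoring hierarchy; not a struct from the series. */
struct mon_node {
	uint64_t own_occupancy;         /* bytes attributed to this node's own RMID */
	struct mon_node *first_child;   /* first monitored cgroup or thread below */
	struct mon_node *next_sibling;  /* next node sharing the same parent */
};

/* Sum the occupancy of a node's RMID and of every RMID in its subtree. */
static uint64_t subtree_occupancy(const struct mon_node *node)
{
	const struct mon_node *child;
	uint64_t total;

	assert(node != NULL);
	total = node->own_occupancy;
	for (child = node->first_child; child; child = child->next_sibling)
		total += subtree_occupancy(child);
	return total;
}

/*
 * Example: cgroup A (100 bytes) has a child cgroup B (40 bytes), and B
 * contains one monitored thread T (10 bytes).
 */
static uint64_t example_total(void)
{
	struct mon_node t = { 10, NULL, NULL };
	struct mon_node b = { 40, &t, NULL };
	struct mon_node a = { 100, &b, NULL };

	return subtree_occupancy(&a);
}
```

In this toy tree, reading A reports the contributions of A, B and T
together, while reading B alone would report only B and T.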
This patch series also updates the perf tool to fix error handling and to
better handle the idiosyncrasies of snapshot and per-pkg events.
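The error propagation described for pmu::read can be sketched in user
space as follows. This is an assumption-laden illustration, not the
kernel code: struct sketch_event, sketch_pmu_read() and
sketch_read_value() are hypothetical, and -ENODATA is an error code
picked only for the example. The point is that a read which fails
because the RMID was stolen returns an error to the caller instead of a
stale count.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified event; the real struct perf_event is far larger. */
struct sketch_event {
	bool has_rmid;      /* false once the RMID has been stolen by rotation */
	uint64_t hw_value;  /* value the hardware would report for the RMID */
	uint64_t count;     /* last value that was read successfully */
};

/* A read callback that can fail, in the spirit of an int-returning pmu::read. */
static int sketch_pmu_read(struct sketch_event *event)
{
	if (!event->has_rmid)
		return -ENODATA;  /* error code chosen for illustration only */
	event->count = event->hw_value;
	return 0;
}

/* The caller propagates the error instead of reporting a stale count. */
static int sketch_read_value(struct sketch_event *event, uint64_t *value)
{
	int ret;

	assert(event != NULL && value != NULL);
	ret = sketch_pmu_read(event);
	if (ret)
		return ret;
	*value = event->count;
	return 0;
}
```

A tool in the position of perf stat could then treat a negative return as
"value not available" for that event rather than printing the previous
count.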
David Carrillo-Cisneros (31):
perf/x86/intel/cqm: temporarily remove MBM from CQM and cleanup
perf/x86/intel/cqm: remove check for conflicting events
perf/x86/intel/cqm: remove all code for rotation of RMIDs
perf/x86/intel/cqm: make read of RMIDs per package (Temporal)
perf/core: remove unused pmu->count
x86/intel,cqm: add CONFIG_INTEL_RDT configuration flag and refactor
PQR
perf/x86/intel/cqm: separate CQM PMU's attributes from x86 PMU
perf/x86/intel/cqm: prepare for next patches
perf/x86/intel/cqm: add per-package RMIDs, data and locks
perf/x86/intel/cqm: basic RMID hierarchy with per package rmids
perf/x86/intel/cqm: (I)state and limbo prmids
perf/x86/intel/cqm: add per-package RMID rotation
perf/x86/intel/cqm: add polled update of RMID's llc_occupancy
perf/x86/intel/cqm: add preallocation of anodes
perf/core: add hooks to expose architecture specific features in
perf_cgroup
perf/x86/intel/cqm: add cgroup support
perf/core: adding pmu::event_terminate
perf/x86/intel/cqm: use pmu::event_terminate
perf/core: introduce PMU event flag PERF_CGROUP_NO_RECURSION
x86/intel/cqm: use PERF_CGROUP_NO_RECURSION in CQM
perf/x86/intel/cqm: handle inherit event and inherit_stat flag
perf/x86/intel/cqm: introduce read_subtree
perf/core: introduce PERF_INACTIVE_*_READ_* flags
perf/x86/intel/cqm: use PERF_INACTIVE_*_READ_* flags in CQM
sched: introduce the finish_arch_pre_lock_switch() scheduler hook
perf/x86/intel/cqm: integrate CQM cgroups with scheduler
perf/core: add perf_event cgroup hooks for subsystem attributes
perf/x86/intel/cqm: add CQM attributes to perf_event cgroup
perf,perf/x86,perf/powerpc,perf/arm,perf/*: add int error return to
pmu::read
perf,perf/x86: add hook perf_event_arch_exec
perf/stat: revamp error handling for snapshot and per_pkg events
Stephane Eranian (1):
perf/stat: fix bug in handling events in error state
arch/alpha/kernel/perf_event.c | 3 +-
arch/arc/kernel/perf_event.c | 3 +-
arch/arm64/include/asm/hw_breakpoint.h | 2 +-
arch/arm64/kernel/hw_breakpoint.c | 3 +-
arch/metag/kernel/perf/perf_event.c | 5 +-
arch/mips/kernel/perf_event_mipsxx.c | 3 +-
arch/powerpc/include/asm/hw_breakpoint.h | 2 +-
arch/powerpc/kernel/hw_breakpoint.c | 3 +-
arch/powerpc/perf/core-book3s.c | 11 +-
arch/powerpc/perf/core-fsl-emb.c | 5 +-
arch/powerpc/perf/hv-24x7.c | 5 +-
arch/powerpc/perf/hv-gpci.c | 3 +-
arch/s390/kernel/perf_cpum_cf.c | 5 +-
arch/s390/kernel/perf_cpum_sf.c | 3 +-
arch/sh/include/asm/hw_breakpoint.h | 2 +-
arch/sh/kernel/hw_breakpoint.c | 3 +-
arch/sparc/kernel/perf_event.c | 2 +-
arch/tile/kernel/perf_event.c | 3 +-
arch/x86/Kconfig | 6 +
arch/x86/events/amd/ibs.c | 2 +-
arch/x86/events/amd/iommu.c | 5 +-
arch/x86/events/amd/uncore.c | 3 +-
arch/x86/events/core.c | 3 +-
arch/x86/events/intel/Makefile | 3 +-
arch/x86/events/intel/bts.c | 3 +-
arch/x86/events/intel/cqm.c | 3847 +++++++++++++++++++++---------
arch/x86/events/intel/cqm.h | 519 ++++
arch/x86/events/intel/cstate.c | 3 +-
arch/x86/events/intel/pt.c | 3 +-
arch/x86/events/intel/rapl.c | 3 +-
arch/x86/events/intel/uncore.c | 3 +-
arch/x86/events/intel/uncore.h | 2 +-
arch/x86/events/msr.c | 3 +-
arch/x86/include/asm/hw_breakpoint.h | 2 +-
arch/x86/include/asm/perf_event.h | 41 +
arch/x86/include/asm/pqr_common.h | 74 +
arch/x86/include/asm/processor.h | 4 +
arch/x86/kernel/cpu/Makefile | 4 +
arch/x86/kernel/cpu/pqr_common.c | 43 +
arch/x86/kernel/hw_breakpoint.c | 3 +-
arch/x86/kvm/pmu.h | 10 +-
drivers/bus/arm-cci.c | 3 +-
drivers/bus/arm-ccn.c | 3 +-
drivers/perf/arm_pmu.c | 3 +-
include/linux/perf_event.h | 91 +-
kernel/events/core.c | 170 +-
kernel/sched/core.c | 1 +
kernel/sched/sched.h | 3 +
kernel/trace/bpf_trace.c | 5 +-
tools/perf/builtin-stat.c | 43 +-
tools/perf/util/counts.h | 19 +
tools/perf/util/evsel.c | 44 +-
tools/perf/util/evsel.h | 8 +-
tools/perf/util/stat.c | 35 +-
54 files changed, 3746 insertions(+), 1337 deletions(-)
create mode 100644 arch/x86/events/intel/cqm.h
create mode 100644 arch/x86/include/asm/pqr_common.h
create mode 100644 arch/x86/kernel/cpu/pqr_common.c
--
2.8.0.rc3.226.g39d4020
Thread overview: 51+ messages
2016-04-29 4:43 David Carrillo-Cisneros [this message]
2016-04-29 4:43 ` [PATCH 01/32] perf/x86/intel/cqm: temporarily remove MBM from CQM and cleanup David Carrillo-Cisneros
2016-04-29 20:19 ` Vikas Shivappa
2016-04-29 4:43 ` [PATCH 02/32] perf/x86/intel/cqm: remove check for conflicting events David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 03/32] perf/x86/intel/cqm: remove all code for rotation of RMIDs David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 04/32] perf/x86/intel/cqm: make read of RMIDs per package (Temporal) David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 05/32] perf/core: remove unused pmu->count David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 06/32] x86/intel,cqm: add CONFIG_INTEL_RDT configuration flag and refactor PQR David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 07/32] perf/x86/intel/cqm: separate CQM PMU's attributes from x86 PMU David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 08/32] perf/x86/intel/cqm: prepare for next patches David Carrillo-Cisneros
2016-04-29 9:18 ` Peter Zijlstra
2016-04-29 4:43 ` [PATCH 09/32] perf/x86/intel/cqm: add per-package RMIDs, data and locks David Carrillo-Cisneros
2016-04-29 20:56 ` Vikas Shivappa
2016-04-29 4:43 ` [PATCH 10/32] perf/x86/intel/cqm: basic RMID hierarchy with per package rmids David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 11/32] perf/x86/intel/cqm: (I)state and limbo prmids David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 12/32] perf/x86/intel/cqm: add per-package RMID rotation David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 13/32] perf/x86/intel/cqm: add polled update of RMID's llc_occupancy David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 14/32] perf/x86/intel/cqm: add preallocation of anodes David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 15/32] perf/core: add hooks to expose architecture specific features in perf_cgroup David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 16/32] perf/x86/intel/cqm: add cgroup support David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 17/32] perf/core: adding pmu::event_terminate David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 18/32] perf/x86/intel/cqm: use pmu::event_terminate David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 19/32] perf/core: introduce PMU event flag PERF_CGROUP_NO_RECURSION David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 20/32] x86/intel/cqm: use PERF_CGROUP_NO_RECURSION in CQM David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 21/32] perf/x86/intel/cqm: handle inherit event and inherit_stat flag David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 22/32] perf/x86/intel/cqm: introduce read_subtree David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 23/32] perf/core: introduce PERF_INACTIVE_*_READ_* flags David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 24/32] perf/x86/intel/cqm: use PERF_INACTIVE_*_READ_* flags in CQM David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 25/32] sched: introduce the finish_arch_pre_lock_switch() scheduler hook David Carrillo-Cisneros
2016-04-29 8:52 ` Peter Zijlstra
[not found] ` <CALcN6miyq9_4GQfO9=bjFb-X_2LSQdwfWnm+KvT=UrYRCAb6Og@mail.gmail.com>
2016-04-29 18:40 ` David Carrillo-Cisneros
2016-04-29 20:21 ` Vikas Shivappa
2016-04-29 20:50 ` David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 26/32] perf/x86/intel/cqm: integrate CQM cgroups with scheduler David Carrillo-Cisneros
2016-04-29 20:25 ` Vikas Shivappa
2016-04-29 20:48 ` David Carrillo-Cisneros
2016-04-29 21:01 ` Vikas Shivappa
2016-04-29 21:26 ` David Carrillo-Cisneros
2016-04-29 21:32 ` Vikas Shivappa
2016-04-29 21:49 ` David Carrillo-Cisneros
2016-04-29 23:49 ` Vikas Shivappa
2016-04-30 17:50 ` David Carrillo-Cisneros
2016-05-02 13:22 ` Thomas Gleixner
2016-04-29 4:43 ` [PATCH 27/32] perf/core: add perf_event cgroup hooks for subsystem attributes David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 28/32] perf/x86/intel/cqm: add CQM attributes to perf_event cgroup David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 29/32] perf,perf/x86,perf/powerpc,perf/arm,perf/*: add int error return to pmu::read David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 30/32] perf,perf/x86: add hook perf_event_arch_exec David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 31/32] perf/stat: fix bug in handling events in error state David Carrillo-Cisneros
2016-04-29 4:43 ` [PATCH 32/32] perf/stat: revamp error handling for snapshot and per_pkg events David Carrillo-Cisneros
2016-04-29 21:06 ` [PATCH 00/32] 2nd Iteration of Cache QoS Monitoring support Vikas Shivappa
2016-04-29 21:10 ` David Carrillo-Cisneros