public inbox for linux-kernel@vger.kernel.org
* [PATCH v2 00/32] 2nd Iteration of Cache QoS Monitoring support.
@ 2016-05-11 23:02 David Carrillo-Cisneros
  2016-05-11 23:02 ` [PATCH v2 01/32] perf/x86/intel/cqm: remove previous version of CQM and MBM David Carrillo-Cisneros
                   ` (31 more replies)
  0 siblings, 32 replies; 45+ messages in thread
From: David Carrillo-Cisneros @ 2016-05-11 23:02 UTC
  To: Peter Zijlstra, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar
  Cc: Vikas Shivappa, Matt Fleming, Tony Luck, Stephane Eranian,
	Paul Turner, David Carrillo-Cisneros, x86, linux-kernel

This series introduces the next iteration of kernel support for the
Cache QoS Monitoring (CQM) technology available in Intel Xeon processors.

One of the main limitations of the previous version is the inability
to simultaneously monitor:
  1) a cpu event and any other event on that cpu.
  2) cgroup events for cgroups in the same descendancy line.
  3) cgroup events and any thread event of a cgroup in the same
     descendancy line.

Another limitation is that monitoring for a cgroup was enabled/disabled by
the existence of a perf event for that cgroup. Since the event
llc_occupancy measures changes in occupancy rather than total occupancy,
in order to read meaningful llc_occupancy values, an event should be
enabled for a long enough period of time. The context switch overhead
caused by the perf events is undesirable in some sensitive scenarios.

This series of patches addresses the shortcomings mentioned above and
adds some other improvements. The main changes are:
	- No more potential conflicts between different events. The new
	version builds a hierarchy of RMIDs that captures the dependency
	between monitored cgroups. llc_occupancy for a cgroup is the sum of
	the llc_occupancies for that cgroup's RMID and all other RMIDs in
	the cgroup's subtree (both monitored cgroups and threads).

	- A cgroup integration that allows monitoring a cgroup without
	creating a perf event, decreasing the context switch overhead.
	Monitoring is controlled by a boolean cgroup subsystem attribute
	in each perf cgroup, that is:

		echo 1 > cgroup_path/perf_event.cqm_cont_monitoring

	starts CQM monitoring whether or not there is a perf_event
	attached to the cgroup. Setting the attribute to 0 makes
	monitoring dependent on the existence of a perf_event.
	A perf_event is always required in order to read llc_occupancy.
	This cgroup integration uses Intel's PQR code and is intended to
	be used by upcoming versions of Intel's CAT.
	
	- A more stable rotation algorithm: The new algorithm uses SLOs
	that guarantee:
		- A minimum of enabled time for monitored cgroups and
		threads.
		- A maximum time disabled before error is introduced by
		reusing dirty RMIDs.
		- A minimum rate at which RMID recycling must progress.

	- Reduced impact of stealing/rotation of RMIDs: The new algorithm
	attributes the residual occupancy held by limbo RMIDs to the
	former owner of the limbo RMID, decreasing the error introduced
	by RMID rotation.
	It also allows a limbo RMID to be reused by its former owner when
	appropriate, decreasing the potential error of reusing dirty RMIDs
	and allowing rotation to make progress even if most limbo RMIDs
	do not drop occupancy fast enough.

	- Elimination of pmu::count: perf generic's perf_event_count()
	performs a quick add of atomic types. The introduction of
	pmu::count in the previous CQM series to read occupancy for thread
	events changed the behavior of perf_event_count() by performing a
	potentially slow IPI and write/read to MSR. It also made pmu::read
	behave differently depending on whether the event was a
	cpu/cgroup event or a thread event. This patch series removes the
	custom pmu::count from CQM and provides a consistent behavior for
	all calls of perf_event_read.

	- Added error return for pmu::read: Reads to CQM events may fail
	due to stealing of RMIDs, even after successfully adding an event
	to a PMU. This patch series expands pmu::read with an int return
	value and propagates the error to callers that can fail
	(i.e. perf_read).
	The ability of pmu::read to fail is consistent with the recent
	changes that allow perf_event_read to fail for transactional
	reading of event groups.

	- Introduces the field pmu_event_flags that contains flags set by
	the PMU to signal variations on the default behavior to perf's
	generic code. In this series, three flags are introduced:
		- PERF_CGROUP_NO_RECURSION: Signals generic code not to add
		events of the cgroup ancestors of a cgroup.
		- PERF_INACTIVE_CPU_READ_PKG: Signals generic code that
		this CPU event can be read in any CPU in its event::cpu's
		package, even if the event is not active.
		- PERF_INACTIVE_EV_READ_ANY_CPU: Signals generic code that
		this event can be read in any CPU in any package in the
		system even if the event is not active.
	Using the above flags takes advantage of the CQM hardware's
	ability to read llc_occupancy even when the associated perf event
	is not running in a CPU.
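
For illustration only, the read rules described by the two
PERF_INACTIVE_* flags could be sketched as follows. The flag names come
from this cover letter; the bit values, struct sketch_event and
can_read_inactive() are hypothetical and are not taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag names come from the series; the bit values are placeholders. */
#define PERF_CGROUP_NO_RECURSION       (1u << 0)
#define PERF_INACTIVE_CPU_READ_PKG     (1u << 1)
#define PERF_INACTIVE_EV_READ_ANY_CPU  (1u << 2)

/* Hypothetical, trimmed-down view of an event for this sketch. */
struct sketch_event {
	unsigned int pmu_event_flags;
	int pkg;	/* package that contains the event's CPU */
};

/* Can an inactive event be read from a CPU located in reader_pkg? */
static bool can_read_inactive(const struct sketch_event *ev, int reader_pkg)
{
	assert(ev != NULL);

	/* Readable from any CPU in any package. */
	if (ev->pmu_event_flags & PERF_INACTIVE_EV_READ_ANY_CPU)
		return true;

	/* Readable only from CPUs in the event's own package. */
	if (ev->pmu_event_flags & PERF_INACTIVE_CPU_READ_PKG)
		return reader_pkg == ev->pkg;

	/* No flag set: an inactive event cannot be read. */
	return false;
}
```

In this sketch an event without either flag keeps the default behavior,
so only PMUs that opt in see a change.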

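As a rough sketch of the RMID hierarchy item above, the llc_occupancy of
a cgroup can be pictured as a sum over its monitored subtree. The struct
mon_node layout and subtree_occupancy() below are assumptions made for
this example and do not mirror the data structures in the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node: one per monitored cgroup or thread. */
struct mon_node {
	uint64_t occupancy;		/* bytes held by this node's own RMID */
	struct mon_node *child;		/* first node in the monitored subtree */
	struct mon_node *sibling;	/* next node with the same parent */
};

/* llc_occupancy of a node: its own RMID plus every RMID below it. */
static uint64_t subtree_occupancy(const struct mon_node *node)
{
	const struct mon_node *c;
	uint64_t total;

	assert(node != NULL);
	total = node->occupancy;
	for (c = node->child; c != NULL; c = c->sibling)
		total += subtree_occupancy(c);
	return total;
}
```

Because each node owns a distinct RMID, a parent and a descendant can be
monitored at the same time without sharing a counter.
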
This patch series also updates the perf tool to fix error handling and to
better handle the idiosyncrasies of snapshot and per-pkg events.
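
A minimal sketch of the error propagation that an int-returning
pmu::read makes possible, assuming a simplified event and PMU. All names
prefixed with sketch_ and the choice of -ENODATA are hypothetical; the
patches themselves define the real signatures and error codes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_event;

/* Hypothetical PMU ops table: read reports failure instead of void. */
struct sketch_pmu {
	int (*read)(struct sketch_event *ev);
};

struct sketch_event {
	const struct sketch_pmu *pmu;
	int has_rmid;		/* 0 once rotation has stolen the RMID */
	uint64_t hw_value;	/* stand-in for the value hardware reports */
	uint64_t count;		/* last value read successfully */
};

/* Fails when the event currently has no RMID to read from. */
static int sketch_cqm_read(struct sketch_event *ev)
{
	if (!ev->has_rmid)
		return -ENODATA;	/* error code chosen for illustration */
	ev->count = ev->hw_value;
	return 0;
}

/* Caller side: return the error instead of reporting a stale count. */
static int sketch_event_read(struct sketch_event *ev, uint64_t *value)
{
	int ret;

	assert(ev != NULL && value != NULL);
	ret = ev->pmu->read(ev);
	if (ret)
		return ret;
	*value = ev->count;
	return 0;
}
```

A caller that receives a negative value leaves its previous result
untouched and can report the read as failed.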


Changes in 2nd version:
  - As requested by Peter Z., redo commit history to completely remove
    old version of CQM in a single patch.
  - Use topology_max_packages and fix build errors reported by
    Vikas Shivappa.
  - Split largest patches, clean up.
  - Rebased to peterz/queue perf/core.


David Carrillo-Cisneros (31):
  perf/x86/intel/cqm: remove previous version of CQM and MBM
  perf/x86/intel/cqm: software cache for MSR_IA32_PQR_ASSOC
  x86/intel,cqm: add CONFIG_INTEL_RDT configuration flag
  perf/x86/intel/cqm: add constants for CQM
  perf/x86/intel/cqm: encapsulate per-package RMIDs
  perf/x86/intel/cqm: add per-package RMIDs, data and locks
  perf/x86/intel/cqm: add helpers for per-package locking
  perf/x86/intel/cqm: add pmu sysfs attribute
  perf/x86/intel/cqm: basic RMID hierarchy with per package RMIDs
  perf/x86/intel/cqm: introduce (I)state and limbo prmids
  perf/x86/intel/cqm: add per-package RMID rotation
  perf/x86/intel/cqm: schedule work for rotation task
  perf/x86/intel/cqm: add polled update of RMID's llc_occupancy
  perf/x86/intel/cqm: add preallocation of anodes
  perf/core: add hooks to expose architecture specific features in
    perf_cgroup
  perf/x86/intel/cqm: add cgroup support
  perf/core,perf/x86/intel/cqm: add pmu::event_terminate
  perf/core: introduce PMU event flag PERF_CGROUP_NO_RECURSION
  x86/intel/cqm: use PERF_CGROUP_NO_RECURSION in CQM
  perf/x86/intel/cqm: handle inherit event and inherit_stat flag
  perf/x86/intel/cqm: introduce read_subtree
  perf/core: introduce PERF_INACTIVE_*_READ_* flags
  perf/x86/intel/cqm: use PERF_INACTIVE_*_READ_* flags in CQM
  sched: introduce the finish_arch_pre_lock_switch() scheduler hook
  perf/x86/intel/cqm: integrate CQM cgroups with scheduler
  perf/x86/intel/cqm: make one write of PQR_ASSOC per ctx switch
  perf/core: add perf_event cgroup hooks for subsystem attributes
  perf/x86/intel/cqm: add CQM attributes to perf_event cgroup
  perf,perf/x86,perf/powerpc,perf/arm,perf/*: add int error return to
    pmu::read
  perf,perf/x86: add hook perf_event_arch_exec
  perf/stat: revamp read error handling, snapshot and per_pkg events

Stephane Eranian (1):
  perf/stat: fix bug in handling events in error state

 arch/alpha/kernel/perf_event.c           |    3 +-
 arch/arc/kernel/perf_event.c             |    3 +-
 arch/arm64/include/asm/hw_breakpoint.h   |    2 +-
 arch/arm64/kernel/hw_breakpoint.c        |    3 +-
 arch/metag/kernel/perf/perf_event.c      |    5 +-
 arch/mips/kernel/perf_event_mipsxx.c     |    3 +-
 arch/powerpc/include/asm/hw_breakpoint.h |    2 +-
 arch/powerpc/kernel/hw_breakpoint.c      |    3 +-
 arch/powerpc/perf/core-book3s.c          |   11 +-
 arch/powerpc/perf/core-fsl-emb.c         |    5 +-
 arch/powerpc/perf/hv-24x7.c              |    5 +-
 arch/powerpc/perf/hv-gpci.c              |    3 +-
 arch/s390/kernel/perf_cpum_cf.c          |    5 +-
 arch/s390/kernel/perf_cpum_sf.c          |    3 +-
 arch/sh/include/asm/hw_breakpoint.h      |    2 +-
 arch/sh/kernel/hw_breakpoint.c           |    3 +-
 arch/sparc/kernel/perf_event.c           |    2 +-
 arch/tile/kernel/perf_event.c            |    3 +-
 arch/x86/Kconfig                         |    7 +
 arch/x86/events/amd/ibs.c                |    2 +-
 arch/x86/events/amd/iommu.c              |    5 +-
 arch/x86/events/amd/uncore.c             |    3 +-
 arch/x86/events/core.c                   |    3 +-
 arch/x86/events/intel/Makefile           |    3 +-
 arch/x86/events/intel/bts.c              |    3 +-
 arch/x86/events/intel/cqm.c              | 3842 +++++++++++++++++++++---------
 arch/x86/events/intel/cqm.h              |  532 +++++
 arch/x86/events/intel/cstate.c           |    3 +-
 arch/x86/events/intel/pt.c               |    3 +-
 arch/x86/events/intel/rapl.c             |    3 +-
 arch/x86/events/intel/uncore.c           |    3 +-
 arch/x86/events/intel/uncore.h           |    2 +-
 arch/x86/events/msr.c                    |    3 +-
 arch/x86/include/asm/hw_breakpoint.h     |    2 +-
 arch/x86/include/asm/perf_event.h        |   44 +
 arch/x86/include/asm/pqr_common.h        |   84 +
 arch/x86/include/asm/processor.h         |    4 +
 arch/x86/kernel/cpu/Makefile             |    4 +
 arch/x86/kernel/cpu/pqr_common.c         |   33 +
 arch/x86/kernel/hw_breakpoint.c          |    3 +-
 arch/x86/kvm/pmu.h                       |   10 +-
 drivers/bus/arm-cci.c                    |    3 +-
 drivers/bus/arm-ccn.c                    |    3 +-
 drivers/perf/arm_pmu.c                   |    3 +-
 include/linux/perf_event.h               |   92 +-
 kernel/events/core.c                     |  160 +-
 kernel/sched/core.c                      |    1 +
 kernel/sched/sched.h                     |    3 +
 kernel/trace/bpf_trace.c                 |    5 +-
 tools/perf/builtin-stat.c                |   43 +-
 tools/perf/util/counts.h                 |   19 +
 tools/perf/util/evsel.c                  |   44 +-
 tools/perf/util/evsel.h                  |    8 +-
 tools/perf/util/stat.c                   |   35 +-
 54 files changed, 3760 insertions(+), 1326 deletions(-)
 create mode 100644 arch/x86/events/intel/cqm.h
 create mode 100644 arch/x86/include/asm/pqr_common.h
 create mode 100644 arch/x86/kernel/cpu/pqr_common.c

-- 
2.8.0.rc3.226.g39d4020

end of thread, other threads:[~2016-05-25  8:53 UTC | newest]

Thread overview: 45+ messages
2016-05-11 23:02 [PATCH v2 00/32] 2nd Iteration of Cache QoS Monitoring support David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 01/32] perf/x86/intel/cqm: remove previous version of CQM and MBM David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 02/32] perf/x86/intel/cqm: software cache for MSR_IA32_PQR_ASSOC David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 03/32] x86/intel,cqm: add CONFIG_INTEL_RDT configuration flag David Carrillo-Cisneros
2016-05-18 17:30   ` Thomas Gleixner
2016-05-11 23:02 ` [PATCH v2 04/32] perf/x86/intel/cqm: add constants for CQM David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 05/32] perf/x86/intel/cqm: encapsulate per-package RMIDs David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 06/32] perf/x86/intel/cqm: add per-package RMIDs, data and locks David Carrillo-Cisneros
2016-05-18 16:08   ` Thomas Gleixner
2016-05-11 23:02 ` [PATCH v2 07/32] perf/x86/intel/cqm: add helpers for per-package locking David Carrillo-Cisneros
2016-05-18 17:35   ` Thomas Gleixner
2016-05-18 19:09     ` Thomas Gleixner
2016-05-11 23:02 ` [PATCH v2 08/32] perf/x86/intel/cqm: add pmu sysfs attribute David Carrillo-Cisneros
2016-05-18 17:38   ` Thomas Gleixner
2016-05-11 23:02 ` [PATCH v2 09/32] perf/x86/intel/cqm: basic RMID hierarchy with per package RMIDs David Carrillo-Cisneros
2016-05-18 19:51   ` Thomas Gleixner
2016-05-11 23:02 ` [PATCH v2 10/32] perf/x86/intel/cqm: introduce (I)state and limbo prmids David Carrillo-Cisneros
2016-05-18 20:36   ` Thomas Gleixner
2016-05-25  0:52     ` David Carrillo-Cisneros
2016-05-25  8:51       ` Thomas Gleixner
2016-05-11 23:02 ` [PATCH v2 11/32] perf/x86/intel/cqm: add per-package RMID rotation David Carrillo-Cisneros
2016-05-18 21:37   ` Thomas Gleixner
2016-05-24 21:01     ` David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 12/32] perf/x86/intel/cqm: schedule work for rotation task David Carrillo-Cisneros
2016-05-18 20:41   ` Thomas Gleixner
2016-05-11 23:02 ` [PATCH v2 13/32] perf/x86/intel/cqm: add polled update of RMID's llc_occupancy David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 14/32] perf/x86/intel/cqm: add preallocation of anodes David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 15/32] perf/core: add hooks to expose architecture specific features in perf_cgroup David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 16/32] perf/x86/intel/cqm: add cgroup support David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 17/32] perf/core,perf/x86/intel/cqm: add pmu::event_terminate David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 18/32] perf/core: introduce PMU event flag PERF_CGROUP_NO_RECURSION David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 19/32] x86/intel/cqm: use PERF_CGROUP_NO_RECURSION in CQM David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 20/32] perf/x86/intel/cqm: handle inherit event and inherit_stat flag David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 21/32] perf/x86/intel/cqm: introduce read_subtree David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 22/32] perf/core: introduce PERF_INACTIVE_*_READ_* flags David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 23/32] perf/x86/intel/cqm: use PERF_INACTIVE_*_READ_* flags in CQM David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 24/32] sched: introduce the finish_arch_pre_lock_switch() scheduler hook David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 25/32] perf/x86/intel/cqm: integrate CQM cgroups with scheduler David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 26/32] perf/x86/intel/cqm: make one write of PQR_ASSOC per ctx switch David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 27/32] perf/core: add perf_event cgroup hooks for subsystem attributes David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 28/32] perf/x86/intel/cqm: add CQM attributes to perf_event cgroup David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 29/32] perf,perf/x86,perf/powerpc,perf/arm,perf/*: add int error return to pmu::read David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 30/32] perf,perf/x86: add hook perf_event_arch_exec David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 31/32] perf/stat: fix bug in handling events in error state David Carrillo-Cisneros
2016-05-11 23:02 ` [PATCH v2 32/32] perf/stat: revamp read error handling, snapshot and per_pkg events David Carrillo-Cisneros
