linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] tools/perf/x86: Use alternative format for AMD raw events
@ 2021-11-11 12:56 Sandipan Das
  2021-11-14 16:41 ` Jiri Olsa
  0 siblings, 1 reply; 9+ messages in thread
From: Sandipan Das @ 2021-11-11 12:56 UTC (permalink / raw)
  To: acme
  Cc: ravi.bangoria, ananth.narayan, kim.phillips, rrichter,
	linux-perf-users, jolsa, namhyung, alexander.shishkin,
	mark.rutland

AMD Zen processors have events with codes that are larger
than a byte. The EventSelect bits of the PERF_CTL MSRs are
not contiguous either and that can affect usability.

E.g. the core pmu has events with 12-bit event codes.

  $ cat /sys/bus/event_source/devices/cpu/format/event
  config:0-7,32-35

  $ cat /sys/bus/event_source/devices/cpu/format/umask
  config:8-15

The perf man pages describe the raw event format as:

  "(eventsel+umask) in the form of rNNN where NNN is a
   hexadecimal event descriptor."

Assume U and E are used to represent nibbles of unit mask
and event code respectively. For basic usage, most events
are usually encoded in the UUEE format (eventsel+umask) and
it works out fine since it matches the PERF_CTL register's
EventSelect and UnitMask bits.

For the "ic_tag_hit_miss.instruction_cache_accesses" event,
which has event code 0x18e and umask 0x1f, a simple combo
of eventsel+umask in UUEEE format no longer works. Instead,
it has to be programmed as:

  $ sudo perf stat -e r100001f8e ...

Although a user might find it easier to program it in the
UUEEE format as:

  $ sudo perf stat -e r1f18e ...

This adds a quirk to convert a UUEEE format encoded raw
event to the native register format for AMD processors.
This currently works only for events targeting the core
PMU. The PMU type and bitfield positions are hard-coded
right now although they should ideally be dervied from
perf_pmu_format structs.

The UUEEE format comes with some issues though.

  [1] Enabling edge-detect (bit 18 in PERF_CTL) is not
      possible since bits 19:16 are being used to specify
      the upper nibble of umask instead. Other settings
      like OS/User mode capture (bits 17:16) are unaffected
      as they are set via exclude_{user,kernel} and not via
      config within perf_event_attr. For pmu syntax, it is
      still possible to set this with "edge=1".

  [2] There is no way to distinguish between UEEE and UUEE
      and it will be assumed to be UEEE in such cases. To
      use events which could originally be programmed as
      UUEE, one has to use UU0EE instead. Alternatively,
      with some minor tweaks, UUEE can be the preferred
      format but then events with 4-bit umask and 12-bit
      event code i.e. UEEE will break unless specified
      in Exxxx0UEE format.

The effect of the changes introduced here can be seen below.
All data was captured on a system with an AMD EPYC 7713
processor.

Before:

  $ sudo perf --debug perf-event-open stat -e ic_tag_hit_miss.all_instruction_cache_accesses sleep 1
  ------------------------------------------------------------
  perf_event_attr:
    type                             4
    size                             128
    config                           0x100001f8e
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  [...]

  $ sudo perf --debug perf-event-open stat -e r1f18e sleep 1
  ------------------------------------------------------------
  perf_event_attr:
    type                             4
    size                             128
    config                           0x1f18e
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  [...]

  $ sudo perf --debug perf-event-open stat -e r100001f8e sleep 1
  ------------------------------------------------------------
  perf_event_attr:
    type                             4
    size                             128
    config                           0x100001f8e
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  [...]

After:

  $ sudo perf --debug perf-event-open stat -e ic_tag_hit_miss.all_instruction_cache_accesses sleep 1
  ------------------------------------------------------------
  perf_event_attr:
    type                             4
    size                             128
    config                           0x100001f8e
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  [...]

  $ sudo perf --debug perf-event-open stat -e r1f18e sleep 1
  ------------------------------------------------------------
  perf_event_attr:
    type                             4
    size                             128
    config                           0x100001f8e
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  [...]

  $ sudo perf --debug perf-event-open stat -e r100001f8e sleep 1
  ------------------------------------------------------------
  perf_event_attr:
    type                             4
    size                             128
    config                           0x100001f8e
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  [...]

It is understandable that raw events must follow the native
PERF_CTL register format but it is worthwhile to pursue such
alternative formats for usability?

UUEEE is something that works for the core pmu events. Data
fabric events have a 14-bit EventSelect field and will need
more work.

  $ cat /sys/bus/event_source/devices/amd_df/format/event
  config:0-7,32-35,59-60

More details on registers and events can be found in the
AMD Processor Programming Reference (PPR). The one relevant
to the EPYC 7713 processor used here can be found at:

https://bugzilla.kernel.org/attachment.cgi?id=296015

Infomration on PERF_CTL MSRs start at page 185 and PMU
events start at page 445.

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
---
 tools/perf/arch/x86/util/Build          |  1 +
 tools/perf/arch/x86/util/parse-events.c | 54 +++++++++++++++++++++++++
 tools/perf/util/parse-events.c          | 13 ++++++
 tools/perf/util/parse-events.h          |  4 ++
 4 files changed, 72 insertions(+)
 create mode 100644 tools/perf/arch/x86/util/parse-events.c

diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
index dbeb04cb336e..c6bddc5eea88 100644
--- a/tools/perf/arch/x86/util/Build
+++ b/tools/perf/arch/x86/util/Build
@@ -10,6 +10,7 @@ perf-y += evlist.o
 perf-y += mem-events.o
 perf-y += evsel.o
 perf-y += iostat.o
+perf-y += parse-events.o
 
 perf-$(CONFIG_DWARF) += dwarf-regs.o
 perf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
diff --git a/tools/perf/arch/x86/util/parse-events.c b/tools/perf/arch/x86/util/parse-events.c
new file mode 100644
index 000000000000..d06e833c6d44
--- /dev/null
+++ b/tools/perf/arch/x86/util/parse-events.c
@@ -0,0 +1,54 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/string.h>
+#include "util/env.h"
+#include "util/evlist.h"
+#include "util/parse-events.h"
+
+void arch_parse_events_fixup_raw(struct parse_events_state *parse_state,
+				 int type,
+				 struct perf_event_attr *attr)
+{
+	struct perf_env *env = parse_state->evlist->env;
+
+	/*
+	 * AMD Zen processors can have event codes and unit masks larger than
+	 * a byte; transform attr->config to resemble the PERF_CTLx registers
+	 */
+	if (strstarts(perf_env__cpuid(env), "AuthenticAMD")) {
+		switch (type) {
+		case 4:	/* cpu */
+		{
+			u64 event = attr->config & 0xfff;
+			u64 umask = attr->config & 0xff000;
+
+			/*
+			 * Skip in case of ExxxxxxUUEE, UUEE, EE and E formats
+			 * Choose UEEE over UUEE, use UU0EE instead for latter
+			 * Choose UUEEE over edge+UUEE, if desired edge-detect
+			 * can be set via pmu syntax with "edge=1"
+			 */
+			if (attr->config > 0xfffff || attr->config < 0x100)
+				return;
+
+			/*
+			 * Choose EEE over UEE format, use U0EE instead for
+			 * latter and if desired unit mask can be set via pmu
+			 * syntax with "umask=..."
+			 */
+			if (attr->config > 0xff && attr->config < 0x1000)
+				umask = 0;
+
+			/*
+			 * Move the upper nibble of the EventSelect field from
+			 * bits 19:16 to bits 35:32 and shift the Umask field
+			 * right by a nibble.
+			 */
+			attr->config = ((event & 0xf00) << 24) | (umask >> 4) | (event & 0xff);
+			break;
+		}
+
+		default:
+			return;
+		}
+	}
+}
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 5bfb6f892489..7179ed39d757 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1437,6 +1437,9 @@ int parse_events_add_numeric(struct parse_events_state *parse_state,
 	attr.type = type;
 	attr.config = config;
 
+	if (attr.type == PERF_TYPE_RAW && attr.config)
+		arch_parse_events_fixup_raw(parse_state, attr.type, &attr);
+
 	if (head_config) {
 		if (config_attr(&attr, head_config, parse_state->error,
 				config_term_common))
@@ -1608,6 +1611,10 @@ int parse_events_add_pmu(struct parse_events_state *parse_state,
 		return 0;
 	}
 
+	/* For raw pmu events, attr->config is already set at this point */
+	if (attr.config)
+		arch_parse_events_fixup_raw(parse_state, pmu->type, &attr);
+
 	if (!parse_state->fake_pmu && perf_pmu__config(pmu, &attr, head_config, parse_state->error)) {
 		free_config_terms(&config_terms);
 		return -EINVAL;
@@ -3402,3 +3409,9 @@ struct evsel *parse_events__add_event_hybrid(struct list_head *list, int *idx,
 			   pmu, config_terms, /*auto_merge_stats=*/false,
 			   /*cpu_list=*/NULL);
 }
+
+void __weak arch_parse_events_fixup_raw(struct parse_events_state *parse_state __maybe_unused,
+					int type __maybe_unused,
+					struct perf_event_attr *attr __maybe_unused)
+{
+}
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index c7fc93f54577..1f183903c586 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -218,6 +218,10 @@ void parse_events_update_lists(struct list_head *list_event,
 void parse_events_evlist_error(struct parse_events_state *parse_state,
 			       int idx, const char *str);
 
+void arch_parse_events_fixup_raw(struct parse_events_state *parse_state,
+				 int type,
+				 struct perf_event_attr *attr);
+
 void print_events(const char *event_glob, bool name_only, bool quiet,
 		  bool long_desc, bool details_flag, bool deprecated,
 		  const char *pmu_name);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-11-28 15:39 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-11-11 12:56 [RFC PATCH] tools/perf/x86: Use alternative format for AMD raw events Sandipan Das
2021-11-14 16:41 ` Jiri Olsa
2021-11-15  9:58   ` Sandipan Das
2021-11-16  4:38     ` kajoljain
2021-11-16 16:01       ` Kim Phillips
2021-11-17 14:13         ` kajoljain
2021-11-18  7:28         ` Sandipan Das
2021-11-18 16:51           ` Kim Phillips
2021-11-28 15:36             ` Jiri Olsa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).