* [RFC/PATCHSET 00/10] perf tools: Do not set attr.exclude_guest by default (v3)
@ 2024-09-05 20:24 Namhyung Kim
2024-09-05 20:24 ` [PATCH 01/10] perf tools: Add fallback for exclude_guest Namhyung Kim
` (9 more replies)
0 siblings, 10 replies; 23+ messages in thread
From: Namhyung Kim @ 2024-09-05 20:24 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang
Hello,
I found perf tools set exclude_guest bit inconsistently. It used to
set the bit but now the default event for perf record doesn't. So I'm
wondering why we want the bit in the first place.
Actually it's not good for PMUs don't support any exclusion like AMD
IBS because it disables new features after the exclude_guest due to
the missing feature detection logic.
v3 changes)
* move exclude_guest fallback to the front
* fix precise_max handling on AMD
* simplify the default event for perf record
v2) https://lore.kernel.org/lkml/20240904064131.2377873-1-namhyung@kernel.org/
* update the missing feature detection logic
* separate exclude_hv fallback
* add new fallback for exclude_guest
v1) https://lore.kernel.org/lkml/20240902014621.2002343-1-namhyung@kernel.org/
AFAIK it doesn't matter for the most cases but perf kvm. If users
need to set the bit, they can still use :H modifier. For vPMU pass-
through or Apple M1, it'd add the exclude_guest during the fallback
logic.
Also the kernel feature detection logic should be separated from the
exclude bit tests since it depends on the PMU implementation rather
than the core kernel features. So I changed it to use a software
event for the detection and factor out some hw-specific checks.
The code is available at 'perf/exclude-v3' branch in
git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
Thanks,
Namhyung
Namhyung Kim (10):
perf tools: Add fallback for exclude_guest
perf tools: Don't set attr.exclude_guest by default
perf tools: Simplify evsel__add_modifier()
perf stat: Add --exclude-guest option
perf tools: Do not set exclude_guest for precise_ip
perf tools: Detect missing kernel features properly
perf tools: Separate exclude_hv fallback
perf tools: Move x86__is_amd_cpu() to util/env.c
perf tools: Check fallback error and order
perf record: Just use "cycles:P" as the default event
tools/perf/Documentation/perf-stat.txt | 7 +
tools/perf/arch/x86/util/Build | 1 -
tools/perf/arch/x86/util/env.c | 19 -
tools/perf/arch/x86/util/env.h | 7 -
tools/perf/arch/x86/util/pmu.c | 2 +-
tools/perf/builtin-kvm.c | 1 +
tools/perf/builtin-record.c | 4 +-
tools/perf/builtin-stat.c | 5 +-
tools/perf/dlfilters/dlfilter-test-api-v0.c | 2 +-
tools/perf/dlfilters/dlfilter-test-api-v2.c | 2 +-
tools/perf/tests/attr/test-record-dummy-C0 | 2 +-
tools/perf/tests/parse-events.c | 30 +-
tools/perf/util/env.c | 23 ++
tools/perf/util/env.h | 4 +
tools/perf/util/evsel.c | 410 +++++++++++++++-----
tools/perf/util/evsel.h | 1 -
tools/perf/util/parse-events.c | 6 +-
tools/perf/util/util.c | 10 +-
tools/perf/util/util.h | 3 +
19 files changed, 371 insertions(+), 168 deletions(-)
delete mode 100644 tools/perf/arch/x86/util/env.c
delete mode 100644 tools/perf/arch/x86/util/env.h
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 01/10] perf tools: Add fallback for exclude_guest
2024-09-05 20:24 [RFC/PATCHSET 00/10] perf tools: Do not set attr.exclude_guest by default (v3) Namhyung Kim
@ 2024-09-05 20:24 ` Namhyung Kim
2024-09-06 13:47 ` Liang, Kan
2024-09-05 20:24 ` [PATCH 02/10] perf tools: Don't set attr.exclude_guest by default Namhyung Kim
` (8 subsequent siblings)
9 siblings, 1 reply; 23+ messages in thread
From: Namhyung Kim @ 2024-09-05 20:24 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang, James Clark
Commit 7b100989b4f6bce70 ("perf evlist: Remove __evlist__add_default")
changed to parse "cycles:P" event instead of creating a new cycles
event for perf record. But it also changed the way how modifiers are
handled so it doesn't set the exclude_guest bit by default.
It seems Apple M1 PMU requires exclude_guest set and returns EOPNOTSUPP
if not. Let's add a fallback so that it can work with default events.
Fixes: 7b100989b4f6bce70 ("perf evlist: Remove __evlist__add_default")
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/builtin-stat.c | 3 +--
tools/perf/util/evsel.c | 21 +++++++++++++++++++++
2 files changed, 22 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index cf985cdb9a6ee588..d8315dae930184ba 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -639,8 +639,7 @@ static enum counter_recovery stat_handle_error(struct evsel *counter)
* (behavior changed with commit b0a873e).
*/
if (errno == EINVAL || errno == ENOSYS ||
- errno == ENOENT || errno == EOPNOTSUPP ||
- errno == ENXIO) {
+ errno == ENOENT || errno == ENXIO) {
if (verbose > 0)
ui__warning("%s event is not supported by the kernel.\n",
evsel__name(counter));
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 49cc71511c0c8ce8..d59ad76b28758906 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -3244,6 +3244,27 @@ bool evsel__fallback(struct evsel *evsel, struct target *target, int err,
evsel->core.attr.exclude_kernel = 1;
evsel->core.attr.exclude_hv = 1;
+ return true;
+ } else if (err == EOPNOTSUPP && !evsel->core.attr.exclude_guest &&
+ !evsel->exclude_GH) {
+ const char *name = evsel__name(evsel);
+ char *new_name;
+ const char *sep = ":";
+
+ /* Is there already the separator in the name. */
+ if (strchr(name, '/') ||
+ (strchr(name, ':') && !evsel->is_libpfm_event))
+ sep = "";
+
+ if (asprintf(&new_name, "%s%sH", name, sep) < 0)
+ return false;
+
+ free(evsel->name);
+ evsel->name = new_name;
+ /* Apple M1 requires exclude_guest */
+ scnprintf(msg, msgsize, "trying to fall back to excluding guest samples");
+ evsel->core.attr.exclude_guest = 1;
+
return true;
}
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 02/10] perf tools: Don't set attr.exclude_guest by default
2024-09-05 20:24 [RFC/PATCHSET 00/10] perf tools: Do not set attr.exclude_guest by default (v3) Namhyung Kim
2024-09-05 20:24 ` [PATCH 01/10] perf tools: Add fallback for exclude_guest Namhyung Kim
@ 2024-09-05 20:24 ` Namhyung Kim
2024-09-06 14:10 ` Liang, Kan
2024-09-05 20:24 ` [PATCH 03/10] perf tools: Simplify evsel__add_modifier() Namhyung Kim
` (7 subsequent siblings)
9 siblings, 1 reply; 23+ messages in thread
From: Namhyung Kim @ 2024-09-05 20:24 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang, James Clark
The exclude_guest in the event attribute is to limit profiling in the
host environment. But I'm not sure why we want to set it by default
cause we don't care about it in most cases and I feel like it just
makes new PMU implementation complicated.
Of course it's useful for perf kvm command so I added the
exclude_GH_default variable to preserve the old behavior for perf kvm
and other commands like perf record and stat won't set the exclude bit.
This is helpful for AMD IBS case since having exclude_guest bit will
clear new feature bit due to the missing feature check logic.
$ sysctl kernel.perf_event_paranoid
kernel.perf_event_paranoid = 0
$ perf record -W -e ibs_op// -vv true 2>&1 | grep switching
switching off PERF_FORMAT_LOST support
switching off weight struct support
switching off bpf_event
switching off ksymbol
switching off cloexec flag
switching off mmap2
switching off exclude_guest, exclude_host
Intestingly, I found it sets the exclude_bit if "u" modifier is used.
I don't know why but it's neither intuitive nor consistent. Let's
remove the bit there too.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/builtin-kvm.c | 1 +
tools/perf/tests/attr/test-record-dummy-C0 | 2 +-
tools/perf/tests/parse-events.c | 18 +++++++++---------
tools/perf/util/parse-events.c | 2 +-
tools/perf/util/util.c | 10 ++++++++--
tools/perf/util/util.h | 3 +++
6 files changed, 23 insertions(+), 13 deletions(-)
diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 55ea17c5ff02acf7..5e32d4e78fdf2424 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -2147,6 +2147,7 @@ int cmd_kvm(int argc, const char **argv)
"buildid-list", "stat", NULL };
const char *kvm_usage[] = { NULL, NULL };
+ exclude_HG_default = true;
perf_host = 0;
perf_guest = 1;
diff --git a/tools/perf/tests/attr/test-record-dummy-C0 b/tools/perf/tests/attr/test-record-dummy-C0
index 576ec48b3aafaa6a..8ce6f0a5df5b7013 100644
--- a/tools/perf/tests/attr/test-record-dummy-C0
+++ b/tools/perf/tests/attr/test-record-dummy-C0
@@ -37,7 +37,7 @@ precise_ip=0
mmap_data=0
sample_id_all=1
exclude_host=0
-exclude_guest=1
+exclude_guest=0
exclude_callchain_kernel=0
exclude_callchain_user=0
mmap2=1
diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index 639e65a9bf61801a..9077d581c0d05e62 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -933,7 +933,7 @@ static int test__group2(struct evlist *evlist)
TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
- TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
+ TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
@@ -948,7 +948,7 @@ static int test__group2(struct evlist *evlist)
TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
- TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
+ TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
if (evsel__has_leader(evsel, leader))
@@ -1073,7 +1073,7 @@ static int test__group3(struct evlist *evlist __maybe_unused)
TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
- TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
+ TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
@@ -1223,7 +1223,7 @@ static int test__group5(struct evlist *evlist __maybe_unused)
TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv);
- TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
+ TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
@@ -1438,7 +1438,7 @@ static int test__leader_sample1(struct evlist *evlist)
TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv);
- TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
+ TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
@@ -1454,7 +1454,7 @@ static int test__leader_sample1(struct evlist *evlist)
TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv);
- TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
+ TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", evsel__has_leader(evsel, leader));
@@ -1469,7 +1469,7 @@ static int test__leader_sample1(struct evlist *evlist)
TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv);
- TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
+ TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
@@ -1498,7 +1498,7 @@ static int test__leader_sample2(struct evlist *evlist __maybe_unused)
TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
- TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
+ TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
@@ -1514,7 +1514,7 @@ static int test__leader_sample2(struct evlist *evlist __maybe_unused)
TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
- TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
+ TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index f5eb1af8302c7bac..8d9e2d4b73822d98 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1761,7 +1761,7 @@ static int parse_events__modifier_list(struct parse_events_state *parse_state,
if (mod.user) {
if (!exclude)
exclude = eu = ek = eh = 1;
- if (!exclude_GH && !perf_guest)
+ if (!exclude_GH && !perf_guest && exclude_HG_default)
eG = 1;
eu = 0;
}
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 9d55a13787ce3c05..7e3159faaa1991df 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -78,17 +78,23 @@ bool sysctl__nmi_watchdog_enabled(void)
bool test_attr__enabled;
+bool exclude_HG_default;
+
bool perf_host = true;
bool perf_guest = false;
void event_attr_init(struct perf_event_attr *attr)
{
+ /* to capture ABI version */
+ attr->size = sizeof(*attr);
+
+ if (!exclude_HG_default)
+ return;
+
if (!perf_host)
attr->exclude_host = 1;
if (!perf_guest)
attr->exclude_guest = 1;
- /* to capture ABI version */
- attr->size = sizeof(*attr);
}
int mkdir_p(char *path, mode_t mode)
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index 9966c21aaf048479..d33ae883a54f2f2f 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -21,6 +21,9 @@ extern const char perf_more_info_string[];
extern const char *input_name;
+/* This will control if perf_{host,guest} will set attr.exclude_{host,guest}. */
+extern bool exclude_HG_default;
+
extern bool perf_host;
extern bool perf_guest;
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 03/10] perf tools: Simplify evsel__add_modifier()
2024-09-05 20:24 [RFC/PATCHSET 00/10] perf tools: Do not set attr.exclude_guest by default (v3) Namhyung Kim
2024-09-05 20:24 ` [PATCH 01/10] perf tools: Add fallback for exclude_guest Namhyung Kim
2024-09-05 20:24 ` [PATCH 02/10] perf tools: Don't set attr.exclude_guest by default Namhyung Kim
@ 2024-09-05 20:24 ` Namhyung Kim
2024-09-05 20:24 ` [PATCH 04/10] perf stat: Add --exclude-guest option Namhyung Kim
` (6 subsequent siblings)
9 siblings, 0 replies; 23+ messages in thread
From: Namhyung Kim @ 2024-09-05 20:24 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang
Since it doesn't set the exclude_guest, no need to special handle the
bit and simply show only if one of host or guest bit is set. Now the
default event name might not have :H prefix anymore so change the
dlfilter test not to compare the ":" at the end.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/dlfilters/dlfilter-test-api-v0.c | 2 +-
tools/perf/dlfilters/dlfilter-test-api-v2.c | 2 +-
tools/perf/util/evsel.c | 5 +----
3 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/tools/perf/dlfilters/dlfilter-test-api-v0.c b/tools/perf/dlfilters/dlfilter-test-api-v0.c
index 4083b1abeaabe605..4ca2d7b2ea6c8200 100644
--- a/tools/perf/dlfilters/dlfilter-test-api-v0.c
+++ b/tools/perf/dlfilters/dlfilter-test-api-v0.c
@@ -220,7 +220,7 @@ static int check_sample(struct filter_data *d, const struct perf_dlfilter_sample
CHECK_SAMPLE(raw_callchain_nr);
CHECK(!sample->raw_callchain);
-#define EVENT_NAME "branches:"
+#define EVENT_NAME "branches"
CHECK(!strncmp(sample->event, EVENT_NAME, strlen(EVENT_NAME)));
return 0;
diff --git a/tools/perf/dlfilters/dlfilter-test-api-v2.c b/tools/perf/dlfilters/dlfilter-test-api-v2.c
index 32ff619e881caa50..00d73a16c4fdaece 100644
--- a/tools/perf/dlfilters/dlfilter-test-api-v2.c
+++ b/tools/perf/dlfilters/dlfilter-test-api-v2.c
@@ -235,7 +235,7 @@ static int check_sample(struct filter_data *d, const struct perf_dlfilter_sample
CHECK_SAMPLE(raw_callchain_nr);
CHECK(!sample->raw_callchain);
-#define EVENT_NAME "branches:"
+#define EVENT_NAME "branches"
CHECK(!strncmp(sample->event, EVENT_NAME, strlen(EVENT_NAME)));
return 0;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index d59ad76b28758906..dadcaf6e37063dfa 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -548,7 +548,6 @@ static int evsel__add_modifiers(struct evsel *evsel, char *bf, size_t size)
{
int colon = 0, r = 0;
struct perf_event_attr *attr = &evsel->core.attr;
- bool exclude_guest_default = false;
#define MOD_PRINT(context, mod) do { \
if (!attr->exclude_##context) { \
@@ -560,17 +559,15 @@ static int evsel__add_modifiers(struct evsel *evsel, char *bf, size_t size)
MOD_PRINT(kernel, 'k');
MOD_PRINT(user, 'u');
MOD_PRINT(hv, 'h');
- exclude_guest_default = true;
}
if (attr->precise_ip) {
if (!colon)
colon = ++r;
r += scnprintf(bf + r, size - r, "%.*s", attr->precise_ip, "ppp");
- exclude_guest_default = true;
}
- if (attr->exclude_host || attr->exclude_guest == exclude_guest_default) {
+ if (attr->exclude_host || attr->exclude_guest) {
MOD_PRINT(host, 'H');
MOD_PRINT(guest, 'G');
}
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 04/10] perf stat: Add --exclude-guest option
2024-09-05 20:24 [RFC/PATCHSET 00/10] perf tools: Do not set attr.exclude_guest by default (v3) Namhyung Kim
` (2 preceding siblings ...)
2024-09-05 20:24 ` [PATCH 03/10] perf tools: Simplify evsel__add_modifier() Namhyung Kim
@ 2024-09-05 20:24 ` Namhyung Kim
2024-09-06 14:33 ` Liang, Kan
2024-09-05 20:24 ` [PATCH 05/10] perf tools: Do not set exclude_guest for precise_ip Namhyung Kim
` (5 subsequent siblings)
9 siblings, 1 reply; 23+ messages in thread
From: Namhyung Kim @ 2024-09-05 20:24 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang
This option is to support the old behavior of setting exclude_guest by
default. Now it doesn't set the bit so users want the old behavior can
use this option.
$ perf stat true
Performance counter stats for 'true':
0.86 msec task-clock:u # 0.443 CPUs utilized
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
49 page-faults:u # 56.889 K/sec
...
$ perf stat --exclude-guest true
Performance counter stats for 'true':
0.79 msec task-clock:Hu # 0.490 CPUs utilized
0 context-switches:Hu # 0.000 /sec
0 cpu-migrations:Hu # 0.000 /sec
49 page-faults:Hu # 62.078 K/sec
...
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/Documentation/perf-stat.txt | 7 +++++++
tools/perf/builtin-stat.c | 2 ++
2 files changed, 9 insertions(+)
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 2bc06367248691dd..d28d8370a856598f 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -382,6 +382,13 @@ color the metric's computed value.
Don't print output, warnings or messages. This is useful with perf stat
record below to only write data to the perf.data file.
+--exclude-guest::
+Don't count event in the guest mode. It was the old behavior but the
+default is changed to count guest events also. Use this option if you
+want the old behavior (host only). Note that this option needs to be
+before other events in case you added -e/--event option in the command
+line.
+
STAT RECORD
-----------
Stores stat data into perf data file.
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index d8315dae930184ba..4d47675af5cc3094 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2491,6 +2491,8 @@ int cmd_stat(int argc, const char **argv)
OPT_BOOLEAN_FLAG(0, "all-user", &stat_config.all_user,
"Configure all used events to run in user space.",
PARSE_OPT_EXCLUSIVE),
+ OPT_BOOLEAN(0, "exclude-guest", &exclude_HG_default,
+ "Don't count events in the guest mode"),
OPT_BOOLEAN(0, "percore-show-thread", &stat_config.percore_show_thread,
"Use with 'percore' event qualifier to show the event "
"counts of one hardware thread by sum up total hardware "
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 05/10] perf tools: Do not set exclude_guest for precise_ip
2024-09-05 20:24 [RFC/PATCHSET 00/10] perf tools: Do not set attr.exclude_guest by default (v3) Namhyung Kim
` (3 preceding siblings ...)
2024-09-05 20:24 ` [PATCH 04/10] perf stat: Add --exclude-guest option Namhyung Kim
@ 2024-09-05 20:24 ` Namhyung Kim
2024-09-05 20:24 ` [PATCH 06/10] perf tools: Detect missing kernel features properly Namhyung Kim
` (4 subsequent siblings)
9 siblings, 0 replies; 23+ messages in thread
From: Namhyung Kim @ 2024-09-05 20:24 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang
It seems perf sets the exclude_guest bit because of Intel PEBS
implementation which uses a virtual address. IIUC now kernel disables
PEBS when it goes to the guest mode regardless of this bit so we don't
need to set it explicitly. At least for the other archs/vendors.
I found the commit 1342798cc13e set the exclude_guest for precise_ip
in the tool and the commit 20b279ddb38c added kernel side enforcement
which was reverted by commit a706d965dcfd later.
Actually it doesn't set the exclude_guest for the default event
(cycles:P) already.
$ grep -m1 vendor /proc/cpuinfo
vendor_id : GenuineIntel
$ perf record -e cycles:P true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (9 samples) ]
$ perf evlist -v | tr ',' '\n' | grep -e exclude -e precise
precise_ip: 3
But having lower 'p' modifier set the bit for some reason.
$ perf record -e cycles:pp true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (9 samples) ]
$ perf evlist -v | tr ',' '\n' | grep -e exclude -e precise
precise_ip: 2
exclude_guest: 1
Actually AMD IBS suffers from this because it doesn't support excludes
and having this bit effectively disables new features in the current
implementation (due to the missing feature check).
$ grep -m1 vendor /proc/cpuinfo
vendor_id : AuthenticAMD
$ perf record -W -e cycles:p -vv true 2>&1 | grep switching
switching off PERF_FORMAT_LOST support
switching off weight struct support
switching off bpf_event
switching off ksymbol
switching off cloexec flag
switching off mmap2
switching off exclude_guest, exclude_host
By not setting exclude_guest, we can fix this inconsistency and the
troubles.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/tests/parse-events.c | 12 ++++--------
tools/perf/util/parse-events.c | 4 ----
2 files changed, 4 insertions(+), 12 deletions(-)
diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index 9077d581c0d05e62..087bb13ed76ba7c5 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -899,8 +899,7 @@ static int test__group1(struct evlist *evlist)
TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
- /* use of precise requires exclude_guest */
- TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
+ TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", evsel->core.attr.precise_ip == 2);
TEST_ASSERT_VAL("wrong leader", evsel__has_leader(evsel, leader));
@@ -1017,9 +1016,8 @@ static int test__group3(struct evlist *evlist __maybe_unused)
TEST_ASSERT_VAL("wrong exclude_kernel",
!evsel->core.attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
- /* use of precise requires exclude_guest */
TEST_ASSERT_VAL("wrong exclude guest",
- evsel->core.attr.exclude_guest);
+ !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host",
!evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip",
@@ -1104,8 +1102,7 @@ static int test__group4(struct evlist *evlist __maybe_unused)
TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
- /* use of precise requires exclude_guest */
- TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
+ TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", evsel->core.attr.precise_ip == 1);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
@@ -1123,8 +1120,7 @@ static int test__group4(struct evlist *evlist __maybe_unused)
TEST_ASSERT_VAL("wrong exclude_user", evsel->core.attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
- /* use of precise requires exclude_guest */
- TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
+ TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", evsel->core.attr.precise_ip == 2);
TEST_ASSERT_VAL("wrong leader", evsel__has_leader(evsel, leader));
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 8d9e2d4b73822d98..d5dcd47ce0b18c13 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1754,10 +1754,6 @@ static int parse_events__modifier_list(struct parse_events_state *parse_state,
int exclude = eu | ek | eh;
int exclude_GH = group ? evsel->exclude_GH : 0;
- if (mod.precise) {
- /* use of precise requires exclude_guest */
- eG = 1;
- }
if (mod.user) {
if (!exclude)
exclude = eu = ek = eh = 1;
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 06/10] perf tools: Detect missing kernel features properly
2024-09-05 20:24 [RFC/PATCHSET 00/10] perf tools: Do not set attr.exclude_guest by default (v3) Namhyung Kim
` (4 preceding siblings ...)
2024-09-05 20:24 ` [PATCH 05/10] perf tools: Do not set exclude_guest for precise_ip Namhyung Kim
@ 2024-09-05 20:24 ` Namhyung Kim
2024-09-05 20:24 ` [PATCH 07/10] perf tools: Separate exclude_hv fallback Namhyung Kim
` (3 subsequent siblings)
9 siblings, 0 replies; 23+ messages in thread
From: Namhyung Kim @ 2024-09-05 20:24 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang
The evsel__detect_missing_features() is to check if the attributes of
the evsel is supported or not. But it checks the attribute based on the
given evsel, it might miss something if the attr doesn't have the bit or
give incorrect results if the event is special.
Also it maintains the order of the feature that was added to the kernel
which means it can assume older features should be supported once it
detects the current feature is working. To minimized the confusion and
to accurately check the kernel features, I think it's better to use a
software event and go through all the features at once.
Also make the function static since it's only used in evsel.c.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/util/evsel.c | 345 +++++++++++++++++++++++++++++-----------
tools/perf/util/evsel.h | 1 -
2 files changed, 249 insertions(+), 97 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index dadcaf6e37063dfa..1a4f52767942e5ad 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -20,6 +20,7 @@
#include <linux/zalloc.h>
#include <sys/ioctl.h>
#include <sys/resource.h>
+#include <sys/syscall.h>
#include <sys/types.h>
#include <dirent.h>
#include <stdlib.h>
@@ -2135,120 +2136,272 @@ int evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
return err;
}
-bool evsel__detect_missing_features(struct evsel *evsel)
+static bool has_attr_feature(struct perf_event_attr *attr, unsigned long flags)
{
+ int fd = syscall(SYS_perf_event_open, attr, /*pid=*/0, /*cpu=*/-1,
+ /*group_fd=*/-1, flags);
+ close(fd);
+
+ if (fd < 0) {
+ attr->exclude_kernel = 1;
+
+ fd = syscall(SYS_perf_event_open, attr, /*pid=*/0, /*cpu=*/-1,
+ /*group_fd=*/-1, flags);
+ close(fd);
+ }
+
+ if (fd < 0) {
+ attr->exclude_hv = 1;
+
+ fd = syscall(SYS_perf_event_open, attr, /*pid=*/0, /*cpu=*/-1,
+ /*group_fd=*/-1, flags);
+ close(fd);
+ }
+
+ if (fd < 0) {
+ attr->exclude_guest = 1;
+
+ fd = syscall(SYS_perf_event_open, attr, /*pid=*/0, /*cpu=*/-1,
+ /*group_fd=*/-1, flags);
+ close(fd);
+ }
+
+ attr->exclude_kernel = 0;
+ attr->exclude_guest = 0;
+ attr->exclude_hv = 0;
+
+ return fd >= 0;
+}
+
+static void evsel__detect_missing_brstack_features(struct evsel *evsel)
+{
+ static bool detection_done = false;
+ struct perf_event_attr attr = {
+ .type = evsel->core.attr.type,
+ .config = evsel->core.attr.config,
+ .disabled = 1,
+ .sample_type = PERF_SAMPLE_BRANCH_STACK,
+ .sample_period = 1000,
+ };
+ int old_errno;
+
+ if (detection_done)
+ return;
+
+ old_errno = errno;
+
/*
* Must probe features in the order they were added to the
- * perf_event_attr interface.
+ * perf_event_attr interface. These are PMU specific limitation
+ * so we can detect with the given hardware event and stop on the
+ * first one succeeded.
*/
- if (!perf_missing_features.branch_counters &&
- (evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_COUNTERS)) {
- perf_missing_features.branch_counters = true;
- pr_debug2("switching off branch counters support\n");
- return true;
- } else if (!perf_missing_features.read_lost &&
- (evsel->core.attr.read_format & PERF_FORMAT_LOST)) {
- perf_missing_features.read_lost = true;
- pr_debug2("switching off PERF_FORMAT_LOST support\n");
+
+ /* Please add new feature detection here. */
+
+ attr.branch_sample_type = PERF_SAMPLE_BRANCH_COUNTERS;
+ if (has_attr_feature(&attr, /*flags=*/0))
+ goto found;
+ perf_missing_features.branch_counters = true;
+ pr_debug2("switching off branch counters support\n");
+
+ attr.branch_sample_type = PERF_SAMPLE_BRANCH_HW_INDEX;
+ if (has_attr_feature(&attr, /*flags=*/0))
+ goto found;
+ perf_missing_features.branch_hw_idx = true;
+ pr_debug2("switching off branch HW index support\n");
+
+ attr.branch_sample_type = PERF_SAMPLE_BRANCH_NO_CYCLES | PERF_SAMPLE_BRANCH_NO_FLAGS;
+ if (has_attr_feature(&attr, /*flags=*/0))
+ goto found;
+ perf_missing_features.lbr_flags = true;
+ pr_debug2_peo("switching off branch sample type no (cycles/flags)\n");
+
+found:
+ detection_done = true;
+ errno = old_errno;
+}
+
+static bool evsel__detect_missing_features(struct evsel *evsel)
+{
+ static bool detection_done = false;
+ struct perf_event_attr attr = {
+ .type = PERF_TYPE_SOFTWARE,
+ .config = PERF_COUNT_SW_TASK_CLOCK,
+ .disabled = 1,
+ };
+ int old_errno;
+
+ if (evsel__has_br_stack(evsel))
+ evsel__detect_missing_brstack_features(evsel);
+
+ if (detection_done)
+ goto check;
+
+ old_errno = errno;
+
+ /*
+ * Must probe features in the order they were added to the
+ * perf_event_attr interface. These are kernel core limitation
+ * not PMU-specific so we can detect with a software event and
+ * stop on the first one succeeded.
+ */
+
+ /* Please add new feature detection here. */
+
+ attr.read_format = PERF_FORMAT_LOST;
+ if (has_attr_feature(&attr, /*flags=*/0))
+ goto found;
+ perf_missing_features.read_lost = true;
+ pr_debug2("switching off PERF_FORMAT_LOST support\n");
+ attr.read_format = 0;
+
+ attr.sample_type = PERF_SAMPLE_WEIGHT_STRUCT;
+ if (has_attr_feature(&attr, /*flags=*/0))
+ goto found;
+ perf_missing_features.weight_struct = true;
+ pr_debug2("switching off weight struct support\n");
+ attr.sample_type = 0;
+
+ attr.sample_type = PERF_SAMPLE_CODE_PAGE_SIZE;
+ if (has_attr_feature(&attr, /*flags=*/0))
+ goto found;
+ perf_missing_features.code_page_size = true;
+ pr_debug2_peo("Kernel has no PERF_SAMPLE_CODE_PAGE_SIZE support\n");
+ attr.sample_type = 0;
+
+ attr.sample_type = PERF_SAMPLE_DATA_PAGE_SIZE;
+ if (has_attr_feature(&attr, /*flags=*/0))
+ goto found;
+ perf_missing_features.data_page_size = true;
+ pr_debug2_peo("Kernel has no PERF_SAMPLE_DATA_PAGE_SIZE support\n");
+ attr.sample_type = 0;
+
+ attr.cgroup = 1;
+ if (has_attr_feature(&attr, /*flags=*/0))
+ goto found;
+ perf_missing_features.cgroup = true;
+ pr_debug2_peo("Kernel has no cgroup sampling support\n");
+ attr.cgroup = 0;
+
+ attr.aux_output = 1;
+ if (has_attr_feature(&attr, /*flags=*/0))
+ goto found;
+ perf_missing_features.aux_output = true;
+ pr_debug2_peo("Kernel has no attr.aux_output support\n");
+ attr.aux_output = 0;
+
+ attr.bpf_event = 1;
+ if (has_attr_feature(&attr, /*flags=*/0))
+ goto found;
+ perf_missing_features.bpf = true;
+ pr_debug2_peo("switching off bpf_event\n");
+ attr.bpf_event = 0;
+
+ attr.ksymbol = 1;
+ if (has_attr_feature(&attr, /*flags=*/0))
+ goto found;
+ perf_missing_features.ksymbol = true;
+ pr_debug2_peo("switching off ksymbol\n");
+ attr.ksymbol = 0;
+
+ attr.write_backward = 1;
+ if (has_attr_feature(&attr, /*flags=*/0))
+ goto found;
+ perf_missing_features.write_backward = true;
+ pr_debug2_peo("switching off write_backward\n");
+ attr.write_backward = 0;
+
+ attr.use_clockid = 1;
+ attr.clockid = CLOCK_MONOTONIC;
+ if (has_attr_feature(&attr, /*flags=*/0))
+ goto found;
+ perf_missing_features.clockid = true;
+ pr_debug2_peo("switching off clockid\n");
+ attr.use_clockid = 0;
+ attr.clockid = 0;
+
+ if (has_attr_feature(&attr, /*flags=*/PERF_FLAG_FD_CLOEXEC))
+ goto found;
+ perf_missing_features.cloexec = true;
+ pr_debug2_peo("switching off cloexec flag\n");
+
+ attr.mmap2 = 1;
+ if (has_attr_feature(&attr, /*flags=*/0))
+ goto found;
+ perf_missing_features.mmap2 = true;
+ pr_debug2_peo("switching off mmap2\n");
+ attr.mmap2 = 0;
+
+ /* set this unconditionally? */
+ perf_missing_features.sample_id_all = true;
+ pr_debug2_peo("switching off sample_id_all\n");
+
+ attr.inherit = 1;
+ attr.read_format = PERF_FORMAT_GROUP;
+ if (has_attr_feature(&attr, /*flags=*/0))
+ goto found;
+ perf_missing_features.group_read = true;
+ pr_debug2_peo("switching off group read\n");
+ attr.inherit = 0;
+ attr.read_format = 0;
+
+found:
+ detection_done = true;
+ errno = old_errno;
+
+check:
+ if ((evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_COUNTERS) &&
+ perf_missing_features.branch_counters)
return true;
- } else if (!perf_missing_features.weight_struct &&
- (evsel->core.attr.sample_type & PERF_SAMPLE_WEIGHT_STRUCT)) {
- perf_missing_features.weight_struct = true;
- pr_debug2("switching off weight struct support\n");
+
+ if ((evsel->core.attr.read_format & PERF_FORMAT_LOST) &&
+ perf_missing_features.read_lost)
return true;
- } else if (!perf_missing_features.code_page_size &&
- (evsel->core.attr.sample_type & PERF_SAMPLE_CODE_PAGE_SIZE)) {
- perf_missing_features.code_page_size = true;
- pr_debug2_peo("Kernel has no PERF_SAMPLE_CODE_PAGE_SIZE support, bailing out\n");
- return false;
- } else if (!perf_missing_features.data_page_size &&
- (evsel->core.attr.sample_type & PERF_SAMPLE_DATA_PAGE_SIZE)) {
- perf_missing_features.data_page_size = true;
- pr_debug2_peo("Kernel has no PERF_SAMPLE_DATA_PAGE_SIZE support, bailing out\n");
- return false;
- } else if (!perf_missing_features.cgroup && evsel->core.attr.cgroup) {
- perf_missing_features.cgroup = true;
- pr_debug2_peo("Kernel has no cgroup sampling support, bailing out\n");
- return false;
- } else if (!perf_missing_features.branch_hw_idx &&
- (evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX)) {
- perf_missing_features.branch_hw_idx = true;
- pr_debug2("switching off branch HW index support\n");
+
+ if ((evsel->core.attr.sample_type & PERF_SAMPLE_WEIGHT_STRUCT) &&
+ perf_missing_features.weight_struct)
return true;
- } else if (!perf_missing_features.aux_output && evsel->core.attr.aux_output) {
- perf_missing_features.aux_output = true;
- pr_debug2_peo("Kernel has no attr.aux_output support, bailing out\n");
- return false;
- } else if (!perf_missing_features.bpf && evsel->core.attr.bpf_event) {
- perf_missing_features.bpf = true;
- pr_debug2_peo("switching off bpf_event\n");
+
+ if (evsel->core.attr.use_clockid && evsel->core.attr.clockid != CLOCK_MONOTONIC &&
+ !perf_missing_features.clockid) {
+ perf_missing_features.clockid_wrong = true;
return true;
- } else if (!perf_missing_features.ksymbol && evsel->core.attr.ksymbol) {
- perf_missing_features.ksymbol = true;
- pr_debug2_peo("switching off ksymbol\n");
+ }
+
+ if (evsel->core.attr.use_clockid && perf_missing_features.clockid)
return true;
- } else if (!perf_missing_features.write_backward && evsel->core.attr.write_backward) {
- perf_missing_features.write_backward = true;
- pr_debug2_peo("switching off write_backward\n");
- return false;
- } else if (!perf_missing_features.clockid_wrong && evsel->core.attr.use_clockid) {
- perf_missing_features.clockid_wrong = true;
- pr_debug2_peo("switching off clockid\n");
+
+ if ((evsel->open_flags & PERF_FLAG_FD_CLOEXEC) &&
+ perf_missing_features.cloexec)
return true;
- } else if (!perf_missing_features.clockid && evsel->core.attr.use_clockid) {
- perf_missing_features.clockid = true;
- pr_debug2_peo("switching off use_clockid\n");
+
+ if (evsel->core.attr.mmap2 && perf_missing_features.mmap2)
return true;
- } else if (!perf_missing_features.cloexec && (evsel->open_flags & PERF_FLAG_FD_CLOEXEC)) {
- perf_missing_features.cloexec = true;
- pr_debug2_peo("switching off cloexec flag\n");
+
+ if ((evsel->core.attr.branch_sample_type & (PERF_SAMPLE_BRANCH_NO_FLAGS |
+ PERF_SAMPLE_BRANCH_NO_CYCLES)) &&
+ perf_missing_features.lbr_flags)
return true;
- } else if (!perf_missing_features.mmap2 && evsel->core.attr.mmap2) {
- perf_missing_features.mmap2 = true;
- pr_debug2_peo("switching off mmap2\n");
+
+ if (evsel->core.attr.inherit && (evsel->core.attr.read_format & PERF_FORMAT_GROUP) &&
+ perf_missing_features.group_read)
return true;
- } else if (evsel->core.attr.exclude_guest || evsel->core.attr.exclude_host) {
- if (evsel->pmu == NULL)
- evsel->pmu = evsel__find_pmu(evsel);
-
- if (evsel->pmu)
- evsel->pmu->missing_features.exclude_guest = true;
- else {
- /* we cannot find PMU, disable attrs now */
- evsel->core.attr.exclude_host = false;
- evsel->core.attr.exclude_guest = false;
- }
- if (evsel->exclude_GH) {
- pr_debug2_peo("PMU has no exclude_host/guest support, bailing out\n");
- return false;
- }
- if (!perf_missing_features.exclude_guest) {
- perf_missing_features.exclude_guest = true;
- pr_debug2_peo("switching off exclude_guest, exclude_host\n");
- }
+ if (evsel->core.attr.ksymbol && perf_missing_features.ksymbol)
return true;
- } else if (!perf_missing_features.sample_id_all) {
- perf_missing_features.sample_id_all = true;
- pr_debug2_peo("switching off sample_id_all\n");
+
+ if (evsel->core.attr.bpf_event && perf_missing_features.bpf)
return true;
- } else if (!perf_missing_features.lbr_flags &&
- (evsel->core.attr.branch_sample_type &
- (PERF_SAMPLE_BRANCH_NO_CYCLES |
- PERF_SAMPLE_BRANCH_NO_FLAGS))) {
- perf_missing_features.lbr_flags = true;
- pr_debug2_peo("switching off branch sample type no (cycles/flags)\n");
+
+ if ((evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX) &&
+ perf_missing_features.branch_hw_idx)
return true;
- } else if (!perf_missing_features.group_read &&
- evsel->core.attr.inherit &&
- (evsel->core.attr.read_format & PERF_FORMAT_GROUP) &&
- evsel__is_group_leader(evsel)) {
- perf_missing_features.group_read = true;
- pr_debug2_peo("switching off group read\n");
+
+ if (evsel->core.attr.sample_id_all && perf_missing_features.sample_id_all)
return true;
- } else {
- return false;
- }
+
+ return false;
}
static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 15acf293e12af713..098aa0d8eeb65466 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -363,7 +363,6 @@ int evsel__open(struct evsel *evsel, struct perf_cpu_map *cpus,
void evsel__close(struct evsel *evsel);
int evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
struct perf_thread_map *threads);
-bool evsel__detect_missing_features(struct evsel *evsel);
bool evsel__precise_ip_fallback(struct evsel *evsel);
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 07/10] perf tools: Separate exclude_hv fallback
2024-09-05 20:24 [RFC/PATCHSET 00/10] perf tools: Do not set attr.exclude_guest by default (v3) Namhyung Kim
` (5 preceding siblings ...)
2024-09-05 20:24 ` [PATCH 06/10] perf tools: Detect missing kernel features properly Namhyung Kim
@ 2024-09-05 20:24 ` Namhyung Kim
2024-09-06 15:21 ` Liang, Kan
2024-09-05 20:24 ` [PATCH 08/10] perf tools: Move x86__is_amd_cpu() to util/env.c Namhyung Kim
` (2 subsequent siblings)
9 siblings, 1 reply; 23+ messages in thread
From: Namhyung Kim @ 2024-09-05 20:24 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang
The exclude_hv was added in the evsel__fallback() in the commit
4ec8d984895fef43a ("perf record: Fix priv level with branch sampling
for paranoid=2") to address branch stack samples on Intel PMUs.
As some other PMUs might not support that, let's separate the bit from
exclude_kernel to make sure it can add the bit only if required.
Technically it should change the modifier string at the end of the
event name. ":u" is for exclude_kernel + exclude_hv, so it should be
":uh" if it has exclude_kernel only. That means the default events for
regular users will looks like "cycles:Puh" (for perf record) or
"instructions:uh" (for perf stat). But I'm not sure if it's worth the
trouble so I didn't touch the name in this patch.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/util/evsel.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 1a4f52767942e5ad..c5df45bb74dfc1b5 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -3389,10 +3389,20 @@ bool evsel__fallback(struct evsel *evsel, struct target *target, int err,
free(evsel->name);
evsel->name = new_name;
scnprintf(msg, msgsize, "kernel.perf_event_paranoid=%d, trying "
- "to fall back to excluding kernel and hypervisor "
- " samples", paranoid);
+ "to fall back to excluding kernel samples", paranoid);
evsel->core.attr.exclude_kernel = 1;
- evsel->core.attr.exclude_hv = 1;
+
+ return true;
+ } else if (err == EACCES && !evsel->core.attr.exclude_hv &&
+ (paranoid = perf_event_paranoid()) > 1) {
+ /* If event has exclude user then don't exclude hv. */
+ if (evsel->core.attr.exclude_user)
+ return false;
+
+ /* Intel branch stack requires exclude_hv */
+ scnprintf(msg, msgsize, "kernel.perf_event_paranoid=%d, trying "
+ "to fall back to excluding hypervisor samples", paranoid);
+ evsel->core.attr.exclude_hv = 1;
return true;
} else if (err == EOPNOTSUPP && !evsel->core.attr.exclude_guest &&
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 08/10] perf tools: Move x86__is_amd_cpu() to util/env.c
2024-09-05 20:24 [RFC/PATCHSET 00/10] perf tools: Do not set attr.exclude_guest by default (v3) Namhyung Kim
` (6 preceding siblings ...)
2024-09-05 20:24 ` [PATCH 07/10] perf tools: Separate exclude_hv fallback Namhyung Kim
@ 2024-09-05 20:24 ` Namhyung Kim
2024-09-05 20:24 ` [PATCH 09/10] perf tools: Check fallback error and order Namhyung Kim
2024-09-05 20:24 ` [PATCH 10/10] perf record: Just use "cycles:P" as the default event Namhyung Kim
9 siblings, 0 replies; 23+ messages in thread
From: Namhyung Kim @ 2024-09-05 20:24 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang
It can be called from non-x86 platform so let's move it to the general
util directory. Also add a new helper perf_env__is_x86_amd_cpu() so
that it can be called with an existing perf_env as well.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/arch/x86/util/Build | 1 -
tools/perf/arch/x86/util/env.c | 19 -------------------
tools/perf/arch/x86/util/env.h | 7 -------
tools/perf/arch/x86/util/pmu.c | 2 +-
tools/perf/util/env.c | 23 +++++++++++++++++++++++
tools/perf/util/env.h | 4 ++++
6 files changed, 28 insertions(+), 28 deletions(-)
delete mode 100644 tools/perf/arch/x86/util/env.c
delete mode 100644 tools/perf/arch/x86/util/env.h
diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
index 2607ed5c42966543..ce6d802a1381c5ab 100644
--- a/tools/perf/arch/x86/util/Build
+++ b/tools/perf/arch/x86/util/Build
@@ -10,7 +10,6 @@ perf-util-y += evlist.o
perf-util-y += mem-events.o
perf-util-y += evsel.o
perf-util-y += iostat.o
-perf-util-y += env.o
perf-util-$(CONFIG_DWARF) += dwarf-regs.o
perf-util-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
diff --git a/tools/perf/arch/x86/util/env.c b/tools/perf/arch/x86/util/env.c
deleted file mode 100644
index 3e537ffb1353aab2..0000000000000000
--- a/tools/perf/arch/x86/util/env.c
+++ /dev/null
@@ -1,19 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include "linux/string.h"
-#include "util/env.h"
-#include "env.h"
-
-bool x86__is_amd_cpu(void)
-{
- struct perf_env env = { .total_mem = 0, };
- static int is_amd; /* 0: Uninitialized, 1: Yes, -1: No */
-
- if (is_amd)
- goto ret;
-
- perf_env__cpuid(&env);
- is_amd = env.cpuid && strstarts(env.cpuid, "AuthenticAMD") ? 1 : -1;
- perf_env__exit(&env);
-ret:
- return is_amd >= 1 ? true : false;
-}
diff --git a/tools/perf/arch/x86/util/env.h b/tools/perf/arch/x86/util/env.h
deleted file mode 100644
index d78f080b6b3f889a..0000000000000000
--- a/tools/perf/arch/x86/util/env.h
+++ /dev/null
@@ -1,7 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _X86_ENV_H
-#define _X86_ENV_H
-
-bool x86__is_amd_cpu(void);
-
-#endif /* _X86_ENV_H */
diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
index c3d89d6ba1bf03ad..e0060dac2a9f9242 100644
--- a/tools/perf/arch/x86/util/pmu.c
+++ b/tools/perf/arch/x86/util/pmu.c
@@ -16,7 +16,7 @@
#include "../../../util/fncache.h"
#include "../../../util/pmus.h"
#include "mem-events.h"
-#include "env.h"
+#include "util/env.h"
void perf_pmu__arch_init(struct perf_pmu *pmu __maybe_unused)
{
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index a459374d0a1a1dc8..6965f3d498e327cc 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -5,6 +5,7 @@
#include "util/header.h"
#include "linux/compiler.h"
#include <linux/ctype.h>
+#include <linux/string.h>
#include <linux/zalloc.h>
#include "cgroup.h"
#include <errno.h>
@@ -624,3 +625,25 @@ char *perf_env__find_pmu_cap(struct perf_env *env, const char *pmu_name,
free(cap_eq);
return NULL;
}
+
+bool perf_env__is_x86_amd_cpu(struct perf_env *env)
+{
+ static int is_amd; /* 0: Uninitialized, 1: Yes, -1: No */
+
+ if (is_amd == 0)
+ is_amd = env->cpuid && strstarts(env->cpuid, "AuthenticAMD") ? 1 : -1;
+
+ return is_amd >= 1 ? true : false;
+}
+
+bool x86__is_amd_cpu(void)
+{
+ struct perf_env env = { .total_mem = 0, };
+ bool is_amd;
+
+ perf_env__cpuid(&env);
+ is_amd = perf_env__is_x86_amd_cpu(&env);
+ perf_env__exit(&env);
+
+ return is_amd;
+}
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index 2a2c37cc40b7828e..573ad9823d16d5cf 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -192,4 +192,8 @@ char *perf_env__find_pmu_cap(struct perf_env *env, const char *pmu_name,
const char *cap);
bool perf_env__has_pmu_mapping(struct perf_env *env, const char *pmu_name);
+
+bool x86__is_amd_cpu(void);
+bool perf_env__is_x86_amd_cpu(struct perf_env *env);
+
#endif /* __PERF_ENV_H */
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 09/10] perf tools: Check fallback error and order
2024-09-05 20:24 [RFC/PATCHSET 00/10] perf tools: Do not set attr.exclude_guest by default (v3) Namhyung Kim
` (7 preceding siblings ...)
2024-09-05 20:24 ` [PATCH 08/10] perf tools: Move x86__is_amd_cpu() to util/env.c Namhyung Kim
@ 2024-09-05 20:24 ` Namhyung Kim
2024-09-05 20:24 ` [PATCH 10/10] perf record: Just use "cycles:P" as the default event Namhyung Kim
9 siblings, 0 replies; 23+ messages in thread
From: Namhyung Kim @ 2024-09-05 20:24 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang
The perf_event_open might fail due to various reasons, so blindly
reducing precise_ip level might not be the best way to deal with it.
It seems the kernel return -EOPNOTSUPP when PMU doesn't support the
given precise level. Let's try again with the correct error code.
This caused a problem on AMD, as it stops on precise_ip of 2 for IBS but
user events with exclude_kernel=1 cannot make progress. Let's add the
evsel__handle_error_quirks() to this case specially. I plan to work on
the kernel side to improve this situation but it'd still need some
special handling for IBS.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/util/evsel.c | 27 +++++++++++++++++++++------
1 file changed, 21 insertions(+), 6 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index c5df45bb74dfc1b5..895f45921da30bb2 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2404,6 +2404,20 @@ static bool evsel__detect_missing_features(struct evsel *evsel)
return false;
}
+static bool evsel__handle_error_quirks(struct evsel *evsel, int error)
+{
+ /* AMD IBS doesn't support exclude_kernel, forward it to core PMU */
+ if (error == -EINVAL && evsel->precise_max && evsel->core.attr.precise_ip &&
+ evsel->core.attr.exclude_kernel && x86__is_amd_cpu()) {
+ evsel->core.attr.precise_ip = 0;
+ pr_debug2_peo("removing precise_ip on AMD\n");
+ display_attr(&evsel->core.attr);
+ return true;
+ }
+
+ return false;
+}
+
static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
struct perf_thread_map *threads,
int start_cpu_map_idx, int end_cpu_map_idx)
@@ -2565,9 +2579,6 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
return 0;
try_fallback:
- if (evsel__precise_ip_fallback(evsel))
- goto retry_open;
-
if (evsel__ignore_missing_thread(evsel, perf_cpu_map__nr(cpus),
idx, threads, thread, err)) {
/* We just removed 1 thread, so lower the upper nthreads limit. */
@@ -2584,11 +2595,15 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
if (err == -EMFILE && rlimit__increase_nofile(&set_rlimit))
goto retry_open;
- if (err != -EINVAL || idx > 0 || thread > 0)
- goto out_close;
+ if (err == -EOPNOTSUPP && evsel__precise_ip_fallback(evsel))
+ goto retry_open;
- if (evsel__detect_missing_features(evsel))
+ if (err == -EINVAL && evsel__detect_missing_features(evsel))
goto fallback_missing_features;
+
+ if (evsel__handle_error_quirks(evsel, err))
+ goto retry_open;
+
out_close:
if (err)
threads->err_thread = thread;
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 10/10] perf record: Just use "cycles:P" as the default event
2024-09-05 20:24 [RFC/PATCHSET 00/10] perf tools: Do not set attr.exclude_guest by default (v3) Namhyung Kim
` (8 preceding siblings ...)
2024-09-05 20:24 ` [PATCH 09/10] perf tools: Check fallback error and order Namhyung Kim
@ 2024-09-05 20:24 ` Namhyung Kim
9 siblings, 0 replies; 23+ messages in thread
From: Namhyung Kim @ 2024-09-05 20:24 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang
The fallback logic can add ":u" modifier if needed.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/builtin-record.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index adbaf80b398c1f4c..f8325247292112d7 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -4157,9 +4157,7 @@ int cmd_record(int argc, const char **argv)
record.opts.tail_synthesize = true;
if (rec->evlist->core.nr_entries == 0) {
- bool can_profile_kernel = perf_event_paranoid_check(1);
-
- err = parse_event(rec->evlist, can_profile_kernel ? "cycles:P" : "cycles:Pu");
+ err = parse_event(rec->evlist, "cycles:P");
if (err)
goto out;
}
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 01/10] perf tools: Add fallback for exclude_guest
2024-09-05 20:24 ` [PATCH 01/10] perf tools: Add fallback for exclude_guest Namhyung Kim
@ 2024-09-06 13:47 ` Liang, Kan
2024-09-30 20:36 ` Namhyung Kim
0 siblings, 1 reply; 23+ messages in thread
From: Liang, Kan @ 2024-09-06 13:47 UTC (permalink / raw)
To: Namhyung Kim, Arnaldo Carvalho de Melo, Ian Rogers
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang, James Clark
On 2024-09-05 4:24 p.m., Namhyung Kim wrote:
> Commit 7b100989b4f6bce70 ("perf evlist: Remove __evlist__add_default")
> changed to parse "cycles:P" event instead of creating a new cycles
> event for perf record. But it also changed the way how modifiers are
> handled so it doesn't set the exclude_guest bit by default.
>
> It seems Apple M1 PMU requires exclude_guest set and returns EOPNOTSUPP
> if not. Let's add a fallback so that it can work with default events.
>
> Fixes: 7b100989b4f6bce70 ("perf evlist: Remove __evlist__add_default")
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: James Clark <james.clark@linaro.org>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> tools/perf/builtin-stat.c | 3 +--
> tools/perf/util/evsel.c | 21 +++++++++++++++++++++
> 2 files changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index cf985cdb9a6ee588..d8315dae930184ba 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -639,8 +639,7 @@ static enum counter_recovery stat_handle_error(struct evsel *counter)
> * (behavior changed with commit b0a873e).
> */
> if (errno == EINVAL || errno == ENOSYS ||
> - errno == ENOENT || errno == EOPNOTSUPP ||
> - errno == ENXIO) {
> + errno == ENOENT || errno == ENXIO) {
> if (verbose > 0)
> ui__warning("%s event is not supported by the kernel.\n",
> evsel__name(counter));
It seems the behavior for other reasons which trigger the 'EOPNOTSUPP'
is changed as well.
At least, it looks like we don't skip the member event with EOPNOTSUPP
anymore.
I'm not sure if it's a big deal. But I think we'd better mention it in
the change log or the comments.
Thanks,
Kan
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 49cc71511c0c8ce8..d59ad76b28758906 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -3244,6 +3244,27 @@ bool evsel__fallback(struct evsel *evsel, struct target *target, int err,
> evsel->core.attr.exclude_kernel = 1;
> evsel->core.attr.exclude_hv = 1;
>
> + return true;
> + } else if (err == EOPNOTSUPP && !evsel->core.attr.exclude_guest &&
> + !evsel->exclude_GH) {
> + const char *name = evsel__name(evsel);
> + char *new_name;
> + const char *sep = ":";
> +
> + /* Is there already the separator in the name. */
> + if (strchr(name, '/') ||
> + (strchr(name, ':') && !evsel->is_libpfm_event))
> + sep = "";
> +
> + if (asprintf(&new_name, "%s%sH", name, sep) < 0)
> + return false;
> +
> + free(evsel->name);
> + evsel->name = new_name;
> + /* Apple M1 requires exclude_guest */
> + scnprintf(msg, msgsize, "trying to fall back to excluding guest samples");
> + evsel->core.attr.exclude_guest = 1;
> +
> return true;
> }
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 02/10] perf tools: Don't set attr.exclude_guest by default
2024-09-05 20:24 ` [PATCH 02/10] perf tools: Don't set attr.exclude_guest by default Namhyung Kim
@ 2024-09-06 14:10 ` Liang, Kan
0 siblings, 0 replies; 23+ messages in thread
From: Liang, Kan @ 2024-09-06 14:10 UTC (permalink / raw)
To: Namhyung Kim, Arnaldo Carvalho de Melo, Ian Rogers
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang, James Clark
On 2024-09-05 4:24 p.m., Namhyung Kim wrote:
> The exclude_guest in the event attribute is to limit profiling in the
> host environment. But I'm not sure why we want to set it by default
> cause we don't care about it in most cases and I feel like it just
> makes new PMU implementation complicated.
>
> Of course it's useful for perf kvm command so I added the
> exclude_GH_default variable to preserve the old behavior for perf kvm
> and other commands like perf record and stat won't set the exclude bit.
> This is helpful for AMD IBS case since having exclude_guest bit will
> clear new feature bit due to the missing feature check logic.
>
> $ sysctl kernel.perf_event_paranoid
> kernel.perf_event_paranoid = 0
>
> $ perf record -W -e ibs_op// -vv true 2>&1 | grep switching
> switching off PERF_FORMAT_LOST support
> switching off weight struct support
> switching off bpf_event
> switching off ksymbol
> switching off cloexec flag
> switching off mmap2
> switching off exclude_guest, exclude_host
>
> Intestingly, I found it sets the exclude_bit if "u" modifier is used.
> I don't know why but it's neither intuitive nor consistent. Let's
> remove the bit there too.
>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: James Clark <james.clark@linaro.org>
> Cc: Ravi Bangoria <ravi.bangoria@amd.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> tools/perf/builtin-kvm.c | 1 +
> tools/perf/tests/attr/test-record-dummy-C0 | 2 +-
> tools/perf/tests/parse-events.c | 18 +++++++++---------
> tools/perf/util/parse-events.c | 2 +-
> tools/perf/util/util.c | 10 ++++++++--
> tools/perf/util/util.h | 3 +++
> 6 files changed, 23 insertions(+), 13 deletions(-)
>
> diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
> index 55ea17c5ff02acf7..5e32d4e78fdf2424 100644
> --- a/tools/perf/builtin-kvm.c
> +++ b/tools/perf/builtin-kvm.c
> @@ -2147,6 +2147,7 @@ int cmd_kvm(int argc, const char **argv)
> "buildid-list", "stat", NULL };
> const char *kvm_usage[] = { NULL, NULL };
>
> + exclude_HG_default = true;
exclude_GH_default?
Thanks,
Kan
> perf_host = 0;
> perf_guest = 1;
>
> diff --git a/tools/perf/tests/attr/test-record-dummy-C0 b/tools/perf/tests/attr/test-record-dummy-C0
> index 576ec48b3aafaa6a..8ce6f0a5df5b7013 100644
> --- a/tools/perf/tests/attr/test-record-dummy-C0
> +++ b/tools/perf/tests/attr/test-record-dummy-C0
> @@ -37,7 +37,7 @@ precise_ip=0
> mmap_data=0
> sample_id_all=1
> exclude_host=0
> -exclude_guest=1
> +exclude_guest=0
> exclude_callchain_kernel=0
> exclude_callchain_user=0
> mmap2=1
> diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
> index 639e65a9bf61801a..9077d581c0d05e62 100644
> --- a/tools/perf/tests/parse-events.c
> +++ b/tools/perf/tests/parse-events.c
> @@ -933,7 +933,7 @@ static int test__group2(struct evlist *evlist)
> TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
> TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel);
> TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
> - TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
> + TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
> TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
> TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
> TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
> @@ -948,7 +948,7 @@ static int test__group2(struct evlist *evlist)
> TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
> TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel);
> TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
> - TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
> + TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
> TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
> TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
> if (evsel__has_leader(evsel, leader))
> @@ -1073,7 +1073,7 @@ static int test__group3(struct evlist *evlist __maybe_unused)
> TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
> TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel);
> TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
> - TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
> + TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
> TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
> TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
> TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
> @@ -1223,7 +1223,7 @@ static int test__group5(struct evlist *evlist __maybe_unused)
> TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
> TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel);
> TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv);
> - TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
> + TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
> TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
> TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
> TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
> @@ -1438,7 +1438,7 @@ static int test__leader_sample1(struct evlist *evlist)
> TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
> TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel);
> TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv);
> - TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
> + TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
> TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
> TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
> TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
> @@ -1454,7 +1454,7 @@ static int test__leader_sample1(struct evlist *evlist)
> TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
> TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel);
> TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv);
> - TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
> + TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
> TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
> TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
> TEST_ASSERT_VAL("wrong leader", evsel__has_leader(evsel, leader));
> @@ -1469,7 +1469,7 @@ static int test__leader_sample1(struct evlist *evlist)
> TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
> TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel);
> TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv);
> - TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
> + TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
> TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
> TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
> TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
> @@ -1498,7 +1498,7 @@ static int test__leader_sample2(struct evlist *evlist __maybe_unused)
> TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
> TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel);
> TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
> - TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
> + TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
> TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
> TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
> TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
> @@ -1514,7 +1514,7 @@ static int test__leader_sample2(struct evlist *evlist __maybe_unused)
> TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
> TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel);
> TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
> - TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
> + TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
> TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
> TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
> TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index f5eb1af8302c7bac..8d9e2d4b73822d98 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -1761,7 +1761,7 @@ static int parse_events__modifier_list(struct parse_events_state *parse_state,
> if (mod.user) {
> if (!exclude)
> exclude = eu = ek = eh = 1;
> - if (!exclude_GH && !perf_guest)
> + if (!exclude_GH && !perf_guest && exclude_HG_default)
> eG = 1;
> eu = 0;
> }
> diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
> index 9d55a13787ce3c05..7e3159faaa1991df 100644
> --- a/tools/perf/util/util.c
> +++ b/tools/perf/util/util.c
> @@ -78,17 +78,23 @@ bool sysctl__nmi_watchdog_enabled(void)
>
> bool test_attr__enabled;
>
> +bool exclude_HG_default;
> +
> bool perf_host = true;
> bool perf_guest = false;
>
> void event_attr_init(struct perf_event_attr *attr)
> {
> + /* to capture ABI version */
> + attr->size = sizeof(*attr);
> +
> + if (!exclude_HG_default)
> + return;
> +
> if (!perf_host)
> attr->exclude_host = 1;
> if (!perf_guest)
> attr->exclude_guest = 1;
> - /* to capture ABI version */
> - attr->size = sizeof(*attr);
> }
>
> int mkdir_p(char *path, mode_t mode)
> diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
> index 9966c21aaf048479..d33ae883a54f2f2f 100644
> --- a/tools/perf/util/util.h
> +++ b/tools/perf/util/util.h
> @@ -21,6 +21,9 @@ extern const char perf_more_info_string[];
>
> extern const char *input_name;
>
> +/* This will control if perf_{host,guest} will set attr.exclude_{host,guest}. */
> +extern bool exclude_HG_default;
> +
> extern bool perf_host;
> extern bool perf_guest;
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 04/10] perf stat: Add --exclude-guest option
2024-09-05 20:24 ` [PATCH 04/10] perf stat: Add --exclude-guest option Namhyung Kim
@ 2024-09-06 14:33 ` Liang, Kan
2024-09-23 8:47 ` James Clark
0 siblings, 1 reply; 23+ messages in thread
From: Liang, Kan @ 2024-09-06 14:33 UTC (permalink / raw)
To: Namhyung Kim, Arnaldo Carvalho de Melo, Ian Rogers
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang
On 2024-09-05 4:24 p.m., Namhyung Kim wrote:
> This option is to support the old behavior of setting exclude_guest by
> default. Now it doesn't set the bit so users want the old behavior can
> use this option.
>
> $ perf stat true
>
> Performance counter stats for 'true':
>
> 0.86 msec task-clock:u # 0.443 CPUs utilized
> 0 context-switches:u # 0.000 /sec
> 0 cpu-migrations:u # 0.000 /sec
> 49 page-faults:u # 56.889 K/sec
> ...
>
> $ perf stat --exclude-guest true
>
> Performance counter stats for 'true':
>
> 0.79 msec task-clock:Hu # 0.490 CPUs utilized
> 0 context-switches:Hu # 0.000 /sec
> 0 cpu-migrations:Hu # 0.000 /sec
> 49 page-faults:Hu # 62.078 K/sec
> ...
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> tools/perf/Documentation/perf-stat.txt | 7 +++++++
> tools/perf/builtin-stat.c | 2 ++
> 2 files changed, 9 insertions(+)
>
> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
> index 2bc06367248691dd..d28d8370a856598f 100644
> --- a/tools/perf/Documentation/perf-stat.txt
> +++ b/tools/perf/Documentation/perf-stat.txt
> @@ -382,6 +382,13 @@ color the metric's computed value.
> Don't print output, warnings or messages. This is useful with perf stat
> record below to only write data to the perf.data file.
>
> +--exclude-guest::
> +Don't count event in the guest mode. It was the old behavior but the
> +default is changed to count guest events also. Use this option if you
> +want the old behavior (host only). Note that this option needs to be
> +before other events in case you added -e/--event option in the command
> +line.
I'm not sure if we really need this option. I think it may bring more
trouble than what we get.
The name of the "--exclude-guest" sounds like a replacement of the event
modifier "H". But in fact, it's not. It should only affect the default.
It doesn't set the "H" for any events.
Except for the perf kvm user, I don't think there are many users which
care the exclude_guest. The behavior of the perf kvm is not changed. So
the option seems not that important. If we really want an option to
restore the old behavior, it's better to choose a better name and update
the description.
Thanks,
Kan
> +
> STAT RECORD
> -----------
> Stores stat data into perf data file.
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index d8315dae930184ba..4d47675af5cc3094 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -2491,6 +2491,8 @@ int cmd_stat(int argc, const char **argv)
> OPT_BOOLEAN_FLAG(0, "all-user", &stat_config.all_user,
> "Configure all used events to run in user space.",
> PARSE_OPT_EXCLUSIVE),
> + OPT_BOOLEAN(0, "exclude-guest", &exclude_HG_default,
> + "Don't count events in the guest mode"),
> OPT_BOOLEAN(0, "percore-show-thread", &stat_config.percore_show_thread,
> "Use with 'percore' event qualifier to show the event "
> "counts of one hardware thread by sum up total hardware "
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 07/10] perf tools: Separate exclude_hv fallback
2024-09-05 20:24 ` [PATCH 07/10] perf tools: Separate exclude_hv fallback Namhyung Kim
@ 2024-09-06 15:21 ` Liang, Kan
2024-09-30 20:37 ` Namhyung Kim
0 siblings, 1 reply; 23+ messages in thread
From: Liang, Kan @ 2024-09-06 15:21 UTC (permalink / raw)
To: Namhyung Kim, Arnaldo Carvalho de Melo, Ian Rogers
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang
On 2024-09-05 4:24 p.m., Namhyung Kim wrote:
> The exclude_hv was added in the evsel__fallback() in the commit
> 4ec8d984895fef43a ("perf record: Fix priv level with branch sampling
> for paranoid=2") to address branch stack samples on Intel PMUs.
> As some other PMUs might not support that, let's separate the bit from
> exclude_kernel to make sure it can add the bit only if required.
>
> Technically it should change the modifier string at the end of the
> event name. ":u" is for exclude_kernel + exclude_hv, so it should be
> ":uh" if it has exclude_kernel only. That means the default events for
> regular users will looks like "cycles:Puh" (for perf record) or
> "instructions:uh" (for perf stat). But I'm not sure if it's worth the
> trouble so I didn't touch the name in this patch.
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> tools/perf/util/evsel.c | 16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 1a4f52767942e5ad..c5df45bb74dfc1b5 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -3389,10 +3389,20 @@ bool evsel__fallback(struct evsel *evsel, struct target *target, int err,
> free(evsel->name);
> evsel->name = new_name;
> scnprintf(msg, msgsize, "kernel.perf_event_paranoid=%d, trying "
> - "to fall back to excluding kernel and hypervisor "
> - " samples", paranoid);
> + "to fall back to excluding kernel samples", paranoid);
> evsel->core.attr.exclude_kernel = 1;
> - evsel->core.attr.exclude_hv = 1;
> +
> + return true;
> + } else if (err == EACCES && !evsel->core.attr.exclude_hv &&
> + (paranoid = perf_event_paranoid()) > 1) {
> + /* If event has exclude user then don't exclude hv. */
> + if (evsel->core.attr.exclude_user)
> + return false;
> +
> + /* Intel branch stack requires exclude_hv */
I don't think it's an requirement for Intel branch stack. The HV is
ignored for all X86.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/events/core.c#n542
I think it's a generic request for branch on all arch.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/events/core.c#n366
Thanks,
Kan
> + scnprintf(msg, msgsize, "kernel.perf_event_paranoid=%d, trying "
> + "to fall back to excluding hypervisor samples", paranoid);
> + evsel->core.attr.exclude_hv = 1;
>
> return true;
> } else if (err == EOPNOTSUPP && !evsel->core.attr.exclude_guest &&
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 04/10] perf stat: Add --exclude-guest option
2024-09-06 14:33 ` Liang, Kan
@ 2024-09-23 8:47 ` James Clark
2024-09-24 20:21 ` Namhyung Kim
0 siblings, 1 reply; 23+ messages in thread
From: James Clark @ 2024-09-23 8:47 UTC (permalink / raw)
To: Liang, Kan, Namhyung Kim, Arnaldo Carvalho de Melo, Ian Rogers
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang
On 06/09/2024 3:33 pm, Liang, Kan wrote:
>
>
> On 2024-09-05 4:24 p.m., Namhyung Kim wrote:
>> This option is to support the old behavior of setting exclude_guest by
>> default. Now it doesn't set the bit so users want the old behavior can
>> use this option.
>>
>> $ perf stat true
>>
>> Performance counter stats for 'true':
>>
>> 0.86 msec task-clock:u # 0.443 CPUs utilized
>> 0 context-switches:u # 0.000 /sec
>> 0 cpu-migrations:u # 0.000 /sec
>> 49 page-faults:u # 56.889 K/sec
>> ...
>>
>> $ perf stat --exclude-guest true
>>
>> Performance counter stats for 'true':
>>
>> 0.79 msec task-clock:Hu # 0.490 CPUs utilized
>> 0 context-switches:Hu # 0.000 /sec
>> 0 cpu-migrations:Hu # 0.000 /sec
>> 49 page-faults:Hu # 62.078 K/sec
>> ...
>>
>> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
>> ---
>> tools/perf/Documentation/perf-stat.txt | 7 +++++++
>> tools/perf/builtin-stat.c | 2 ++
>> 2 files changed, 9 insertions(+)
>>
>> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
>> index 2bc06367248691dd..d28d8370a856598f 100644
>> --- a/tools/perf/Documentation/perf-stat.txt
>> +++ b/tools/perf/Documentation/perf-stat.txt
>> @@ -382,6 +382,13 @@ color the metric's computed value.
>> Don't print output, warnings or messages. This is useful with perf stat
>> record below to only write data to the perf.data file.
>>
>> +--exclude-guest::
>> +Don't count event in the guest mode. It was the old behavior but the
>> +default is changed to count guest events also. Use this option if you
>> +want the old behavior (host only). Note that this option needs to be
>> +before other events in case you added -e/--event option in the command
>> +line.
>
> I'm not sure if we really need this option. I think it may bring more
> trouble than what we get.
>
> The name of the "--exclude-guest" sounds like a replacement of the event
> modifier "H". But in fact, it's not. It should only affect the default.
> It doesn't set the "H" for any events.
>
> Except for the perf kvm user, I don't think there are many users which
> care the exclude_guest. The behavior of the perf kvm is not changed. So
> the option seems not that important. If we really want an option to
> restore the old behavior, it's better to choose a better name and update
> the description.
>
> Thanks,
> Kan
Do we not want to keep exclude_guest for record, but remove it for stat?
Because in record the addresses of guest samples don't make sense
without extra work, but for stat you might want to see an overview of
the whole system.
For Coresight tracing and SPE we would want to keep exclude_guest,
otherwise you generate a load of extra trace that you can't make use of.
Say you were doing PGO on your host you wouldn't be recompiling anything
the guests were running.
If we do change the defaults isn't ':H' already enough to go back to the
old behavior? I'm wondering why we need an argument when all the other
exclude rules are done with the letter modifiers?
James
>> +
>> STAT RECORD
>> -----------
>> Stores stat data into perf data file.
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index d8315dae930184ba..4d47675af5cc3094 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -2491,6 +2491,8 @@ int cmd_stat(int argc, const char **argv)
>> OPT_BOOLEAN_FLAG(0, "all-user", &stat_config.all_user,
>> "Configure all used events to run in user space.",
>> PARSE_OPT_EXCLUSIVE),
>> + OPT_BOOLEAN(0, "exclude-guest", &exclude_HG_default,
>> + "Don't count events in the guest mode"),
>> OPT_BOOLEAN(0, "percore-show-thread", &stat_config.percore_show_thread,
>> "Use with 'percore' event qualifier to show the event "
>> "counts of one hardware thread by sum up total hardware "
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 04/10] perf stat: Add --exclude-guest option
2024-09-23 8:47 ` James Clark
@ 2024-09-24 20:21 ` Namhyung Kim
2024-09-25 8:36 ` James Clark
2024-09-25 13:49 ` Liang, Kan
0 siblings, 2 replies; 23+ messages in thread
From: Namhyung Kim @ 2024-09-24 20:21 UTC (permalink / raw)
To: James Clark
Cc: Liang, Kan, Arnaldo Carvalho de Melo, Ian Rogers, Jiri Olsa,
Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang
Hello,
On Mon, Sep 23, 2024 at 09:47:17AM +0100, James Clark wrote:
>
>
> On 06/09/2024 3:33 pm, Liang, Kan wrote:
> >
> >
> > On 2024-09-05 4:24 p.m., Namhyung Kim wrote:
> > > This option is to support the old behavior of setting exclude_guest by
> > > default. Now it doesn't set the bit so users want the old behavior can
> > > use this option.
> > >
> > > $ perf stat true
> > >
> > > Performance counter stats for 'true':
> > >
> > > 0.86 msec task-clock:u # 0.443 CPUs utilized
> > > 0 context-switches:u # 0.000 /sec
> > > 0 cpu-migrations:u # 0.000 /sec
> > > 49 page-faults:u # 56.889 K/sec
> > > ...
> > >
> > > $ perf stat --exclude-guest true
> > >
> > > Performance counter stats for 'true':
> > >
> > > 0.79 msec task-clock:Hu # 0.490 CPUs utilized
> > > 0 context-switches:Hu # 0.000 /sec
> > > 0 cpu-migrations:Hu # 0.000 /sec
> > > 49 page-faults:Hu # 62.078 K/sec
> > > ...
> > >
> > > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > > ---
> > > tools/perf/Documentation/perf-stat.txt | 7 +++++++
> > > tools/perf/builtin-stat.c | 2 ++
> > > 2 files changed, 9 insertions(+)
> > >
> > > diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
> > > index 2bc06367248691dd..d28d8370a856598f 100644
> > > --- a/tools/perf/Documentation/perf-stat.txt
> > > +++ b/tools/perf/Documentation/perf-stat.txt
> > > @@ -382,6 +382,13 @@ color the metric's computed value.
> > > Don't print output, warnings or messages. This is useful with perf stat
> > > record below to only write data to the perf.data file.
> > > +--exclude-guest::
> > > +Don't count event in the guest mode. It was the old behavior but the
> > > +default is changed to count guest events also. Use this option if you
> > > +want the old behavior (host only). Note that this option needs to be
> > > +before other events in case you added -e/--event option in the command
> > > +line.
> >
> > I'm not sure if we really need this option. I think it may bring more
> > trouble than what we get.
> >
> > The name of the "--exclude-guest" sounds like a replacement of the event
> > modifier "H". But in fact, it's not. It should only affect the default.
> > It doesn't set the "H" for any events.
Well I think it's tricky but it'd set "H" modifier events after the
option. But I have to agree that it can bring more troubles.
> >
> > Except for the perf kvm user, I don't think there are many users which
> > care the exclude_guest. The behavior of the perf kvm is not changed. So
> > the option seems not that important. If we really want an option to
> > restore the old behavior, it's better to choose a better name and update
> > the description.
Personally I don't want to this option but just worried if there's a
case where exclude_guest is preferred.
>
> Do we not want to keep exclude_guest for record, but remove it for stat?
What I really want is to synchronize record and stat behavior in terms
of the default exclusion mode. If everyone is fine, I'd like to remove
exclude_guest from the default and set it only if needed (through the
fallback) like we do for exclude_kernel.
>
> Because in record the addresses of guest samples don't make sense without
> extra work, but for stat you might want to see an overview of the whole
> system.
I think it depends on the use case and (power) users should know about
their environment and requirement. The concern is what's the reasonable
default, but I think it should be the same both in perf record and stat
at least.
>
> For Coresight tracing and SPE we would want to keep exclude_guest, otherwise
> you generate a load of extra trace that you can't make use of. Say you were
> doing PGO on your host you wouldn't be recompiling anything the guests were
> running.
For the specific use case, I think we can guide users to add "H"
modifier (I guess it's not the default event for perf record).
Maybe we can consider per-PMU default attributes.
>
> If we do change the defaults isn't ':H' already enough to go back to the old
> behavior? I'm wondering why we need an argument when all the other exclude
> rules are done with the letter modifiers?
I'm not sure I follow this. But maybe we don't need this option at all.
We can add ":H" for every event but I'm too lazy to add them to all the
default events in perf data. :)
Thanks,
Namhyung
> > > +
> > > STAT RECORD
> > > -----------
> > > Stores stat data into perf data file.
> > > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> > > index d8315dae930184ba..4d47675af5cc3094 100644
> > > --- a/tools/perf/builtin-stat.c
> > > +++ b/tools/perf/builtin-stat.c
> > > @@ -2491,6 +2491,8 @@ int cmd_stat(int argc, const char **argv)
> > > OPT_BOOLEAN_FLAG(0, "all-user", &stat_config.all_user,
> > > "Configure all used events to run in user space.",
> > > PARSE_OPT_EXCLUSIVE),
> > > + OPT_BOOLEAN(0, "exclude-guest", &exclude_HG_default,
> > > + "Don't count events in the guest mode"),
> > > OPT_BOOLEAN(0, "percore-show-thread", &stat_config.percore_show_thread,
> > > "Use with 'percore' event qualifier to show the event "
> > > "counts of one hardware thread by sum up total hardware "
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 04/10] perf stat: Add --exclude-guest option
2024-09-24 20:21 ` Namhyung Kim
@ 2024-09-25 8:36 ` James Clark
2024-09-30 20:13 ` Namhyung Kim
2024-09-25 13:49 ` Liang, Kan
1 sibling, 1 reply; 23+ messages in thread
From: James Clark @ 2024-09-25 8:36 UTC (permalink / raw)
To: Namhyung Kim
Cc: Liang, Kan, Arnaldo Carvalho de Melo, Ian Rogers, Jiri Olsa,
Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang
On 24/09/2024 9:21 pm, Namhyung Kim wrote:
> Hello,
>
> On Mon, Sep 23, 2024 at 09:47:17AM +0100, James Clark wrote:
>>
>>
>> On 06/09/2024 3:33 pm, Liang, Kan wrote:
>>>
>>>
>>> On 2024-09-05 4:24 p.m., Namhyung Kim wrote:
>>>> This option is to support the old behavior of setting exclude_guest by
>>>> default. Now it doesn't set the bit so users want the old behavior can
>>>> use this option.
>>>>
>>>> $ perf stat true
>>>>
>>>> Performance counter stats for 'true':
>>>>
>>>> 0.86 msec task-clock:u # 0.443 CPUs utilized
>>>> 0 context-switches:u # 0.000 /sec
>>>> 0 cpu-migrations:u # 0.000 /sec
>>>> 49 page-faults:u # 56.889 K/sec
>>>> ...
>>>>
>>>> $ perf stat --exclude-guest true
>>>>
>>>> Performance counter stats for 'true':
>>>>
>>>> 0.79 msec task-clock:Hu # 0.490 CPUs utilized
>>>> 0 context-switches:Hu # 0.000 /sec
>>>> 0 cpu-migrations:Hu # 0.000 /sec
>>>> 49 page-faults:Hu # 62.078 K/sec
>>>> ...
>>>>
>>>> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
>>>> ---
>>>> tools/perf/Documentation/perf-stat.txt | 7 +++++++
>>>> tools/perf/builtin-stat.c | 2 ++
>>>> 2 files changed, 9 insertions(+)
>>>>
>>>> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
>>>> index 2bc06367248691dd..d28d8370a856598f 100644
>>>> --- a/tools/perf/Documentation/perf-stat.txt
>>>> +++ b/tools/perf/Documentation/perf-stat.txt
>>>> @@ -382,6 +382,13 @@ color the metric's computed value.
>>>> Don't print output, warnings or messages. This is useful with perf stat
>>>> record below to only write data to the perf.data file.
>>>> +--exclude-guest::
>>>> +Don't count event in the guest mode. It was the old behavior but the
>>>> +default is changed to count guest events also. Use this option if you
>>>> +want the old behavior (host only). Note that this option needs to be
>>>> +before other events in case you added -e/--event option in the command
>>>> +line.
>>>
>>> I'm not sure if we really need this option. I think it may bring more
>>> trouble than what we get.
>>>
>>> The name of the "--exclude-guest" sounds like a replacement of the event
>>> modifier "H". But in fact, it's not. It should only affect the default.
>>> It doesn't set the "H" for any events.
>
> Well I think it's tricky but it'd set "H" modifier events after the
> option. But I have to agree that it can bring more troubles.
>
>>>
>>> Except for the perf kvm user, I don't think there are many users which
>>> care the exclude_guest. The behavior of the perf kvm is not changed. So
>>> the option seems not that important. If we really want an option to
>>> restore the old behavior, it's better to choose a better name and update
>>> the description.
>
> Personally I don't want to this option but just worried if there's a
> case where exclude_guest is preferred.
>
>>
>> Do we not want to keep exclude_guest for record, but remove it for stat?
>
> What I really want is to synchronize record and stat behavior in terms
> of the default exclusion mode. If everyone is fine, I'd like to remove
> exclude_guest from the default and set it only if needed (through the
> fallback) like we do for exclude_kernel.
>
>>
>> Because in record the addresses of guest samples don't make sense without
>> extra work, but for stat you might want to see an overview of the whole
>> system.
>
> I think it depends on the use case and (power) users should know about
> their environment and requirement. The concern is what's the reasonable
> default, but I think it should be the same both in perf record and stat
> at least.
>
>>
>> For Coresight tracing and SPE we would want to keep exclude_guest, otherwise
>> you generate a load of extra trace that you can't make use of. Say you were
>> doing PGO on your host you wouldn't be recompiling anything the guests were
>> running.
>
> For the specific use case, I think we can guide users to add "H"
> modifier (I guess it's not the default event for perf record).
> Maybe we can consider per-PMU default attributes.
>
Yeah I was thinking about adding it as a default when the Coresight and
SPE events are configured, but I think it's too much to expect for
people to know all the edge cases about what the per PMU defaults will
be. Having one default for all PMUs makes more sense to me.
I suppose asking Coresight users to add :H when they need it might be a
less bad thing than this change to clean up the fragmentation in the
defaults.
I expect most of the time if :H is not added things continue to work,
just that more data will be recorded. One real concern is that in some
configurations there is a limited buffer size and once it's filled you
don't get any more data. If that buffer is filled by guest trace then
maybe some Auto FDO flow could break. But it's kind of an edge case of
an edge case and adding :H isn't really the end of the world.
>>
>> If we do change the defaults isn't ':H' already enough to go back to the old
>> behavior? I'm wondering why we need an argument when all the other exclude
>> rules are done with the letter modifiers?
>
> I'm not sure I follow this. But maybe we don't need this option at all.
> We can add ":H" for every event but I'm too lazy to add them to all the
> default events in perf data. :)
>
> Thanks,
> Namhyung
>
Oh I see yes, the argument is a shortcut to adding H on all events
and/or the default ones. I wasn't sure if was added because it couldn't
be done with :H, but I see it's an alternative instead. So adding it
makes sense.
>>>> +
>>>> STAT RECORD
>>>> -----------
>>>> Stores stat data into perf data file.
>>>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>>>> index d8315dae930184ba..4d47675af5cc3094 100644
>>>> --- a/tools/perf/builtin-stat.c
>>>> +++ b/tools/perf/builtin-stat.c
>>>> @@ -2491,6 +2491,8 @@ int cmd_stat(int argc, const char **argv)
>>>> OPT_BOOLEAN_FLAG(0, "all-user", &stat_config.all_user,
>>>> "Configure all used events to run in user space.",
>>>> PARSE_OPT_EXCLUSIVE),
>>>> + OPT_BOOLEAN(0, "exclude-guest", &exclude_HG_default,
>>>> + "Don't count events in the guest mode"),
Maybe this would be a good hint if it really is equivalent:
+ "Don't count events in the guest mode. (Add :H to all events)"),
>>>> OPT_BOOLEAN(0, "percore-show-thread", &stat_config.percore_show_thread,
>>>> "Use with 'percore' event qualifier to show the event "
>>>> "counts of one hardware thread by sum up total hardware "
>>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 04/10] perf stat: Add --exclude-guest option
2024-09-24 20:21 ` Namhyung Kim
2024-09-25 8:36 ` James Clark
@ 2024-09-25 13:49 ` Liang, Kan
2024-09-30 20:07 ` Namhyung Kim
1 sibling, 1 reply; 23+ messages in thread
From: Liang, Kan @ 2024-09-25 13:49 UTC (permalink / raw)
To: Namhyung Kim, James Clark
Cc: Arnaldo Carvalho de Melo, Ian Rogers, Jiri Olsa, Adrian Hunter,
Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users,
Ravi Bangoria, Mark Rutland, James Clark, Kajol Jain,
Thomas Richter, Atish Patra, Palmer Dabbelt, Mingwei Zhang
On 2024-09-24 4:21 p.m., Namhyung Kim wrote:
> On Mon, Sep 23, 2024 at 09:47:17AM +0100, James Clark wrote:
>>
>> On 06/09/2024 3:33 pm, Liang, Kan wrote:
>>>
>>> On 2024-09-05 4:24 p.m., Namhyung Kim wrote:
>>>> This option is to support the old behavior of setting exclude_guest by
>>>> default. Now it doesn't set the bit so users want the old behavior can
>>>> use this option.
>>>>
>>>> $ perf stat true
>>>>
>>>> Performance counter stats for 'true':
>>>>
>>>> 0.86 msec task-clock:u # 0.443 CPUs utilized
>>>> 0 context-switches:u # 0.000 /sec
>>>> 0 cpu-migrations:u # 0.000 /sec
>>>> 49 page-faults:u # 56.889 K/sec
>>>> ...
>>>>
>>>> $ perf stat --exclude-guest true
>>>>
>>>> Performance counter stats for 'true':
>>>>
>>>> 0.79 msec task-clock:Hu # 0.490 CPUs utilized
>>>> 0 context-switches:Hu # 0.000 /sec
>>>> 0 cpu-migrations:Hu # 0.000 /sec
>>>> 49 page-faults:Hu # 62.078 K/sec
>>>> ...
>>>>
>>>> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
>>>> ---
>>>> tools/perf/Documentation/perf-stat.txt | 7 +++++++
>>>> tools/perf/builtin-stat.c | 2 ++
>>>> 2 files changed, 9 insertions(+)
>>>>
>>>> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
>>>> index 2bc06367248691dd..d28d8370a856598f 100644
>>>> --- a/tools/perf/Documentation/perf-stat.txt
>>>> +++ b/tools/perf/Documentation/perf-stat.txt
>>>> @@ -382,6 +382,13 @@ color the metric's computed value.
>>>> Don't print output, warnings or messages. This is useful with perf stat
>>>> record below to only write data to the perf.data file.
>>>> +--exclude-guest::
>>>> +Don't count event in the guest mode. It was the old behavior but the
>>>> +default is changed to count guest events also. Use this option if you
>>>> +want the old behavior (host only). Note that this option needs to be
>>>> +before other events in case you added -e/--event option in the command
>>>> +line.
>>> I'm not sure if we really need this option. I think it may bring more
>>> trouble than what we get.
>>>
>>> The name of the "--exclude-guest" sounds like a replacement of the event
>>> modifier "H". But in fact, it's not. It should only affect the default.
>>> It doesn't set the "H" for any events.
> Well I think it's tricky but it'd set "H" modifier events after the
> option. But I have to agree that it can bring more troubles.
I may have miss-read something before. After some simple tests, yes, the
"H" is applied with the option.
Since there is a limit for the "--exclude-guest" option, can we print a
warning if the option becomes invalid because of the order?
>
>>> Except for the perf kvm user, I don't think there are many users which
>>> care the exclude_guest. The behavior of the perf kvm is not changed. So
>>> the option seems not that important. If we really want an option to
>>> restore the old behavior, it's better to choose a better name and update
>>> the description.
> Personally I don't want to this option but just worried if there's a
> case where exclude_guest is preferred.
The only case I can imagine is that, with the new vPMU passthrough
introduced, some users may want to explicitly set the exclude_guest to
avoid the fallback. But I'm not sure how useful it is for them.
Thanks,
Kan
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 04/10] perf stat: Add --exclude-guest option
2024-09-25 13:49 ` Liang, Kan
@ 2024-09-30 20:07 ` Namhyung Kim
0 siblings, 0 replies; 23+ messages in thread
From: Namhyung Kim @ 2024-09-30 20:07 UTC (permalink / raw)
To: Liang, Kan
Cc: James Clark, Arnaldo Carvalho de Melo, Ian Rogers, Jiri Olsa,
Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang
On Wed, Sep 25, 2024 at 09:49:14AM -0400, Liang, Kan wrote:
>
>
> On 2024-09-24 4:21 p.m., Namhyung Kim wrote:
> > On Mon, Sep 23, 2024 at 09:47:17AM +0100, James Clark wrote:
> >>
> >> On 06/09/2024 3:33 pm, Liang, Kan wrote:
> >>>
> >>> On 2024-09-05 4:24 p.m., Namhyung Kim wrote:
> >>>> This option is to support the old behavior of setting exclude_guest by
> >>>> default. Now it doesn't set the bit so users want the old behavior can
> >>>> use this option.
> >>>>
> >>>> $ perf stat true
> >>>>
> >>>> Performance counter stats for 'true':
> >>>>
> >>>> 0.86 msec task-clock:u # 0.443 CPUs utilized
> >>>> 0 context-switches:u # 0.000 /sec
> >>>> 0 cpu-migrations:u # 0.000 /sec
> >>>> 49 page-faults:u # 56.889 K/sec
> >>>> ...
> >>>>
> >>>> $ perf stat --exclude-guest true
> >>>>
> >>>> Performance counter stats for 'true':
> >>>>
> >>>> 0.79 msec task-clock:Hu # 0.490 CPUs utilized
> >>>> 0 context-switches:Hu # 0.000 /sec
> >>>> 0 cpu-migrations:Hu # 0.000 /sec
> >>>> 49 page-faults:Hu # 62.078 K/sec
> >>>> ...
> >>>>
> >>>> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> >>>> ---
> >>>> tools/perf/Documentation/perf-stat.txt | 7 +++++++
> >>>> tools/perf/builtin-stat.c | 2 ++
> >>>> 2 files changed, 9 insertions(+)
> >>>>
> >>>> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
> >>>> index 2bc06367248691dd..d28d8370a856598f 100644
> >>>> --- a/tools/perf/Documentation/perf-stat.txt
> >>>> +++ b/tools/perf/Documentation/perf-stat.txt
> >>>> @@ -382,6 +382,13 @@ color the metric's computed value.
> >>>> Don't print output, warnings or messages. This is useful with perf stat
> >>>> record below to only write data to the perf.data file.
> >>>> +--exclude-guest::
> >>>> +Don't count event in the guest mode. It was the old behavior but the
> >>>> +default is changed to count guest events also. Use this option if you
> >>>> +want the old behavior (host only). Note that this option needs to be
> >>>> +before other events in case you added -e/--event option in the command
> >>>> +line.
> >>> I'm not sure if we really need this option. I think it may bring more
> >>> trouble than what we get.
> >>>
> >>> The name of the "--exclude-guest" sounds like a replacement of the event
> >>> modifier "H". But in fact, it's not. It should only affect the default.
> >>> It doesn't set the "H" for any events.
> > Well I think it's tricky but it'd set "H" modifier events after the
> > option. But I have to agree that it can bring more troubles.
>
> I may have miss-read something before. After some simple tests, yes, the
> "H" is applied with the option.
Ok.
>
> Since there is a limit for the "--exclude-guest" option, can we print a
> warning if the option becomes invalid because of the order?
Well.. I'm thinking of removing this option for now.
>
> >
> >>> Except for the perf kvm user, I don't think there are many users which
> >>> care the exclude_guest. The behavior of the perf kvm is not changed. So
> >>> the option seems not that important. If we really want an option to
> >>> restore the old behavior, it's better to choose a better name and update
> >>> the description.
> > Personally I don't want to this option but just worried if there's a
> > case where exclude_guest is preferred.
>
> The only case I can imagine is that, with the new vPMU passthrough
> introduced, some users may want to explicitly set the exclude_guest to
> avoid the fallback. But I'm not sure how useful it is for them.
Because of overhead? They'll get exclude_guest eventually, right?
So I think I can drop this patch for now. And consider this later when
we can find a real usecase.
Thanks,
Namhyung
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 04/10] perf stat: Add --exclude-guest option
2024-09-25 8:36 ` James Clark
@ 2024-09-30 20:13 ` Namhyung Kim
0 siblings, 0 replies; 23+ messages in thread
From: Namhyung Kim @ 2024-09-30 20:13 UTC (permalink / raw)
To: James Clark
Cc: Liang, Kan, Arnaldo Carvalho de Melo, Ian Rogers, Jiri Olsa,
Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
linux-perf-users, Ravi Bangoria, Mark Rutland, James Clark,
Kajol Jain, Thomas Richter, Atish Patra, Palmer Dabbelt,
Mingwei Zhang
On Wed, Sep 25, 2024 at 09:36:15AM +0100, James Clark wrote:
>
>
> On 24/09/2024 9:21 pm, Namhyung Kim wrote:
> > Hello,
> >
> > On Mon, Sep 23, 2024 at 09:47:17AM +0100, James Clark wrote:
> > > For Coresight tracing and SPE we would want to keep exclude_guest, otherwise
> > > you generate a load of extra trace that you can't make use of. Say you were
> > > doing PGO on your host you wouldn't be recompiling anything the guests were
> > > running.
> >
> > For the specific use case, I think we can guide users to add "H"
> > modifier (I guess it's not the default event for perf record).
> > Maybe we can consider per-PMU default attributes.
> >
>
> Yeah I was thinking about adding it as a default when the Coresight and SPE
> events are configured, but I think it's too much to expect for people to
> know all the edge cases about what the per PMU defaults will be. Having one
> default for all PMUs makes more sense to me.
>
> I suppose asking Coresight users to add :H when they need it might be a less
> bad thing than this change to clean up the fragmentation in the defaults.
>
> I expect most of the time if :H is not added things continue to work, just
> that more data will be recorded. One real concern is that in some
> configurations there is a limited buffer size and once it's filled you don't
> get any more data. If that buffer is filled by guest trace then maybe some
> Auto FDO flow could break. But it's kind of an edge case of an edge case and
> adding :H isn't really the end of the world.
Ok, let's go with simple then.
>
> > >
> > > If we do change the defaults isn't ':H' already enough to go back to the old
> > > behavior? I'm wondering why we need an argument when all the other exclude
> > > rules are done with the letter modifiers?
> >
> > I'm not sure I follow this. But maybe we don't need this option at all.
> > We can add ":H" for every event but I'm too lazy to add them to all the
> > default events in perf data. :)
> >
> > Thanks,
> > Namhyung
>
> Oh I see yes, the argument is a shortcut to adding H on all events and/or
> the default ones. I wasn't sure if was added because it couldn't be done
> with :H, but I see it's an alternative instead. So adding it makes sense.
Yeah, but the current behavior is somewhat misleading so I'll drop it
now. If it turns out that we need something like this, I think we can
add more explicit options like --all-host and --all-guest later.
Thanks,
Namhyung
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 01/10] perf tools: Add fallback for exclude_guest
2024-09-06 13:47 ` Liang, Kan
@ 2024-09-30 20:36 ` Namhyung Kim
0 siblings, 0 replies; 23+ messages in thread
From: Namhyung Kim @ 2024-09-30 20:36 UTC (permalink / raw)
To: Liang, Kan
Cc: Arnaldo Carvalho de Melo, Ian Rogers, Jiri Olsa, Adrian Hunter,
Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users,
Ravi Bangoria, Mark Rutland, James Clark, Kajol Jain,
Thomas Richter, Atish Patra, Palmer Dabbelt, Mingwei Zhang,
James Clark
On Fri, Sep 06, 2024 at 09:47:53AM -0400, Liang, Kan wrote:
>
>
> On 2024-09-05 4:24 p.m., Namhyung Kim wrote:
> > Commit 7b100989b4f6bce70 ("perf evlist: Remove __evlist__add_default")
> > changed to parse "cycles:P" event instead of creating a new cycles
> > event for perf record. But it also changed the way how modifiers are
> > handled so it doesn't set the exclude_guest bit by default.
> >
> > It seems Apple M1 PMU requires exclude_guest set and returns EOPNOTSUPP
> > if not. Let's add a fallback so that it can work with default events.
> >
> > Fixes: 7b100989b4f6bce70 ("perf evlist: Remove __evlist__add_default")
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: James Clark <james.clark@linaro.org>
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> > tools/perf/builtin-stat.c | 3 +--
> > tools/perf/util/evsel.c | 21 +++++++++++++++++++++
> > 2 files changed, 22 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> > index cf985cdb9a6ee588..d8315dae930184ba 100644
> > --- a/tools/perf/builtin-stat.c
> > +++ b/tools/perf/builtin-stat.c
> > @@ -639,8 +639,7 @@ static enum counter_recovery stat_handle_error(struct evsel *counter)
> > * (behavior changed with commit b0a873e).
> > */
> > if (errno == EINVAL || errno == ENOSYS ||
> > - errno == ENOENT || errno == EOPNOTSUPP ||
> > - errno == ENXIO) {
> > + errno == ENOENT || errno == ENXIO) {
> > if (verbose > 0)
> > ui__warning("%s event is not supported by the kernel.\n",
> > evsel__name(counter));
>
> It seems the behavior for other reasons which trigger the 'EOPNOTSUPP'
> is changed as well.
> At least, it looks like we don't skip the member event with EOPNOTSUPP
> anymore.
>
> I'm not sure if it's a big deal. But I think we'd better mention it in
> the change log or the comments.
Yeah I think it should handle EOPNOTSUPP at the end of the function to
maintain the behavior. Still it's not exactly the same but I think the
skippable case is ok. Thanks for pointing this out.
Thanks,
Namhyung
>
> > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > index 49cc71511c0c8ce8..d59ad76b28758906 100644
> > --- a/tools/perf/util/evsel.c
> > +++ b/tools/perf/util/evsel.c
> > @@ -3244,6 +3244,27 @@ bool evsel__fallback(struct evsel *evsel, struct target *target, int err,
> > evsel->core.attr.exclude_kernel = 1;
> > evsel->core.attr.exclude_hv = 1;
> >
> > + return true;
> > + } else if (err == EOPNOTSUPP && !evsel->core.attr.exclude_guest &&
> > + !evsel->exclude_GH) {
> > + const char *name = evsel__name(evsel);
> > + char *new_name;
> > + const char *sep = ":";
> > +
> > + /* Is there already the separator in the name. */
> > + if (strchr(name, '/') ||
> > + (strchr(name, ':') && !evsel->is_libpfm_event))
> > + sep = "";
> > +
> > + if (asprintf(&new_name, "%s%sH", name, sep) < 0)
> > + return false;
> > +
> > + free(evsel->name);
> > + evsel->name = new_name;
> > + /* Apple M1 requires exclude_guest */
> > + scnprintf(msg, msgsize, "trying to fall back to excluding guest samples");
> > + evsel->core.attr.exclude_guest = 1;
> > +
> > return true;
> > }
> >
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 07/10] perf tools: Separate exclude_hv fallback
2024-09-06 15:21 ` Liang, Kan
@ 2024-09-30 20:37 ` Namhyung Kim
0 siblings, 0 replies; 23+ messages in thread
From: Namhyung Kim @ 2024-09-30 20:37 UTC (permalink / raw)
To: Liang, Kan
Cc: Arnaldo Carvalho de Melo, Ian Rogers, Jiri Olsa, Adrian Hunter,
Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users,
Ravi Bangoria, Mark Rutland, James Clark, Kajol Jain,
Thomas Richter, Atish Patra, Palmer Dabbelt, Mingwei Zhang
On Fri, Sep 06, 2024 at 11:21:21AM -0400, Liang, Kan wrote:
>
>
> On 2024-09-05 4:24 p.m., Namhyung Kim wrote:
> > The exclude_hv was added in the evsel__fallback() in the commit
> > 4ec8d984895fef43a ("perf record: Fix priv level with branch sampling
> > for paranoid=2") to address branch stack samples on Intel PMUs.
> > As some other PMUs might not support that, let's separate the bit from
> > exclude_kernel to make sure it can add the bit only if required.
> >
> > Technically it should change the modifier string at the end of the
> > event name. ":u" is for exclude_kernel + exclude_hv, so it should be
> > ":uh" if it has exclude_kernel only. That means the default events for
> > regular users will looks like "cycles:Puh" (for perf record) or
> > "instructions:uh" (for perf stat). But I'm not sure if it's worth the
> > trouble so I didn't touch the name in this patch.
> >
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> > tools/perf/util/evsel.c | 16 +++++++++++++---
> > 1 file changed, 13 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > index 1a4f52767942e5ad..c5df45bb74dfc1b5 100644
> > --- a/tools/perf/util/evsel.c
> > +++ b/tools/perf/util/evsel.c
> > @@ -3389,10 +3389,20 @@ bool evsel__fallback(struct evsel *evsel, struct target *target, int err,
> > free(evsel->name);
> > evsel->name = new_name;
> > scnprintf(msg, msgsize, "kernel.perf_event_paranoid=%d, trying "
> > - "to fall back to excluding kernel and hypervisor "
> > - " samples", paranoid);
> > + "to fall back to excluding kernel samples", paranoid);
> > evsel->core.attr.exclude_kernel = 1;
> > - evsel->core.attr.exclude_hv = 1;
> > +
> > + return true;
> > + } else if (err == EACCES && !evsel->core.attr.exclude_hv &&
> > + (paranoid = perf_event_paranoid()) > 1) {
> > + /* If event has exclude user then don't exclude hv. */
> > + if (evsel->core.attr.exclude_user)
> > + return false;
> > +
> > + /* Intel branch stack requires exclude_hv */
>
> I don't think it's an requirement for Intel branch stack. The HV is
> ignored for all X86.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/events/core.c#n542
>
> I think it's a generic request for branch on all arch.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/events/core.c#n366
Ok, I'll update the comment.
Thanks,
Namhyung
>
> > + scnprintf(msg, msgsize, "kernel.perf_event_paranoid=%d, trying "
> > + "to fall back to excluding hypervisor samples", paranoid);
> > + evsel->core.attr.exclude_hv = 1;
> >
> > return true;
> > } else if (err == EOPNOTSUPP && !evsel->core.attr.exclude_guest &&
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2024-09-30 20:37 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-05 20:24 [RFC/PATCHSET 00/10] perf tools: Do not set attr.exclude_guest by default (v3) Namhyung Kim
2024-09-05 20:24 ` [PATCH 01/10] perf tools: Add fallback for exclude_guest Namhyung Kim
2024-09-06 13:47 ` Liang, Kan
2024-09-30 20:36 ` Namhyung Kim
2024-09-05 20:24 ` [PATCH 02/10] perf tools: Don't set attr.exclude_guest by default Namhyung Kim
2024-09-06 14:10 ` Liang, Kan
2024-09-05 20:24 ` [PATCH 03/10] perf tools: Simplify evsel__add_modifier() Namhyung Kim
2024-09-05 20:24 ` [PATCH 04/10] perf stat: Add --exclude-guest option Namhyung Kim
2024-09-06 14:33 ` Liang, Kan
2024-09-23 8:47 ` James Clark
2024-09-24 20:21 ` Namhyung Kim
2024-09-25 8:36 ` James Clark
2024-09-30 20:13 ` Namhyung Kim
2024-09-25 13:49 ` Liang, Kan
2024-09-30 20:07 ` Namhyung Kim
2024-09-05 20:24 ` [PATCH 05/10] perf tools: Do not set exclude_guest for precise_ip Namhyung Kim
2024-09-05 20:24 ` [PATCH 06/10] perf tools: Detect missing kernel features properly Namhyung Kim
2024-09-05 20:24 ` [PATCH 07/10] perf tools: Separate exclude_hv fallback Namhyung Kim
2024-09-06 15:21 ` Liang, Kan
2024-09-30 20:37 ` Namhyung Kim
2024-09-05 20:24 ` [PATCH 08/10] perf tools: Move x86__is_amd_cpu() to util/env.c Namhyung Kim
2024-09-05 20:24 ` [PATCH 09/10] perf tools: Check fallback error and order Namhyung Kim
2024-09-05 20:24 ` [PATCH 10/10] perf record: Just use "cycles:P" as the default event Namhyung Kim
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).