* [PATCH v10 0/2] tools/perf: Support PERF_SAMPLE_READ with inherit
@ 2024-08-01 12:30 Ben Gainey
2024-08-01 12:30 ` [PATCH v10 1/2] tools/perf: Correctly calculate sample period for inherited SAMPLE_READ values Ben Gainey
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Ben Gainey @ 2024-08-01 12:30 UTC (permalink / raw)
To: peterz, mingo, acme, namhyung
Cc: james.clark, mark.rutland, alexander.shishkin, jolsa, irogers,
adrian.hunter, linux-perf-users, linux-kernel, Ben Gainey
This revision of this change splits out the tools/perf changes requested
by Namhung King for my previous work to enable PERF_SAMPLE READ with inherit (see
https://lore.kernel.org/linux-perf-users/20240730084417.7693-1-ben.gainey@arm.com/ )
as the kernel side changes have been picked up by Peter Zijlstra.
Changes since v9:
- Split out tools/perf patches only
- Fixed system-wide mode in `perf record` to not set the inherit bit.
Changes since v8:
- Rebase on v6.11-rc1
Changes since v7:
- Rebase on v6.10-rc3
- Respond to Peter Zijlstra's feedback:
- Renamed nr_pending to nr_no_switch_fast and merged in nr_inherit_read
which otherwise had overlapping use
- Updated some of the commit messages to provide better justifications
of usecase, behavioural changes and so on
- Cleanup perf_event_count/_cumulative
- Make it explicit that the sampling event decides whether or not the
per-thread value is given in read_format for PERF_RECORD_SAMPLE and
PERF_RECORD_READ; updated tools to account for this.
Changes since v6:
- Rebase on v6.10-rc2
- Make additional "perf test" tests succeed / skip based on kernel
version as per feedback from Namhyung.
Changes since v5:
- Rebase on v6.9
- Cleanup feedback from Namhyung Kim
Changes since v4:
- Rebase on v6.9-rc1
- Removed the dependency on inherit_stat that was previously assumed
necessary as per feedback from Namhyung Kim.
- Fixed an incorrect use of zfree instead of free in the tools leading
to an abort on tool shutdown.
- Additional test coverage improvements added to perf test.
- Cleaned up the remaining bit of irrelevant change missed between v3
and v4.
Changes since v3:
- Cleaned up perf test data changes incorrectly included into this
series from elsewhere.
Changes since v2:
- Rebase on v6.8
- Respond to James Clarke's feedback; fixup some typos and move some
repeated checks into a helper macro.
- Cleaned up checkpatch lints.
- Updated perf test; fixed evsel handling so that existing tests pass
and added new tests to cover the new behaviour.
Changes since v1:
- Rebase on v6.8-rc1
- Fixed value written into sample after child exists.
- Modified handling of switch-out so that context with these events
take the slow path, so that the per-event/per-thread PMU state is
correctly switched.
- Modified perf tools to support this mode of operation.
Ben Gainey (2):
tools/perf: Correctly calculate sample period for inherited
SAMPLE_READ values
tools/perf: Allow inherit + PERF_SAMPLE_READ when opening events
tools/lib/perf/evsel.c | 48 ++++++++++++++
tools/lib/perf/include/internal/evsel.h | 63 ++++++++++++++++++-
tools/perf/tests/attr/README | 2 +
tools/perf/tests/attr/test-record-C0 | 2 +
tools/perf/tests/attr/test-record-dummy-C0 | 2 +-
.../tests/attr/test-record-group-sampling | 3 +-
.../tests/attr/test-record-group-sampling1 | 51 +++++++++++++++
.../tests/attr/test-record-group-sampling2 | 61 ++++++++++++++++++
tools/perf/tests/attr/test-record-group2 | 1 +
...{test-record-group2 => test-record-group3} | 10 +--
tools/perf/util/evsel.c | 21 ++++++-
tools/perf/util/evsel.h | 1 +
tools/perf/util/session.c | 25 +++++---
13 files changed, 271 insertions(+), 19 deletions(-)
create mode 100644 tools/perf/tests/attr/test-record-group-sampling1
create mode 100644 tools/perf/tests/attr/test-record-group-sampling2
copy tools/perf/tests/attr/{test-record-group2 => test-record-group3} (81%)
--
2.45.2
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v10 1/2] tools/perf: Correctly calculate sample period for inherited SAMPLE_READ values
2024-08-01 12:30 [PATCH v10 0/2] tools/perf: Support PERF_SAMPLE_READ with inherit Ben Gainey
@ 2024-08-01 12:30 ` Ben Gainey
2024-08-02 18:21 ` Namhyung Kim
2024-08-01 12:30 ` [PATCH v10 2/2] tools/perf: Allow inherit + PERF_SAMPLE_READ when opening events Ben Gainey
2024-08-02 18:22 ` [PATCH v10 0/2] tools/perf: Support PERF_SAMPLE_READ with inherit Namhyung Kim
2 siblings, 1 reply; 5+ messages in thread
From: Ben Gainey @ 2024-08-01 12:30 UTC (permalink / raw)
To: peterz, mingo, acme, namhyung
Cc: james.clark, mark.rutland, alexander.shishkin, jolsa, irogers,
adrian.hunter, linux-perf-users, linux-kernel, Ben Gainey
Sample period calculation in deliver_sample_value is updated to
calculate the per-thread period delta for events that are inherit +
PERF_SAMPLE_READ. When the sampling event has this configuration, the
read_format.id is used with the tid from the sample to lookup the
storage of the previously accumulated counter total before calculating
the delta. All existing valid configurations where read_format.value
represents some global value continue to use just the read_format.id to
locate the storage of the previously accumulated total.
perf_sample_id is modified to support tracking per-thread
values, along with the existing global per-id values. In the
per-thread case, values are stored in a hash by tid within the
perf_sample_id, and are dynamically allocated as the number is not known
ahead of time.
Signed-off-by: Ben Gainey <ben.gainey@arm.com>
---
tools/lib/perf/evsel.c | 48 +++++++++++++++++++
tools/lib/perf/include/internal/evsel.h | 63 ++++++++++++++++++++++++-
tools/perf/util/session.c | 25 ++++++----
3 files changed, 126 insertions(+), 10 deletions(-)
diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index c07160953224..abdae2f9498b 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -5,6 +5,7 @@
#include <perf/evsel.h>
#include <perf/cpumap.h>
#include <perf/threadmap.h>
+#include <linux/hash.h>
#include <linux/list.h>
#include <internal/evsel.h>
#include <linux/zalloc.h>
@@ -23,6 +24,7 @@ void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr,
int idx)
{
INIT_LIST_HEAD(&evsel->node);
+ INIT_LIST_HEAD(&evsel->per_stream_periods);
evsel->attr = *attr;
evsel->idx = idx;
evsel->leader = evsel;
@@ -531,10 +533,56 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
void perf_evsel__free_id(struct perf_evsel *evsel)
{
+ struct perf_sample_id_period *pos, *n;
+
xyarray__delete(evsel->sample_id);
evsel->sample_id = NULL;
zfree(&evsel->id);
evsel->ids = 0;
+
+ perf_evsel_for_each_per_thread_period_safe(evsel, n, pos) {
+ list_del_init(&pos->node);
+ free(pos);
+ }
+}
+
+bool perf_evsel__attr_has_per_thread_sample_period(struct perf_evsel *evsel)
+{
+ return (evsel->attr.sample_type & PERF_SAMPLE_READ)
+ && (evsel->attr.sample_type & PERF_SAMPLE_TID)
+ && evsel->attr.inherit;
+}
+
+u64 *perf_sample_id__get_period_storage(struct perf_sample_id *sid, u32 tid, bool per_thread)
+{
+ struct hlist_head *head;
+ struct perf_sample_id_period *res;
+ int hash;
+
+ if (!per_thread)
+ return &sid->period;
+
+ hash = hash_32(tid, PERF_SAMPLE_ID__HLIST_BITS);
+ head = &sid->periods[hash];
+
+ hlist_for_each_entry(res, head, hnode)
+ if (res->tid == tid)
+ return &res->period;
+
+ if (sid->evsel == NULL)
+ return NULL;
+
+ res = zalloc(sizeof(struct perf_sample_id_period));
+ if (res == NULL)
+ return NULL;
+
+ INIT_LIST_HEAD(&res->node);
+ res->tid = tid;
+
+ list_add_tail(&res->node, &sid->evsel->per_stream_periods);
+ hlist_add_head(&res->hnode, &sid->periods[hash]);
+
+ return &res->period;
}
void perf_counts_values__scale(struct perf_counts_values *count,
diff --git a/tools/lib/perf/include/internal/evsel.h b/tools/lib/perf/include/internal/evsel.h
index 5cd220a61962..ea78defa77d0 100644
--- a/tools/lib/perf/include/internal/evsel.h
+++ b/tools/lib/perf/include/internal/evsel.h
@@ -11,6 +11,32 @@
struct perf_thread_map;
struct xyarray;
+/**
+ * The per-thread accumulated period storage node.
+ */
+struct perf_sample_id_period {
+ struct list_head node;
+ struct hlist_node hnode;
+ /* Holds total ID period value for PERF_SAMPLE_READ processing. */
+ u64 period;
+ /* The TID that the values belongs to */
+ u32 tid;
+};
+
+/**
+ * perf_evsel_for_each_per_thread_period_safe - safely iterate thru all the
+ * per_stream_periods
+ * @evlist:perf_evsel instance to iterate
+ * @item: struct perf_sample_id_period iterator
+ * @tmp: struct perf_sample_id_period temp iterator
+ */
+#define perf_evsel_for_each_per_thread_period_safe(evsel, tmp, item) \
+ list_for_each_entry_safe(item, tmp, &(evsel)->per_stream_periods, node)
+
+
+#define PERF_SAMPLE_ID__HLIST_BITS 4
+#define PERF_SAMPLE_ID__HLIST_SIZE (1 << PERF_SAMPLE_ID__HLIST_BITS)
+
/*
* Per fd, to map back from PERF_SAMPLE_ID to evsel, only used when there are
* more than one entry in the evlist.
@@ -34,8 +60,32 @@ struct perf_sample_id {
pid_t machine_pid;
struct perf_cpu vcpu;
- /* Holds total ID period value for PERF_SAMPLE_READ processing. */
- u64 period;
+ /*
+ * Per-thread, and global event counts are mutually exclusive:
+ * Whilst it is possible to combine events into a group with differing
+ * values of PERF_SAMPLE_READ, it is not valid to have inconsistent
+ * values for `inherit`. Therefore it is not possible to have a
+ * situation where a per-thread event is sampled as a global event;
+ * all !inherit groups are global, and all groups where the sampling
+ * event is inherit + PERF_SAMPLE_READ will be per-thread. Any event
+ * that is part of such a group that is inherit but not PERF_SAMPLE_READ
+ * will be read as per-thread. If such an event can also trigger a
+ * sample (such as with sample_period > 0) then it will not cause
+ * `read_format` to be included in its PERF_RECORD_SAMPLE, and
+ * therefore will not expose the per-thread group members as global.
+ */
+ union {
+ /*
+ * Holds total ID period value for PERF_SAMPLE_READ processing
+ * (when period is not per-thread).
+ */
+ u64 period;
+ /*
+ * Holds total ID period value for PERF_SAMPLE_READ processing
+ * (when period is per-thread).
+ */
+ struct hlist_head periods[PERF_SAMPLE_ID__HLIST_SIZE];
+ };
};
struct perf_evsel {
@@ -58,6 +108,10 @@ struct perf_evsel {
u32 ids;
struct perf_evsel *leader;
+ /* For events where the read_format value is per-thread rather than
+ * global, stores the per-thread cumulative period */
+ struct list_head per_stream_periods;
+
/* parse modifier helper */
int nr_members;
/*
@@ -88,4 +142,9 @@ int perf_evsel__apply_filter(struct perf_evsel *evsel, const char *filter);
int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads);
void perf_evsel__free_id(struct perf_evsel *evsel);
+bool perf_evsel__attr_has_per_thread_sample_period(struct perf_evsel *evsel);
+
+u64 *perf_sample_id__get_period_storage(struct perf_sample_id *sid, u32 tid,
+ bool per_thread);
+
#endif /* __LIBPERF_INTERNAL_EVSEL_H */
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 5596bed1b8c8..fac0557ff6ea 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1474,18 +1474,24 @@ static int deliver_sample_value(struct evlist *evlist,
union perf_event *event,
struct perf_sample *sample,
struct sample_read_value *v,
- struct machine *machine)
+ struct machine *machine,
+ bool per_thread)
{
struct perf_sample_id *sid = evlist__id2sid(evlist, v->id);
struct evsel *evsel;
+ u64 *storage = NULL;
if (sid) {
+ storage = perf_sample_id__get_period_storage(sid, sample->tid, per_thread);
+ }
+
+ if (storage) {
sample->id = v->id;
- sample->period = v->value - sid->period;
- sid->period = v->value;
+ sample->period = v->value - *storage;
+ *storage = v->value;
}
- if (!sid || sid->evsel == NULL) {
+ if (!storage || sid->evsel == NULL) {
++evlist->stats.nr_unknown_id;
return 0;
}
@@ -1506,14 +1512,15 @@ static int deliver_sample_group(struct evlist *evlist,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine,
- u64 read_format)
+ u64 read_format,
+ bool per_thread)
{
int ret = -EINVAL;
struct sample_read_value *v = sample->read.group.values;
sample_read_group__for_each(v, sample->read.group.nr, read_format) {
ret = deliver_sample_value(evlist, tool, event, sample, v,
- machine);
+ machine, per_thread);
if (ret)
break;
}
@@ -1528,6 +1535,7 @@ static int evlist__deliver_sample(struct evlist *evlist, struct perf_tool *tool,
/* We know evsel != NULL. */
u64 sample_type = evsel->core.attr.sample_type;
u64 read_format = evsel->core.attr.read_format;
+ bool per_thread = perf_evsel__attr_has_per_thread_sample_period(&evsel->core);
/* Standard sample delivery. */
if (!(sample_type & PERF_SAMPLE_READ))
@@ -1536,10 +1544,11 @@ static int evlist__deliver_sample(struct evlist *evlist, struct perf_tool *tool,
/* For PERF_SAMPLE_READ we have either single or group mode. */
if (read_format & PERF_FORMAT_GROUP)
return deliver_sample_group(evlist, tool, event, sample,
- machine, read_format);
+ machine, read_format, per_thread);
else
return deliver_sample_value(evlist, tool, event, sample,
- &sample->read.one, machine);
+ &sample->read.one, machine,
+ per_thread);
}
static int machines__deliver_event(struct machines *machines,
--
2.45.2
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH v10 2/2] tools/perf: Allow inherit + PERF_SAMPLE_READ when opening events
2024-08-01 12:30 [PATCH v10 0/2] tools/perf: Support PERF_SAMPLE_READ with inherit Ben Gainey
2024-08-01 12:30 ` [PATCH v10 1/2] tools/perf: Correctly calculate sample period for inherited SAMPLE_READ values Ben Gainey
@ 2024-08-01 12:30 ` Ben Gainey
2024-08-02 18:22 ` [PATCH v10 0/2] tools/perf: Support PERF_SAMPLE_READ with inherit Namhyung Kim
2 siblings, 0 replies; 5+ messages in thread
From: Ben Gainey @ 2024-08-01 12:30 UTC (permalink / raw)
To: peterz, mingo, acme, namhyung
Cc: james.clark, mark.rutland, alexander.shishkin, jolsa, irogers,
adrian.hunter, linux-perf-users, linux-kernel, Ben Gainey
The "perf record" tool will now default to this new mode if the user
specifies a sampling group when not in system-wide mode, and when
"--no-inherit" is not specified.
This change updates evsel to allow the combination of inherit
and PERF_SAMPLE_READ.
A fallback is implemented for kernel versions where this feature is not
supported.
Signed-off-by: Ben Gainey <ben.gainey@arm.com>
---
tools/perf/tests/attr/README | 2 +
tools/perf/tests/attr/test-record-C0 | 2 +
tools/perf/tests/attr/test-record-dummy-C0 | 2 +-
.../tests/attr/test-record-group-sampling | 3 +-
.../tests/attr/test-record-group-sampling1 | 51 ++++++++++++++++
.../tests/attr/test-record-group-sampling2 | 61 +++++++++++++++++++
tools/perf/tests/attr/test-record-group2 | 1 +
...{test-record-group2 => test-record-group3} | 10 +--
tools/perf/util/evsel.c | 21 ++++++-
tools/perf/util/evsel.h | 1 +
10 files changed, 145 insertions(+), 9 deletions(-)
create mode 100644 tools/perf/tests/attr/test-record-group-sampling1
create mode 100644 tools/perf/tests/attr/test-record-group-sampling2
copy tools/perf/tests/attr/{test-record-group2 => test-record-group3} (81%)
diff --git a/tools/perf/tests/attr/README b/tools/perf/tests/attr/README
index 4066fec7180a..67c4ca76b85d 100644
--- a/tools/perf/tests/attr/README
+++ b/tools/perf/tests/attr/README
@@ -51,6 +51,8 @@ Following tests are defined (with perf commands):
perf record --call-graph fp kill (test-record-graph-fp-aarch64)
perf record -e '{cycles,instructions}' kill (test-record-group1)
perf record -e '{cycles/period=1/,instructions/period=2/}:S' kill (test-record-group2)
+ perf record -e '{cycles,cache-misses}:S' kill (test-record-group-sampling1)
+ perf record -c 10000 -e '{cycles,cache-misses}:S' kill (test-record-group-sampling2)
perf record -D kill (test-record-no-delay)
perf record -i kill (test-record-no-inherit)
perf record -n kill (test-record-no-samples)
diff --git a/tools/perf/tests/attr/test-record-C0 b/tools/perf/tests/attr/test-record-C0
index 198e8429a1bf..1049ac8b52f2 100644
--- a/tools/perf/tests/attr/test-record-C0
+++ b/tools/perf/tests/attr/test-record-C0
@@ -18,5 +18,7 @@ sample_type=65927
mmap=0
comm=0
task=0
+inherit=0
[event:system-wide-dummy]
+inherit=0
diff --git a/tools/perf/tests/attr/test-record-dummy-C0 b/tools/perf/tests/attr/test-record-dummy-C0
index 576ec48b3aaf..3050298bd614 100644
--- a/tools/perf/tests/attr/test-record-dummy-C0
+++ b/tools/perf/tests/attr/test-record-dummy-C0
@@ -19,7 +19,7 @@ sample_period=4000
sample_type=391
read_format=4|20
disabled=0
-inherit=1
+inherit=0
pinned=0
exclusive=0
exclude_user=0
diff --git a/tools/perf/tests/attr/test-record-group-sampling b/tools/perf/tests/attr/test-record-group-sampling
index 97e7e64a38f0..86a940d7895d 100644
--- a/tools/perf/tests/attr/test-record-group-sampling
+++ b/tools/perf/tests/attr/test-record-group-sampling
@@ -2,6 +2,7 @@
command = record
args = --no-bpf-event -e '{cycles,cache-misses}:S' kill >/dev/null 2>&1
ret = 1
+kernel_until = 6.12
[event-1:base-record]
fd=1
@@ -18,7 +19,7 @@ group_fd=1
type=0
config=3
-# default | PERF_SAMPLE_READ
+# default | PERF_SAMPLE_READ | PERF_SAMPLE_PERIOD
sample_type=343
# PERF_FORMAT_ID | PERF_FORMAT_GROUP | PERF_FORMAT_LOST
diff --git a/tools/perf/tests/attr/test-record-group-sampling1 b/tools/perf/tests/attr/test-record-group-sampling1
new file mode 100644
index 000000000000..e96a10627a46
--- /dev/null
+++ b/tools/perf/tests/attr/test-record-group-sampling1
@@ -0,0 +1,51 @@
+[config]
+command = record
+args = --no-bpf-event -e '{cycles,cache-misses}:S' kill >/dev/null 2>&1
+ret = 1
+kernel_since = 6.12
+
+[event-1:base-record]
+fd=1
+group_fd=-1
+
+# cycles
+type=0
+config=0
+
+# default | PERF_SAMPLE_READ | PERF_SAMPLE_PERIOD
+sample_type=343
+
+# PERF_FORMAT_ID | PERF_FORMAT_GROUP | PERF_FORMAT_LOST | PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING
+read_format=28|31
+task=1
+mmap=1
+comm=1
+enable_on_exec=1
+disabled=1
+
+# inherit is enabled for group sampling
+inherit=1
+
+[event-2:base-record]
+fd=2
+group_fd=1
+
+# cache-misses
+type=0
+config=3
+
+# default | PERF_SAMPLE_READ | PERF_SAMPLE_PERIOD
+sample_type=343
+
+# PERF_FORMAT_ID | PERF_FORMAT_GROUP | PERF_FORMAT_LOST | PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING
+read_format=28|31
+task=0
+mmap=0
+comm=0
+enable_on_exec=0
+disabled=0
+freq=0
+
+# inherit is enabled for group sampling
+inherit=1
+
diff --git a/tools/perf/tests/attr/test-record-group-sampling2 b/tools/perf/tests/attr/test-record-group-sampling2
new file mode 100644
index 000000000000..e0432244a0eb
--- /dev/null
+++ b/tools/perf/tests/attr/test-record-group-sampling2
@@ -0,0 +1,61 @@
+[config]
+command = record
+args = --no-bpf-event -c 10000 -e '{cycles,cache-misses}:S' kill >/dev/null 2>&1
+ret = 1
+kernel_since = 6.12
+
+[event-1:base-record]
+fd=1
+group_fd=-1
+
+# cycles
+type=0
+config=0
+
+# default | PERF_SAMPLE_READ
+sample_type=87
+
+# PERF_FORMAT_ID | PERF_FORMAT_GROUP | PERF_FORMAT_LOST | PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING
+read_format=28|31
+task=1
+mmap=1
+comm=1
+enable_on_exec=1
+disabled=1
+
+# inherit is enabled for group sampling
+inherit=1
+
+# sampling disabled
+sample_freq=0
+sample_period=10000
+freq=0
+write_backward=0
+
+[event-2:base-record]
+fd=2
+group_fd=1
+
+# cache-misses
+type=0
+config=3
+
+# default | PERF_SAMPLE_READ
+sample_type=87
+
+# PERF_FORMAT_ID | PERF_FORMAT_GROUP | PERF_FORMAT_LOST | PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING
+read_format=28|31
+task=0
+mmap=0
+comm=0
+enable_on_exec=0
+disabled=0
+
+# inherit is enabled for group sampling
+inherit=1
+
+# sampling disabled
+sample_freq=0
+sample_period=0
+freq=0
+write_backward=0
diff --git a/tools/perf/tests/attr/test-record-group2 b/tools/perf/tests/attr/test-record-group2
index cebdaa8e64e4..891d41a7bddf 100644
--- a/tools/perf/tests/attr/test-record-group2
+++ b/tools/perf/tests/attr/test-record-group2
@@ -2,6 +2,7 @@
command = record
args = --no-bpf-event -e '{cycles/period=1234000/,instructions/period=6789000/}:S' kill >/dev/null 2>&1
ret = 1
+kernel_until = 6.12
[event-1:base-record]
fd=1
diff --git a/tools/perf/tests/attr/test-record-group2 b/tools/perf/tests/attr/test-record-group3
similarity index 81%
copy from tools/perf/tests/attr/test-record-group2
copy to tools/perf/tests/attr/test-record-group3
index cebdaa8e64e4..249be884959e 100644
--- a/tools/perf/tests/attr/test-record-group2
+++ b/tools/perf/tests/attr/test-record-group3
@@ -2,6 +2,7 @@
command = record
args = --no-bpf-event -e '{cycles/period=1234000/,instructions/period=6789000/}:S' kill >/dev/null 2>&1
ret = 1
+kernel_since = 6.12
[event-1:base-record]
fd=1
@@ -9,8 +10,9 @@ group_fd=-1
config=0|1
sample_period=1234000
sample_type=87
-read_format=12|28
-inherit=0
+read_format=28|31
+disabled=1
+inherit=1
freq=0
[event-2:base-record]
@@ -19,9 +21,9 @@ group_fd=1
config=0|1
sample_period=6789000
sample_type=87
-read_format=12|28
+read_format=28|31
disabled=0
-inherit=0
+inherit=1
mmap=0
comm=0
freq=0
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index bc603193c477..d12e1b16313d 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1149,7 +1149,7 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
bool per_cpu = opts->target.default_per_cpu && !opts->target.per_thread;
attr->sample_id_all = perf_missing_features.sample_id_all ? 0 : 1;
- attr->inherit = !opts->no_inherit;
+ attr->inherit = target__has_cpu(&opts->target) ? 0 : !opts->no_inherit;
attr->write_backward = opts->overwrite ? 1 : 0;
attr->read_format = PERF_FORMAT_LOST;
@@ -1171,7 +1171,15 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
*/
if (leader->core.nr_members > 1) {
attr->read_format |= PERF_FORMAT_GROUP;
- attr->inherit = 0;
+ }
+
+ /*
+ * Inherit + SAMPLE_READ requires SAMPLE_TID in the read_format
+ */
+ if (attr->inherit) {
+ evsel__set_sample_bit(evsel, TID);
+ evsel->core.attr.read_format |=
+ PERF_FORMAT_ID;
}
}
@@ -2020,6 +2028,8 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
static void evsel__disable_missing_features(struct evsel *evsel)
{
+ if (perf_missing_features.inherit_sample_read)
+ evsel->core.attr.inherit = 0;
if (perf_missing_features.branch_counters)
evsel->core.attr.branch_sample_type &= ~PERF_SAMPLE_BRANCH_COUNTERS;
if (perf_missing_features.read_lost)
@@ -2075,7 +2085,12 @@ bool evsel__detect_missing_features(struct evsel *evsel)
* Must probe features in the order they were added to the
* perf_event_attr interface.
*/
- if (!perf_missing_features.branch_counters &&
+ if (!perf_missing_features.inherit_sample_read &&
+ evsel->core.attr.inherit && (evsel->core.attr.sample_type & PERF_SAMPLE_READ)) {
+ perf_missing_features.inherit_sample_read = true;
+ pr_debug2("Using PERF_SAMPLE_READ / :S modifier is not compatible with inherit, falling back to no-inherit.\n");
+ return true;
+ } else if (!perf_missing_features.branch_counters &&
(evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_COUNTERS)) {
perf_missing_features.branch_counters = true;
pr_debug2("switching off branch counters support\n");
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 80b5f6dd868e..bb0c91c23679 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -206,6 +206,7 @@ struct perf_missing_features {
bool weight_struct;
bool read_lost;
bool branch_counters;
+ bool inherit_sample_read;
};
extern struct perf_missing_features perf_missing_features;
--
2.45.2
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v10 1/2] tools/perf: Correctly calculate sample period for inherited SAMPLE_READ values
2024-08-01 12:30 ` [PATCH v10 1/2] tools/perf: Correctly calculate sample period for inherited SAMPLE_READ values Ben Gainey
@ 2024-08-02 18:21 ` Namhyung Kim
0 siblings, 0 replies; 5+ messages in thread
From: Namhyung Kim @ 2024-08-02 18:21 UTC (permalink / raw)
To: Ben Gainey
Cc: peterz, mingo, acme, james.clark, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter,
linux-perf-users, linux-kernel
On Thu, Aug 01, 2024 at 01:30:29PM +0100, Ben Gainey wrote:
> Sample period calculation in deliver_sample_value is updated to
> calculate the per-thread period delta for events that are inherit +
> PERF_SAMPLE_READ. When the sampling event has this configuration, the
> read_format.id is used with the tid from the sample to lookup the
> storage of the previously accumulated counter total before calculating
> the delta. All existing valid configurations where read_format.value
> represents some global value continue to use just the read_format.id to
> locate the storage of the previously accumulated total.
>
> perf_sample_id is modified to support tracking per-thread
> values, along with the existing global per-id values. In the
> per-thread case, values are stored in a hash by tid within the
> perf_sample_id, and are dynamically allocated as the number is not known
> ahead of time.
>
> Signed-off-by: Ben Gainey <ben.gainey@arm.com>
> ---
> tools/lib/perf/evsel.c | 48 +++++++++++++++++++
> tools/lib/perf/include/internal/evsel.h | 63 ++++++++++++++++++++++++-
> tools/perf/util/session.c | 25 ++++++----
> 3 files changed, 126 insertions(+), 10 deletions(-)
>
> diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> index c07160953224..abdae2f9498b 100644
> --- a/tools/lib/perf/evsel.c
> +++ b/tools/lib/perf/evsel.c
> @@ -5,6 +5,7 @@
> #include <perf/evsel.h>
> #include <perf/cpumap.h>
> #include <perf/threadmap.h>
> +#include <linux/hash.h>
> #include <linux/list.h>
> #include <internal/evsel.h>
> #include <linux/zalloc.h>
> @@ -23,6 +24,7 @@ void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr,
> int idx)
> {
> INIT_LIST_HEAD(&evsel->node);
> + INIT_LIST_HEAD(&evsel->per_stream_periods);
> evsel->attr = *attr;
> evsel->idx = idx;
> evsel->leader = evsel;
> @@ -531,10 +533,56 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
>
> void perf_evsel__free_id(struct perf_evsel *evsel)
> {
> + struct perf_sample_id_period *pos, *n;
> +
> xyarray__delete(evsel->sample_id);
> evsel->sample_id = NULL;
> zfree(&evsel->id);
> evsel->ids = 0;
> +
> + perf_evsel_for_each_per_thread_period_safe(evsel, n, pos) {
> + list_del_init(&pos->node);
> + free(pos);
> + }
> +}
> +
> +bool perf_evsel__attr_has_per_thread_sample_period(struct perf_evsel *evsel)
> +{
> + return (evsel->attr.sample_type & PERF_SAMPLE_READ)
> + && (evsel->attr.sample_type & PERF_SAMPLE_TID)
> + && evsel->attr.inherit;
Nitpick: I believe the coding style wants to put the operators
at the end of the line like
return (evsel->attr.sample_type & PERF_SAMPLE_READ) &&
(evsel->attr.sample_type & PERF_SAMPLE_TID) &&
evsel->attr.inherit;
Thanks,
Namhyung
> +}
> +
> +u64 *perf_sample_id__get_period_storage(struct perf_sample_id *sid, u32 tid, bool per_thread)
> +{
> + struct hlist_head *head;
> + struct perf_sample_id_period *res;
> + int hash;
> +
> + if (!per_thread)
> + return &sid->period;
> +
> + hash = hash_32(tid, PERF_SAMPLE_ID__HLIST_BITS);
> + head = &sid->periods[hash];
> +
> + hlist_for_each_entry(res, head, hnode)
> + if (res->tid == tid)
> + return &res->period;
> +
> + if (sid->evsel == NULL)
> + return NULL;
> +
> + res = zalloc(sizeof(struct perf_sample_id_period));
> + if (res == NULL)
> + return NULL;
> +
> + INIT_LIST_HEAD(&res->node);
> + res->tid = tid;
> +
> + list_add_tail(&res->node, &sid->evsel->per_stream_periods);
> + hlist_add_head(&res->hnode, &sid->periods[hash]);
> +
> + return &res->period;
> }
>
> void perf_counts_values__scale(struct perf_counts_values *count,
> diff --git a/tools/lib/perf/include/internal/evsel.h b/tools/lib/perf/include/internal/evsel.h
> index 5cd220a61962..ea78defa77d0 100644
> --- a/tools/lib/perf/include/internal/evsel.h
> +++ b/tools/lib/perf/include/internal/evsel.h
> @@ -11,6 +11,32 @@
> struct perf_thread_map;
> struct xyarray;
>
> +/**
> + * The per-thread accumulated period storage node.
> + */
> +struct perf_sample_id_period {
> + struct list_head node;
> + struct hlist_node hnode;
> + /* Holds total ID period value for PERF_SAMPLE_READ processing. */
> + u64 period;
> + /* The TID that the values belongs to */
> + u32 tid;
> +};
> +
> +/**
> + * perf_evsel_for_each_per_thread_period_safe - safely iterate thru all the
> + * per_stream_periods
> + * @evlist:perf_evsel instance to iterate
> + * @item: struct perf_sample_id_period iterator
> + * @tmp: struct perf_sample_id_period temp iterator
> + */
> +#define perf_evsel_for_each_per_thread_period_safe(evsel, tmp, item) \
> + list_for_each_entry_safe(item, tmp, &(evsel)->per_stream_periods, node)
> +
> +
> +#define PERF_SAMPLE_ID__HLIST_BITS 4
> +#define PERF_SAMPLE_ID__HLIST_SIZE (1 << PERF_SAMPLE_ID__HLIST_BITS)
> +
> /*
> * Per fd, to map back from PERF_SAMPLE_ID to evsel, only used when there are
> * more than one entry in the evlist.
> @@ -34,8 +60,32 @@ struct perf_sample_id {
> pid_t machine_pid;
> struct perf_cpu vcpu;
>
> - /* Holds total ID period value for PERF_SAMPLE_READ processing. */
> - u64 period;
> + /*
> + * Per-thread, and global event counts are mutually exclusive:
> + * Whilst it is possible to combine events into a group with differing
> + * values of PERF_SAMPLE_READ, it is not valid to have inconsistent
> + * values for `inherit`. Therefore it is not possible to have a
> + * situation where a per-thread event is sampled as a global event;
> + * all !inherit groups are global, and all groups where the sampling
> + * event is inherit + PERF_SAMPLE_READ will be per-thread. Any event
> + * that is part of such a group that is inherit but not PERF_SAMPLE_READ
> + * will be read as per-thread. If such an event can also trigger a
> + * sample (such as with sample_period > 0) then it will not cause
> + * `read_format` to be included in its PERF_RECORD_SAMPLE, and
> + * therefore will not expose the per-thread group members as global.
> + */
> + union {
> + /*
> + * Holds total ID period value for PERF_SAMPLE_READ processing
> + * (when period is not per-thread).
> + */
> + u64 period;
> + /*
> + * Holds total ID period value for PERF_SAMPLE_READ processing
> + * (when period is per-thread).
> + */
> + struct hlist_head periods[PERF_SAMPLE_ID__HLIST_SIZE];
> + };
> };
>
> struct perf_evsel {
> @@ -58,6 +108,10 @@ struct perf_evsel {
> u32 ids;
> struct perf_evsel *leader;
>
> + /* For events where the read_format value is per-thread rather than
> + * global, stores the per-thread cumulative period */
> + struct list_head per_stream_periods;
> +
> /* parse modifier helper */
> int nr_members;
> /*
> @@ -88,4 +142,9 @@ int perf_evsel__apply_filter(struct perf_evsel *evsel, const char *filter);
> int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads);
> void perf_evsel__free_id(struct perf_evsel *evsel);
>
> +bool perf_evsel__attr_has_per_thread_sample_period(struct perf_evsel *evsel);
> +
> +u64 *perf_sample_id__get_period_storage(struct perf_sample_id *sid, u32 tid,
> + bool per_thread);
> +
> #endif /* __LIBPERF_INTERNAL_EVSEL_H */
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 5596bed1b8c8..fac0557ff6ea 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -1474,18 +1474,24 @@ static int deliver_sample_value(struct evlist *evlist,
> union perf_event *event,
> struct perf_sample *sample,
> struct sample_read_value *v,
> - struct machine *machine)
> + struct machine *machine,
> + bool per_thread)
> {
> struct perf_sample_id *sid = evlist__id2sid(evlist, v->id);
> struct evsel *evsel;
> + u64 *storage = NULL;
>
> if (sid) {
> + storage = perf_sample_id__get_period_storage(sid, sample->tid, per_thread);
> + }
> +
> + if (storage) {
> sample->id = v->id;
> - sample->period = v->value - sid->period;
> - sid->period = v->value;
> + sample->period = v->value - *storage;
> + *storage = v->value;
> }
>
> - if (!sid || sid->evsel == NULL) {
> + if (!storage || sid->evsel == NULL) {
> ++evlist->stats.nr_unknown_id;
> return 0;
> }
> @@ -1506,14 +1512,15 @@ static int deliver_sample_group(struct evlist *evlist,
> union perf_event *event,
> struct perf_sample *sample,
> struct machine *machine,
> - u64 read_format)
> + u64 read_format,
> + bool per_thread)
> {
> int ret = -EINVAL;
> struct sample_read_value *v = sample->read.group.values;
>
> sample_read_group__for_each(v, sample->read.group.nr, read_format) {
> ret = deliver_sample_value(evlist, tool, event, sample, v,
> - machine);
> + machine, per_thread);
> if (ret)
> break;
> }
> @@ -1528,6 +1535,7 @@ static int evlist__deliver_sample(struct evlist *evlist, struct perf_tool *tool,
> /* We know evsel != NULL. */
> u64 sample_type = evsel->core.attr.sample_type;
> u64 read_format = evsel->core.attr.read_format;
> + bool per_thread = perf_evsel__attr_has_per_thread_sample_period(&evsel->core);
>
> /* Standard sample delivery. */
> if (!(sample_type & PERF_SAMPLE_READ))
> @@ -1536,10 +1544,11 @@ static int evlist__deliver_sample(struct evlist *evlist, struct perf_tool *tool,
> /* For PERF_SAMPLE_READ we have either single or group mode. */
> if (read_format & PERF_FORMAT_GROUP)
> return deliver_sample_group(evlist, tool, event, sample,
> - machine, read_format);
> + machine, read_format, per_thread);
> else
> return deliver_sample_value(evlist, tool, event, sample,
> - &sample->read.one, machine);
> + &sample->read.one, machine,
> + per_thread);
> }
>
> static int machines__deliver_event(struct machines *machines,
> --
> 2.45.2
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v10 0/2] tools/perf: Support PERF_SAMPLE_READ with inherit
2024-08-01 12:30 [PATCH v10 0/2] tools/perf: Support PERF_SAMPLE_READ with inherit Ben Gainey
2024-08-01 12:30 ` [PATCH v10 1/2] tools/perf: Correctly calculate sample period for inherited SAMPLE_READ values Ben Gainey
2024-08-01 12:30 ` [PATCH v10 2/2] tools/perf: Allow inherit + PERF_SAMPLE_READ when opening events Ben Gainey
@ 2024-08-02 18:22 ` Namhyung Kim
2 siblings, 0 replies; 5+ messages in thread
From: Namhyung Kim @ 2024-08-02 18:22 UTC (permalink / raw)
To: Ben Gainey
Cc: peterz, mingo, acme, james.clark, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter,
linux-perf-users, linux-kernel
Hello,
On Thu, Aug 01, 2024 at 01:30:28PM +0100, Ben Gainey wrote:
> This revision of this change splits out the tools/perf changes requested
> by Namhung King for my previous work to enable PERF_SAMPLE READ with inherit (see
> https://lore.kernel.org/linux-perf-users/20240730084417.7693-1-ben.gainey@arm.com/ )
> as the kernel side changes have been picked up by Peter Zijlstra.
I only have a nitpick but otherwise looks good:
Acked-by: Namhyung Kim <namhyung@kernel.org>
Thanks,
Namhyung
>
> Changes since v9:
> - Split out tools/perf patches only
> - Fixed system-wide mode in `perf record` to not set the inherit bit.
>
> Changes since v8:
> - Rebase on v6.11-rc1
>
> Changes since v7:
> - Rebase on v6.10-rc3
> - Respond to Peter Zijlstra's feedback:
> - Renamed nr_pending to nr_no_switch_fast and merged in nr_inherit_read
> which otherwise had overlapping use
> - Updated some of the commit messages to provide better justifications
> of usecase, behavioural changes and so on
> - Cleanup perf_event_count/_cumulative
> - Make it explicit that the sampling event decides whether or not the
> per-thread value is given in read_format for PERF_RECORD_SAMPLE and
> PERF_RECORD_READ; updated tools to account for this.
>
> Changes since v6:
> - Rebase on v6.10-rc2
> - Make additional "perf test" tests succeed / skip based on kernel
> version as per feedback from Namhyung.
>
> Changes since v5:
> - Rebase on v6.9
> - Cleanup feedback from Namhyung Kim
>
> Changes since v4:
> - Rebase on v6.9-rc1
> - Removed the dependency on inherit_stat that was previously assumed
> necessary as per feedback from Namhyung Kim.
> - Fixed an incorrect use of zfree instead of free in the tools leading
> to an abort on tool shutdown.
> - Additional test coverage improvements added to perf test.
> - Cleaned up the remaining bit of irrelevant change missed between v3
> and v4.
>
> Changes since v3:
> - Cleaned up perf test data changes incorrectly included into this
> series from elsewhere.
>
> Changes since v2:
> - Rebase on v6.8
> - Respond to James Clarke's feedback; fixup some typos and move some
> repeated checks into a helper macro.
> - Cleaned up checkpatch lints.
> - Updated perf test; fixed evsel handling so that existing tests pass
> and added new tests to cover the new behaviour.
>
> Changes since v1:
> - Rebase on v6.8-rc1
> - Fixed value written into sample after child exists.
> - Modified handling of switch-out so that context with these events
> take the slow path, so that the per-event/per-thread PMU state is
> correctly switched.
> - Modified perf tools to support this mode of operation.
>
> Ben Gainey (2):
> tools/perf: Correctly calculate sample period for inherited
> SAMPLE_READ values
> tools/perf: Allow inherit + PERF_SAMPLE_READ when opening events
>
> tools/lib/perf/evsel.c | 48 ++++++++++++++
> tools/lib/perf/include/internal/evsel.h | 63 ++++++++++++++++++-
> tools/perf/tests/attr/README | 2 +
> tools/perf/tests/attr/test-record-C0 | 2 +
> tools/perf/tests/attr/test-record-dummy-C0 | 2 +-
> .../tests/attr/test-record-group-sampling | 3 +-
> .../tests/attr/test-record-group-sampling1 | 51 +++++++++++++++
> .../tests/attr/test-record-group-sampling2 | 61 ++++++++++++++++++
> tools/perf/tests/attr/test-record-group2 | 1 +
> ...{test-record-group2 => test-record-group3} | 10 +--
> tools/perf/util/evsel.c | 21 ++++++-
> tools/perf/util/evsel.h | 1 +
> tools/perf/util/session.c | 25 +++++---
> 13 files changed, 271 insertions(+), 19 deletions(-)
> create mode 100644 tools/perf/tests/attr/test-record-group-sampling1
> create mode 100644 tools/perf/tests/attr/test-record-group-sampling2
> copy tools/perf/tests/attr/{test-record-group2 => test-record-group3} (81%)
>
> --
> 2.45.2
>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-08-02 18:22 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-01 12:30 [PATCH v10 0/2] tools/perf: Support PERF_SAMPLE_READ with inherit Ben Gainey
2024-08-01 12:30 ` [PATCH v10 1/2] tools/perf: Correctly calculate sample period for inherited SAMPLE_READ values Ben Gainey
2024-08-02 18:21 ` Namhyung Kim
2024-08-01 12:30 ` [PATCH v10 2/2] tools/perf: Allow inherit + PERF_SAMPLE_READ when opening events Ben Gainey
2024-08-02 18:22 ` [PATCH v10 0/2] tools/perf: Support PERF_SAMPLE_READ with inherit Namhyung Kim
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.