linux-perf-users.vger.kernel.org archive mirror
* [PATCH 0/1] Support PERF_SAMPLE_READ with inherit_stat
@ 2024-01-19 16:39 Ben Gainey
  2024-01-19 16:39 ` [PATCH 1/1] perf: " Ben Gainey
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Ben Gainey @ 2024-01-19 16:39 UTC (permalink / raw)
  To: linux-perf-users, linux-kernel
  Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, james.clark, Ben Gainey

This change allows events to use PERF_SAMPLE_READ with inherit so long
as both inherit_stat and PERF_SAMPLE_TID are set.

Currently it is not possible to use PERF_SAMPLE_READ with inherit. This 
restriction assumes the user is interested in collecting aggregate 
statistics as per `perf stat`. It prevents a user from collecting 
per-thread samples using counter groups from a multi-threaded or 
multi-process application, as with `perf record -e '{....}:S'`. Instead 
users must use system-wide mode, or forgo the ability to sample counter 
groups. System-wide mode is often problematic as it requires specific
permissions (CAP_PERFMON or root access), and may lead to capture of
significant amounts of extra data from other processes running on the
system.

Perf already supports the ability to collect per-thread counts with 
`inherit` via the `inherit_stat` flag. This patch changes
`perf_event_alloc`, relaxing the restriction on combining `inherit` with
`PERF_SAMPLE_READ` so that the combination is allowed so long as
`inherit_stat` and `PERF_SAMPLE_TID` are enabled.
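
As a rough userspace illustration of the configuration described above,
the sketch below fills in a `perf_event_attr` for a group leader and
restates the relaxed rule as a predicate. The helper names
(`init_leader_attr`, `sample_read_inherit_ok`), the choice of event, the
sample period and the read_format bits are assumptions for illustration
only; the predicate is not the kernel code from the patch. The sketch
does not call perf_event_open, since kernels without this patch would
reject the combination.

```c
#include <linux/perf_event.h>
#include <stdbool.h>
#include <string.h>

/*
 * Fill in an attr for a group leader whose group counters are read at
 * each sample, per thread, with inheritance enabled.
 * Event type, config and period are arbitrary illustrative choices.
 */
static void init_leader_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HARDWARE;
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->sample_period = 100000;

	/* PERF_SAMPLE_TID is required so samples can be attributed to a thread */
	attr->sample_type = PERF_SAMPLE_READ | PERF_SAMPLE_TID;

	/* Assumed read layout: whole group, each value tagged with its id */
	attr->read_format = PERF_FORMAT_GROUP | PERF_FORMAT_ID;

	attr->inherit = 1;
	attr->inherit_stat = 1;
	attr->disabled = 1;
}

/*
 * Hypothetical restatement of the rule in the cover letter: inherit
 * together with PERF_SAMPLE_READ is acceptable only when inherit_stat
 * and PERF_SAMPLE_TID are both set. Other configurations are not
 * affected by the rule.
 */
static bool sample_read_inherit_ok(const struct perf_event_attr *attr)
{
	bool wants_both = attr->inherit &&
			  (attr->sample_type & PERF_SAMPLE_READ);

	if (!wants_both)
		return true;

	return attr->inherit_stat &&
	       (attr->sample_type & PERF_SAMPLE_TID);
}
```

A tool could run a check like `sample_read_inherit_ok` before opening
the event, and fall back to a non-inherit configuration when it fails.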

In this configuration stream ids (such as may appear in the read_format
field of a PERF_RECORD_SAMPLE) are no longer globally unique; rather,
the pair (stream id, tid) uniquely identifies each event. Tools that
rely on stream ids being unique, for example to calculate a delta
between samples, would need updating to take this into account.
Previously valid event configurations (system-wide, no-inherit and so
on), where the stream id alone is the identifier, are unaffected.
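
To illustrate the kind of tool-side change this implies, here is a
minimal sketch of a delta calculation keyed on the (stream id, tid)
pair rather than on the stream id alone. The structure and function
names, the fixed-size table and the linear search are all assumptions
made for brevity; a real tool would likely use a hash map and its own
sample parsing.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

#define MAX_STREAMS 64 /* arbitrary small capacity for this sketch */

struct stream_state {
	uint64_t id;   /* stream id taken from the sample's read values */
	pid_t tid;     /* thread id taken from the PERF_SAMPLE_TID field */
	uint64_t last; /* last counter value seen for this (id, tid) */
	int used;
};

/* Must be zero-initialized before first use */
struct delta_table {
	struct stream_state slots[MAX_STREAMS];
};

/*
 * Record 'value' for the (id, tid) pair.
 * Returns 1 and stores the difference from the previous value in
 * *delta if the pair was seen before, 0 if this is the first
 * observation (a baseline is recorded, *delta is left untouched),
 * or -1 if the table is full.
 */
static int stream_delta(struct delta_table *t, uint64_t id, pid_t tid,
			uint64_t value, uint64_t *delta)
{
	struct stream_state *free_slot = NULL;

	for (size_t i = 0; i < MAX_STREAMS; i++) {
		struct stream_state *s = &t->slots[i];

		if (!s->used) {
			if (!free_slot)
				free_slot = s;
			continue;
		}
		/* Both halves of the key must match, not just the id */
		if (s->id == id && s->tid == tid) {
			*delta = value - s->last;
			s->last = value;
			return 1;
		}
	}

	if (!free_slot)
		return -1;

	free_slot->id = id;
	free_slot->tid = tid;
	free_slot->last = value;
	free_slot->used = 1;
	return 0;
}
```

With this keying, two threads reporting the same stream id keep
separate baselines, so a value from one thread is never subtracted from
a value belonging to another.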

This patch has been tested on aarch64 both by manual inspection of the
output of `perf script -D` and through a modified version of Arm's
commercial profiling tools. The numbers appear to line up as one would
expect, but further validation across other architectures and/or edge
cases would be welcome.

This patch was developed and tested on top of v6.7.


Ben Gainey (1):
  perf: Support PERF_SAMPLE_READ with inherit_stat

 kernel/events/core.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

-- 
2.43.0


Thread overview: 7+ messages
2024-01-19 16:39 [PATCH 0/1] Support PERF_SAMPLE_READ with inherit_stat Ben Gainey
2024-01-19 16:39 ` [PATCH 1/1] perf: " Ben Gainey
2024-01-19 17:45 ` [PATCH 0/1] " Andi Kleen
2024-01-19 18:08   ` Ben Gainey
2024-01-19 22:55     ` Andi Kleen
2024-01-20  0:49 ` Namhyung Kim
2024-01-20 16:14   ` Ben Gainey
