* [PATCH v5 1/2] libbpf: Add the ability to suppress perf event enablement
2025-08-06 16:22 [PATCH v5 0/2] perf/s390: Regression: Move uid filtering to BPF filters Ilya Leoshkevich
@ 2025-08-06 16:22 ` Ilya Leoshkevich
2025-08-06 16:22 ` [PATCH v5 2/2] perf bpf-filter: Enable events manually Ilya Leoshkevich
2025-08-07 16:20 ` [PATCH v5 0/2] perf/s390: Regression: Move uid filtering to BPF filters patchwork-bot+netdevbpf
2 siblings, 0 replies; 4+ messages in thread
From: Ilya Leoshkevich @ 2025-08-06 16:22 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Ian Rogers,
Arnaldo Carvalho de Melo
Cc: bpf, linux-perf-users, linux-kernel, linux-s390, Thomas Richter,
Jiri Olsa, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Ilya Leoshkevich, Eduard Zingerman
Automatically enabling a perf event after attaching a BPF prog to it is
not always desirable.
Add a new "dont_enable" field to struct bpf_perf_event_opts. While
introducing "enable" instead would be nicer in that it would avoid
a double negation in the implementation, it would make
DECLARE_LIBBPF_OPTS() less efficient.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Suggested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Co-developed-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
tools/lib/bpf/libbpf.c | 13 ++++++++-----
tools/lib/bpf/libbpf.h | 4 +++-
2 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index fb4d92c5c339..8f5a81b672e1 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -10965,11 +10965,14 @@ struct bpf_link *bpf_program__attach_perf_event_opts(const struct bpf_program *p
}
link->link.fd = pfd;
}
- if (ioctl(pfd, PERF_EVENT_IOC_ENABLE, 0) < 0) {
- err = -errno;
- pr_warn("prog '%s': failed to enable perf_event FD %d: %s\n",
- prog->name, pfd, errstr(err));
- goto err_out;
+
+ if (!OPTS_GET(opts, dont_enable, false)) {
+ if (ioctl(pfd, PERF_EVENT_IOC_ENABLE, 0) < 0) {
+ err = -errno;
+ pr_warn("prog '%s': failed to enable perf_event FD %d: %s\n",
+ prog->name, pfd, errstr(err));
+ goto err_out;
+ }
}
return &link->link;
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index d1cf813a057b..455a957cb702 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -499,9 +499,11 @@ struct bpf_perf_event_opts {
__u64 bpf_cookie;
/* don't use BPF link when attach BPF program */
bool force_ioctl_attach;
+ /* don't automatically enable the event */
+ bool dont_enable;
size_t :0;
};
-#define bpf_perf_event_opts__last_field force_ioctl_attach
+#define bpf_perf_event_opts__last_field dont_enable
LIBBPF_API struct bpf_link *
bpf_program__attach_perf_event(const struct bpf_program *prog, int pfd);
--
2.50.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH v5 2/2] perf bpf-filter: Enable events manually
2025-08-06 16:22 [PATCH v5 0/2] perf/s390: Regression: Move uid filtering to BPF filters Ilya Leoshkevich
2025-08-06 16:22 ` [PATCH v5 1/2] libbpf: Add the ability to suppress perf event enablement Ilya Leoshkevich
@ 2025-08-06 16:22 ` Ilya Leoshkevich
2025-08-07 16:20 ` [PATCH v5 0/2] perf/s390: Regression: Move uid filtering to BPF filters patchwork-bot+netdevbpf
2 siblings, 0 replies; 4+ messages in thread
From: Ilya Leoshkevich @ 2025-08-06 16:22 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Ian Rogers,
Arnaldo Carvalho de Melo
Cc: bpf, linux-perf-users, linux-kernel, linux-s390, Thomas Richter,
Jiri Olsa, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Ilya Leoshkevich
On s390, and, in general, on all platforms where the respective event
supports auxiliary data gathering, the command:
# ./perf record -u 0 -aB --synth=no -- ./perf test -w thloop
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.011 MB perf.data ]
# ./perf report --stats | grep SAMPLE
#
does not generate samples in the perf.data file. On x86 the command:
# sudo perf record -e intel_pt// -u 0 ls
is broken too.
Looking at the sequence of calls in 'perf record' reveals this
behavior:
1. The event 'cycles' is created and enabled:
record__open()
+-> evlist__apply_filters()
+-> perf_bpf_filter__prepare()
+-> bpf_program.attach_perf_event()
+-> bpf_program.attach_perf_event_opts()
+-> __GI___ioctl(..., PERF_EVENT_IOC_ENABLE, ...)
The event 'cycles' is enabled and active now. However the event's
ring-buffer to store the samples generated by hardware is not
allocated yet.
2. The event's fd is mmap()ed to create the ring buffer:
record__open()
+-> record__mmap()
+-> record__mmap_evlist()
+-> evlist__mmap_ex()
+-> perf_evlist__mmap_ops()
+-> mmap_per_cpu()
+-> mmap_per_evsel()
+-> mmap__mmap()
+-> perf_mmap__mmap()
+-> mmap()
This allocates the ring buffer for the event 'cycles'. With mmap()
the kernel creates the ring buffer:
perf_mmap(): kernel function to create the event's ring
| buffer to save the sampled data.
|
+-> ring_buffer_attach(): Allocates memory for ring buffer.
| The PMU has auxiliary data setup function. The
| has_aux(event) condition is true and the PMU's
| stop() is called to stop sampling. It is not
| restarted:
|
| if (has_aux(event))
| perf_event_stop(event, 0);
|
+-> cpumsf_pmu_stop():
Hardware sampling is stopped. No samples are generated and saved
anymore.
3. After the event 'cycles' has been mapped, the event is enabled a
second time in:
__cmd_record()
+-> evlist__enable()
+-> __evlist__enable()
+-> evsel__enable_cpu()
+-> perf_evsel__enable_cpu()
+-> perf_evsel__run_ioctl()
+-> perf_evsel__ioctl()
+-> __GI___ioctl(., PERF_EVENT_IOC_ENABLE, .)
The second
ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
is just a NOP in this case. The first invocation in (1.) sets the
event::state to PERF_EVENT_STATE_ACTIVE. The kernel functions
perf_ioctl()
+-> _perf_ioctl()
+-> _perf_event_enable()
+-> __perf_event_enable()
return immediately because event::state is already set to
PERF_EVENT_STATE_ACTIVE.
This happens on s390, because the event 'cycles' offers the possibility
to save auxilary data. The PMU callbacks setup_aux() and free_aux() are
defined. Without both callback functions, cpumsf_pmu_stop() is not
invoked and sampling continues.
To remedy this, remove the first invocation of
ioctl(..., PERF_EVENT_IOC_ENABLE, ...).
in step (1.) Create the event in step (1.) and enable it in step (3.)
after the ring buffer has been mapped.
Output after:
# ./perf record -aB --synth=no -u 0 -- ./perf test -w thloop 2
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.876 MB perf.data ]
# ./perf report --stats | grep SAMPLE
SAMPLE events: 16200 (99.5%)
SAMPLE events: 16200
#
The software event succeeded both before and after the patch:
# ./perf record -e cpu-clock -aB --synth=no -u 0 -- \
./perf test -w thloop 2
[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 2.870 MB perf.data ]
# ./perf report --stats | grep SAMPLE
SAMPLE events: 53506 (99.8%)
SAMPLE events: 53506
#
Fixes: b4c658d4d63d61 ("perf target: Remove uid from target")
Suggested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Co-developed-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
tools/perf/util/bpf-filter.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/bpf-filter.c b/tools/perf/util/bpf-filter.c
index d0e013eeb0f7..a0b11f35395f 100644
--- a/tools/perf/util/bpf-filter.c
+++ b/tools/perf/util/bpf-filter.c
@@ -451,6 +451,8 @@ int perf_bpf_filter__prepare(struct evsel *evsel, struct target *target)
struct bpf_link *link;
struct perf_bpf_filter_entry *entry;
bool needs_idx_hash = !target__has_cpu(target);
+ DECLARE_LIBBPF_OPTS(bpf_perf_event_opts, pe_opts,
+ .dont_enable = true);
entry = calloc(MAX_FILTERS, sizeof(*entry));
if (entry == NULL)
@@ -522,7 +524,8 @@ int perf_bpf_filter__prepare(struct evsel *evsel, struct target *target)
prog = skel->progs.perf_sample_filter;
for (x = 0; x < xyarray__max_x(evsel->core.fd); x++) {
for (y = 0; y < xyarray__max_y(evsel->core.fd); y++) {
- link = bpf_program__attach_perf_event(prog, FD(evsel, x, y));
+ link = bpf_program__attach_perf_event_opts(prog, FD(evsel, x, y),
+ &pe_opts);
if (IS_ERR(link)) {
pr_err("Failed to attach perf sample-filter program\n");
ret = PTR_ERR(link);
--
2.50.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH v5 0/2] perf/s390: Regression: Move uid filtering to BPF filters
2025-08-06 16:22 [PATCH v5 0/2] perf/s390: Regression: Move uid filtering to BPF filters Ilya Leoshkevich
2025-08-06 16:22 ` [PATCH v5 1/2] libbpf: Add the ability to suppress perf event enablement Ilya Leoshkevich
2025-08-06 16:22 ` [PATCH v5 2/2] perf bpf-filter: Enable events manually Ilya Leoshkevich
@ 2025-08-07 16:20 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 4+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-08-07 16:20 UTC (permalink / raw)
To: Ilya Leoshkevich
Cc: ast, daniel, andrii, irogers, acme, bpf, linux-perf-users,
linux-kernel, linux-s390, tmricht, jolsa, hca, gor, agordeev
Hello:
This series was applied to bpf/bpf.git (master)
by Alexei Starovoitov <ast@kernel.org>:
On Wed, 6 Aug 2025 18:22:40 +0200 you wrote:
> v4: https://lore.kernel.org/bpf/20250806114227.14617-1-iii@linux.ibm.com/
> v4 -> v5: Fix a typo in the commit message (Yonghong).
>
> v3: https://lore.kernel.org/bpf/20250805130346.1225535-1-iii@linux.ibm.com/
> v3 -> v4: Rename the new field to dont_enable (Alexei, Eduard).
> Switch the Fixes: tag in patch 2 (Alexander, Thomas).
> Fix typos in the cover letter (Thomas).
>
> [...]
Here is the summary with links:
- [v5,1/2] libbpf: Add the ability to suppress perf event enablement
https://git.kernel.org/bpf/bpf/c/9474e27a24a4
- [v5,2/2] perf bpf-filter: Enable events manually
https://git.kernel.org/bpf/bpf/c/5e2ac8e8571d
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 4+ messages in thread