* [PATCH -next v3 0/2] perf stat: Support inherit events for bperf
@ 2024-09-16 1:43 Tengda Wu
2024-09-16 1:43 ` [PATCH -next v3 1/2] perf stat: Support inherit events during fork() " Tengda Wu
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Tengda Wu @ 2024-09-16 1:43 UTC (permalink / raw)
To: Peter Zijlstra, Namhyung Kim
Cc: song, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
kan.liang, linux-perf-users, linux-kernel, bpf
Hi,
Here is the 3th version of the series to support inherit events for bperf.
This version add pid or tgid selection based on filter type in new_task
prog to avoid memory waste and potential count loss.
bperf (perf-stat --bpf-counter) has not supported inherit events
during fork() since it was first introduced.
This patch series tries to add this support by:
1) adding two new bpf programs to monitor task lifecycle;
2) recording new tasks in the filter map dynamically;
3) reusing `accum_key` of parent task for new tasks.
Thanks,
Tengda
Changelog:
---------
v3: (Address comments from Namhyung, thanks)
* Use pid or tgid based on filter type in new_task prog
* Add comments to explain pid usage for TGID type in exit_task prog
v2: https://lore.kernel.org/all/20240905115918.772234-1-wutengda@huaweicloud.com/
* Remove the unused init_filter_entries in follower bpf, declare
a global filter_entry_count in bpf_counter instead
* Attach on_newtask and on_exittask progs only if the filter type
is either PID or TGID
v1: https://lore.kernel.org/all/20240904123103.732507-1-wutengda@huaweicloud.com/
Tengda Wu (2):
perf stat: Support inherit events during fork() for bperf
perf test: Use sqrtloop workload to test bperf event
tools/perf/tests/shell/stat_bpf_counters.sh | 2 +-
tools/perf/util/bpf_counter.c | 32 +++++--
tools/perf/util/bpf_skel/bperf_follower.bpf.c | 87 +++++++++++++++++--
tools/perf/util/bpf_skel/bperf_u.h | 5 ++
4 files changed, 112 insertions(+), 14 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH -next v3 1/2] perf stat: Support inherit events during fork() for bperf
2024-09-16 1:43 [PATCH -next v3 0/2] perf stat: Support inherit events for bperf Tengda Wu
@ 2024-09-16 1:43 ` Tengda Wu
2024-10-09 17:18 ` Song Liu
2024-09-16 1:43 ` [PATCH -next v3 2/2] perf test: Use sqrtloop workload to test bperf event Tengda Wu
2024-09-25 14:16 ` [PATCH -next v3 0/2] perf stat: Support inherit events for bperf Tengda Wu
2 siblings, 1 reply; 13+ messages in thread
From: Tengda Wu @ 2024-09-16 1:43 UTC (permalink / raw)
To: Peter Zijlstra, Namhyung Kim
Cc: song, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
kan.liang, linux-perf-users, linux-kernel, bpf
bperf has a nice ability to share PMUs, but it still does not support
inherit events during fork(), resulting in some deviations in its stat
results compared with perf.
perf stat result:
$ ./perf stat -e cycles,instructions -- ./perf test -w sqrtloop
Performance counter stats for './perf test -w sqrtloop':
2,316,038,116 cycles
2,859,350,725 instructions
1.009603637 seconds time elapsed
1.004196000 seconds user
0.003950000 seconds sys
bperf stat result:
$ ./perf stat --bpf-counters -e cycles,instructions -- \
./perf test -w sqrtloop
Performance counter stats for './perf test -w sqrtloop':
18,762,093 cycles
23,487,766 instructions
1.008913769 seconds time elapsed
1.003248000 seconds user
0.004069000 seconds sys
In order to support event inheritance, two new bpf programs are added
to monitor the fork and exit of tasks respectively. When a task is
created, add it to the filter map to enable counting, and reuse the
`accum_key` of its parent task to count together with the parent task.
When a task exits, remove it from the filter map to disable counting.
After support:
$ ./perf stat --bpf-counters -e cycles,instructions -- \
./perf test -w sqrtloop
Performance counter stats for './perf test -w sqrtloop':
2,316,252,189 cycles
2,859,946,547 instructions
1.009422314 seconds time elapsed
1.003597000 seconds user
0.004270000 seconds sys
Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
---
tools/perf/util/bpf_counter.c | 32 +++++--
tools/perf/util/bpf_skel/bperf_follower.bpf.c | 87 +++++++++++++++++--
tools/perf/util/bpf_skel/bperf_u.h | 5 ++
3 files changed, 111 insertions(+), 13 deletions(-)
diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
index 7a8af60e0f51..94aa46f50052 100644
--- a/tools/perf/util/bpf_counter.c
+++ b/tools/perf/util/bpf_counter.c
@@ -394,6 +394,7 @@ static int bperf_check_target(struct evsel *evsel,
}
static struct perf_cpu_map *all_cpu_map;
+static __u32 filter_entry_cnt;
static int bperf_reload_leader_program(struct evsel *evsel, int attr_map_fd,
struct perf_event_attr_map_entry *entry)
@@ -444,12 +445,31 @@ static int bperf_reload_leader_program(struct evsel *evsel, int attr_map_fd,
return err;
}
+/* Attach programs on demand according to filter types to reduce overhead */
+static int bperf_attach_follower_program(struct bperf_follower_bpf *skel,
+ enum bperf_filter_type filter_type)
+{
+ struct bpf_link *link;
+ int err = 0;
+
+ if (filter_type == BPERF_FILTER_PID ||
+ filter_type == BPERF_FILTER_TGID)
+ err = bperf_follower_bpf__attach(skel);
+ else {
+ link = bpf_program__attach(skel->progs.fexit_XXX);
+ if (IS_ERR(link))
+ err = PTR_ERR(link);
+ }
+
+ return err;
+}
+
static int bperf__load(struct evsel *evsel, struct target *target)
{
struct perf_event_attr_map_entry entry = {0xffffffff, 0xffffffff};
int attr_map_fd, diff_map_fd = -1, err;
enum bperf_filter_type filter_type;
- __u32 filter_entry_cnt, i;
+ __u32 i;
if (bperf_check_target(evsel, target, &filter_type, &filter_entry_cnt))
return -1;
@@ -529,9 +549,6 @@ static int bperf__load(struct evsel *evsel, struct target *target)
/* set up reading map */
bpf_map__set_max_entries(evsel->follower_skel->maps.accum_readings,
filter_entry_cnt);
- /* set up follower filter based on target */
- bpf_map__set_max_entries(evsel->follower_skel->maps.filter,
- filter_entry_cnt);
err = bperf_follower_bpf__load(evsel->follower_skel);
if (err) {
pr_err("Failed to load follower skeleton\n");
@@ -543,6 +560,7 @@ static int bperf__load(struct evsel *evsel, struct target *target)
for (i = 0; i < filter_entry_cnt; i++) {
int filter_map_fd;
__u32 key;
+ struct bperf_filter_value fval = { i, 0 };
if (filter_type == BPERF_FILTER_PID ||
filter_type == BPERF_FILTER_TGID)
@@ -553,12 +571,12 @@ static int bperf__load(struct evsel *evsel, struct target *target)
break;
filter_map_fd = bpf_map__fd(evsel->follower_skel->maps.filter);
- bpf_map_update_elem(filter_map_fd, &key, &i, BPF_ANY);
+ bpf_map_update_elem(filter_map_fd, &key, &fval, BPF_ANY);
}
evsel->follower_skel->bss->type = filter_type;
- err = bperf_follower_bpf__attach(evsel->follower_skel);
+ err = bperf_attach_follower_program(evsel->follower_skel, filter_type);
out:
if (err && evsel->bperf_leader_link_fd >= 0)
@@ -623,7 +641,7 @@ static int bperf__read(struct evsel *evsel)
bperf_sync_counters(evsel);
reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
- for (i = 0; i < bpf_map__max_entries(skel->maps.accum_readings); i++) {
+ for (i = 0; i < filter_entry_cnt; i++) {
struct perf_cpu entry;
__u32 cpu;
diff --git a/tools/perf/util/bpf_skel/bperf_follower.bpf.c b/tools/perf/util/bpf_skel/bperf_follower.bpf.c
index f193998530d4..e804b2a9d0a6 100644
--- a/tools/perf/util/bpf_skel/bperf_follower.bpf.c
+++ b/tools/perf/util/bpf_skel/bperf_follower.bpf.c
@@ -5,6 +5,8 @@
#include <bpf/bpf_tracing.h>
#include "bperf_u.h"
+#define MAX_ENTRIES 102400
+
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
__uint(key_size, sizeof(__u32));
@@ -22,7 +24,9 @@ struct {
struct {
__uint(type, BPF_MAP_TYPE_HASH);
__uint(key_size, sizeof(__u32));
- __uint(value_size, sizeof(__u32));
+ __uint(value_size, sizeof(struct bperf_filter_value));
+ __uint(max_entries, MAX_ENTRIES);
+ __uint(map_flags, BPF_F_NO_PREALLOC);
} filter SEC(".maps");
enum bperf_filter_type type = 0;
@@ -33,14 +37,15 @@ int BPF_PROG(fexit_XXX)
{
struct bpf_perf_event_value *diff_val, *accum_val;
__u32 filter_key, zero = 0;
- __u32 *accum_key;
+ __u32 accum_key;
+ struct bperf_filter_value *fval;
if (!enabled)
return 0;
switch (type) {
case BPERF_FILTER_GLOBAL:
- accum_key = &zero;
+ accum_key = zero;
goto do_add;
case BPERF_FILTER_CPU:
filter_key = bpf_get_smp_processor_id();
@@ -55,16 +60,20 @@ int BPF_PROG(fexit_XXX)
return 0;
}
- accum_key = bpf_map_lookup_elem(&filter, &filter_key);
- if (!accum_key)
+ fval = bpf_map_lookup_elem(&filter, &filter_key);
+ if (!fval)
return 0;
+ accum_key = fval->accum_key;
+ if (fval->exited)
+ bpf_map_delete_elem(&filter, &filter_key);
+
do_add:
diff_val = bpf_map_lookup_elem(&diff_readings, &zero);
if (!diff_val)
return 0;
- accum_val = bpf_map_lookup_elem(&accum_readings, accum_key);
+ accum_val = bpf_map_lookup_elem(&accum_readings, &accum_key);
if (!accum_val)
return 0;
@@ -75,4 +84,70 @@ int BPF_PROG(fexit_XXX)
return 0;
}
+/* The program is only used for PID or TGID filter types. */
+SEC("tp_btf/task_newtask")
+int BPF_PROG(on_newtask, struct task_struct *task, __u64 clone_flags)
+{
+ __u32 parent_key, child_key;
+ struct bperf_filter_value *parent_fval;
+ struct bperf_filter_value child_fval = { 0 };
+
+ if (!enabled)
+ return 0;
+
+ switch (type) {
+ case BPERF_FILTER_PID:
+ parent_key = bpf_get_current_pid_tgid() & 0xffffffff;
+ child_key = task->pid;
+ break;
+ case BPERF_FILTER_TGID:
+ parent_key = bpf_get_current_pid_tgid() >> 32;
+ child_key = task->tgid;
+ if (child_key == parent_key)
+ return 0;
+ break;
+ default:
+ return 0;
+ }
+
+ /* Check if the current task is one of the target tasks to be counted */
+ parent_fval = bpf_map_lookup_elem(&filter, &parent_key);
+ if (!parent_fval)
+ return 0;
+
+ /* Start counting for the new task by adding it into filter map,
+ * inherit the accum key of its parent task so that they can be
+ * counted together.
+ */
+ child_fval.accum_key = parent_fval->accum_key;
+ child_fval.exited = 0;
+ bpf_map_update_elem(&filter, &child_key, &child_fval, BPF_NOEXIST);
+
+ return 0;
+}
+
+/* The program is only used for PID or TGID filter types. */
+SEC("tp_btf/sched_process_exit")
+int BPF_PROG(on_exittask, struct task_struct *task)
+{
+ __u32 pid;
+ struct bperf_filter_value *fval;
+
+ if (!enabled)
+ return 0;
+
+ /* Stop counting for this task by removing it from filter map.
+ * For TGID type, if the pid can be found in the map, it means that
+ * this pid belongs to the leader task. After the task exits, the
+ * tgid of its child tasks (if any) will be 1, so the pid can be
+ * safely removed.
+ */
+ pid = task->pid;
+ fval = bpf_map_lookup_elem(&filter, &pid);
+ if (fval)
+ fval->exited = 1;
+
+ return 0;
+}
+
char LICENSE[] SEC("license") = "Dual BSD/GPL";
diff --git a/tools/perf/util/bpf_skel/bperf_u.h b/tools/perf/util/bpf_skel/bperf_u.h
index 1ce0c2c905c1..4a4a753980be 100644
--- a/tools/perf/util/bpf_skel/bperf_u.h
+++ b/tools/perf/util/bpf_skel/bperf_u.h
@@ -11,4 +11,9 @@ enum bperf_filter_type {
BPERF_FILTER_TGID,
};
+struct bperf_filter_value {
+ __u32 accum_key;
+ __u8 exited;
+};
+
#endif /* __BPERF_STAT_U_H */
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH -next v3 2/2] perf test: Use sqrtloop workload to test bperf event
2024-09-16 1:43 [PATCH -next v3 0/2] perf stat: Support inherit events for bperf Tengda Wu
2024-09-16 1:43 ` [PATCH -next v3 1/2] perf stat: Support inherit events during fork() " Tengda Wu
@ 2024-09-16 1:43 ` Tengda Wu
2024-09-25 14:16 ` [PATCH -next v3 0/2] perf stat: Support inherit events for bperf Tengda Wu
2 siblings, 0 replies; 13+ messages in thread
From: Tengda Wu @ 2024-09-16 1:43 UTC (permalink / raw)
To: Peter Zijlstra, Namhyung Kim
Cc: song, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
kan.liang, linux-perf-users, linux-kernel, bpf
Replace `brstack` workload with `sqrtloop` workload, because `sqrtloop`
workload contains fork(), which is suitable for testing the bperf event
inheritance feature.
Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
---
tools/perf/tests/shell/stat_bpf_counters.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/tests/shell/stat_bpf_counters.sh b/tools/perf/tests/shell/stat_bpf_counters.sh
index f250b7d6f773..831f02add75e 100755
--- a/tools/perf/tests/shell/stat_bpf_counters.sh
+++ b/tools/perf/tests/shell/stat_bpf_counters.sh
@@ -4,7 +4,7 @@
set -e
-workload="perf test -w brstack"
+workload="perf test -w sqrtloop"
# check whether $2 is within +/- 20% of $1
compare_number()
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH -next v3 0/2] perf stat: Support inherit events for bperf
2024-09-16 1:43 [PATCH -next v3 0/2] perf stat: Support inherit events for bperf Tengda Wu
2024-09-16 1:43 ` [PATCH -next v3 1/2] perf stat: Support inherit events during fork() " Tengda Wu
2024-09-16 1:43 ` [PATCH -next v3 2/2] perf test: Use sqrtloop workload to test bperf event Tengda Wu
@ 2024-09-25 14:16 ` Tengda Wu
2024-09-26 3:43 ` Namhyung Kim
2 siblings, 1 reply; 13+ messages in thread
From: Tengda Wu @ 2024-09-25 14:16 UTC (permalink / raw)
To: Peter Zijlstra, Namhyung Kim
Cc: song, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
kan.liang, linux-perf-users, linux-kernel, bpf
Hello,
Sorry for pinging again. Is there any other suggestion with this patch set?
If there is, please let me know.
Thanks,
Tengda
On 2024/9/16 9:43, Tengda Wu wrote:
> Hi,
>
> Here is the 3th version of the series to support inherit events for bperf.
> This version add pid or tgid selection based on filter type in new_task
> prog to avoid memory waste and potential count loss.
>
>
> bperf (perf-stat --bpf-counter) has not supported inherit events
> during fork() since it was first introduced.
>
> This patch series tries to add this support by:
> 1) adding two new bpf programs to monitor task lifecycle;
> 2) recording new tasks in the filter map dynamically;
> 3) reusing `accum_key` of parent task for new tasks.
>
> Thanks,
> Tengda
>
>
> Changelog:
> ---------
> v3: (Address comments from Namhyung, thanks)
> * Use pid or tgid based on filter type in new_task prog
> * Add comments to explain pid usage for TGID type in exit_task prog
>
> v2: https://lore.kernel.org/all/20240905115918.772234-1-wutengda@huaweicloud.com/
> * Remove the unused init_filter_entries in follower bpf, declare
> a global filter_entry_count in bpf_counter instead
> * Attach on_newtask and on_exittask progs only if the filter type
> is either PID or TGID
>
> v1: https://lore.kernel.org/all/20240904123103.732507-1-wutengda@huaweicloud.com/
>
>
> Tengda Wu (2):
> perf stat: Support inherit events during fork() for bperf
> perf test: Use sqrtloop workload to test bperf event
>
> tools/perf/tests/shell/stat_bpf_counters.sh | 2 +-
> tools/perf/util/bpf_counter.c | 32 +++++--
> tools/perf/util/bpf_skel/bperf_follower.bpf.c | 87 +++++++++++++++++--
> tools/perf/util/bpf_skel/bperf_u.h | 5 ++
> 4 files changed, 112 insertions(+), 14 deletions(-)
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH -next v3 0/2] perf stat: Support inherit events for bperf
2024-09-25 14:16 ` [PATCH -next v3 0/2] perf stat: Support inherit events for bperf Tengda Wu
@ 2024-09-26 3:43 ` Namhyung Kim
2024-10-09 5:21 ` Namhyung Kim
0 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2024-09-26 3:43 UTC (permalink / raw)
To: Tengda Wu, song
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, kan.liang, linux-perf-users, linux-kernel, bpf
Hello,
On Wed, Sep 25, 2024 at 10:16:16PM +0800, Tengda Wu wrote:
> Hello,
>
> Sorry for pinging again. Is there any other suggestion with this patch set?
> If there is, please let me know.
Sorry I was traveling last week. I think it's good now.
Song, can I get your ack?
Thanks,
Namhyung
>
> On 2024/9/16 9:43, Tengda Wu wrote:
> > Hi,
> >
> > Here is the 3th version of the series to support inherit events for bperf.
> > This version add pid or tgid selection based on filter type in new_task
> > prog to avoid memory waste and potential count loss.
> >
> >
> > bperf (perf-stat --bpf-counter) has not supported inherit events
> > during fork() since it was first introduced.
> >
> > This patch series tries to add this support by:
> > 1) adding two new bpf programs to monitor task lifecycle;
> > 2) recording new tasks in the filter map dynamically;
> > 3) reusing `accum_key` of parent task for new tasks.
> >
> > Thanks,
> > Tengda
> >
> >
> > Changelog:
> > ---------
> > v3: (Address comments from Namhyung, thanks)
> > * Use pid or tgid based on filter type in new_task prog
> > * Add comments to explain pid usage for TGID type in exit_task prog
> >
> > v2: https://lore.kernel.org/all/20240905115918.772234-1-wutengda@huaweicloud.com/
> > * Remove the unused init_filter_entries in follower bpf, declare
> > a global filter_entry_count in bpf_counter instead
> > * Attach on_newtask and on_exittask progs only if the filter type
> > is either PID or TGID
> >
> > v1: https://lore.kernel.org/all/20240904123103.732507-1-wutengda@huaweicloud.com/
> >
> >
> > Tengda Wu (2):
> > perf stat: Support inherit events during fork() for bperf
> > perf test: Use sqrtloop workload to test bperf event
> >
> > tools/perf/tests/shell/stat_bpf_counters.sh | 2 +-
> > tools/perf/util/bpf_counter.c | 32 +++++--
> > tools/perf/util/bpf_skel/bperf_follower.bpf.c | 87 +++++++++++++++++--
> > tools/perf/util/bpf_skel/bperf_u.h | 5 ++
> > 4 files changed, 112 insertions(+), 14 deletions(-)
> >
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH -next v3 0/2] perf stat: Support inherit events for bperf
2024-09-26 3:43 ` Namhyung Kim
@ 2024-10-09 5:21 ` Namhyung Kim
2024-10-09 17:21 ` Song Liu
0 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2024-10-09 5:21 UTC (permalink / raw)
To: Tengda Wu, song
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, kan.liang, linux-perf-users, linux-kernel, bpf
On Wed, Sep 25, 2024 at 08:43:50PM -0700, Namhyung Kim wrote:
> Hello,
>
> On Wed, Sep 25, 2024 at 10:16:16PM +0800, Tengda Wu wrote:
> > Hello,
> >
> > Sorry for pinging again. Is there any other suggestion with this patch set?
> > If there is, please let me know.
>
> Sorry I was traveling last week. I think it's good now.
>
> Song, can I get your ack?
He seems to be very busy. I'll pick this up and run some tests.
Thanks,
Namhyung
>
> >
> > On 2024/9/16 9:43, Tengda Wu wrote:
> > > Hi,
> > >
> > > Here is the 3th version of the series to support inherit events for bperf.
> > > This version add pid or tgid selection based on filter type in new_task
> > > prog to avoid memory waste and potential count loss.
> > >
> > >
> > > bperf (perf-stat --bpf-counter) has not supported inherit events
> > > during fork() since it was first introduced.
> > >
> > > This patch series tries to add this support by:
> > > 1) adding two new bpf programs to monitor task lifecycle;
> > > 2) recording new tasks in the filter map dynamically;
> > > 3) reusing `accum_key` of parent task for new tasks.
> > >
> > > Thanks,
> > > Tengda
> > >
> > >
> > > Changelog:
> > > ---------
> > > v3: (Address comments from Namhyung, thanks)
> > > * Use pid or tgid based on filter type in new_task prog
> > > * Add comments to explain pid usage for TGID type in exit_task prog
> > >
> > > v2: https://lore.kernel.org/all/20240905115918.772234-1-wutengda@huaweicloud.com/
> > > * Remove the unused init_filter_entries in follower bpf, declare
> > > a global filter_entry_count in bpf_counter instead
> > > * Attach on_newtask and on_exittask progs only if the filter type
> > > is either PID or TGID
> > >
> > > v1: https://lore.kernel.org/all/20240904123103.732507-1-wutengda@huaweicloud.com/
> > >
> > >
> > > Tengda Wu (2):
> > > perf stat: Support inherit events during fork() for bperf
> > > perf test: Use sqrtloop workload to test bperf event
> > >
> > > tools/perf/tests/shell/stat_bpf_counters.sh | 2 +-
> > > tools/perf/util/bpf_counter.c | 32 +++++--
> > > tools/perf/util/bpf_skel/bperf_follower.bpf.c | 87 +++++++++++++++++--
> > > tools/perf/util/bpf_skel/bperf_u.h | 5 ++
> > > 4 files changed, 112 insertions(+), 14 deletions(-)
> > >
> >
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH -next v3 1/2] perf stat: Support inherit events during fork() for bperf
2024-09-16 1:43 ` [PATCH -next v3 1/2] perf stat: Support inherit events during fork() " Tengda Wu
@ 2024-10-09 17:18 ` Song Liu
2024-10-10 0:31 ` Namhyung Kim
0 siblings, 1 reply; 13+ messages in thread
From: Song Liu @ 2024-10-09 17:18 UTC (permalink / raw)
To: Tengda Wu
Cc: Peter Zijlstra, Namhyung Kim, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Ian Rogers, Adrian Hunter, kan.liang, linux-perf-users,
linux-kernel, bpf
On Sun, Sep 15, 2024 at 6:53 PM Tengda Wu <wutengda@huaweicloud.com> wrote:
>
> bperf has a nice ability to share PMUs, but it still does not support
> inherit events during fork(), resulting in some deviations in its stat
> results compared with perf.
>
> perf stat result:
> $ ./perf stat -e cycles,instructions -- ./perf test -w sqrtloop
>
> Performance counter stats for './perf test -w sqrtloop':
>
> 2,316,038,116 cycles
> 2,859,350,725 instructions
>
> 1.009603637 seconds time elapsed
>
> 1.004196000 seconds user
> 0.003950000 seconds sys
>
> bperf stat result:
> $ ./perf stat --bpf-counters -e cycles,instructions -- \
> ./perf test -w sqrtloop
>
> Performance counter stats for './perf test -w sqrtloop':
>
> 18,762,093 cycles
> 23,487,766 instructions
>
> 1.008913769 seconds time elapsed
>
> 1.003248000 seconds user
> 0.004069000 seconds sys
>
> In order to support event inheritance, two new bpf programs are added
> to monitor the fork and exit of tasks respectively. When a task is
> created, add it to the filter map to enable counting, and reuse the
> `accum_key` of its parent task to count together with the parent task.
> When a task exits, remove it from the filter map to disable counting.
>
> After support:
> $ ./perf stat --bpf-counters -e cycles,instructions -- \
> ./perf test -w sqrtloop
>
> Performance counter stats for './perf test -w sqrtloop':
>
> 2,316,252,189 cycles
> 2,859,946,547 instructions
>
> 1.009422314 seconds time elapsed
>
> 1.003597000 seconds user
> 0.004270000 seconds sys
>
> Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
The solution looks good to me. Question on the UI: do we always
want the inherit behavior from PID and TGID monitoring? If not,
maybe we should add a flag for it. (I think we do need the flag).
One nitpick below.
Thanks,
Song
[...]
>
> +struct bperf_filter_value {
> + __u32 accum_key;
> + __u8 exited;
> +};
nit:
Can we use a special value of accum_key to replace exited==1
case?
> +
> #endif /* __BPERF_STAT_U_H */
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH -next v3 0/2] perf stat: Support inherit events for bperf
2024-10-09 5:21 ` Namhyung Kim
@ 2024-10-09 17:21 ` Song Liu
2024-10-10 0:22 ` Namhyung Kim
0 siblings, 1 reply; 13+ messages in thread
From: Song Liu @ 2024-10-09 17:21 UTC (permalink / raw)
To: Namhyung Kim
Cc: Tengda Wu, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, kan.liang, linux-perf-users, linux-kernel, bpf
Hi Namhyung,
On Tue, Oct 8, 2024 at 10:21 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Wed, Sep 25, 2024 at 08:43:50PM -0700, Namhyung Kim wrote:
> > Hello,
> >
> > On Wed, Sep 25, 2024 at 10:16:16PM +0800, Tengda Wu wrote:
> > > Hello,
> > >
> > > Sorry for pinging again. Is there any other suggestion with this patch set?
> > > If there is, please let me know.
> >
> > Sorry I was traveling last week. I think it's good now.
> >
> > Song, can I get your ack?
>
> He seems to be very busy. I'll pick this up and run some tests.
Sorry for the late reply. I somehow missed the earlier email.
I have a question and a nitpick for 1/2. Otherwise, this looks good
to me.
Thanks,
Song
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH -next v3 0/2] perf stat: Support inherit events for bperf
2024-10-09 17:21 ` Song Liu
@ 2024-10-10 0:22 ` Namhyung Kim
0 siblings, 0 replies; 13+ messages in thread
From: Namhyung Kim @ 2024-10-10 0:22 UTC (permalink / raw)
To: Song Liu
Cc: Tengda Wu, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, kan.liang, linux-perf-users, linux-kernel, bpf
Hi Song,
On Wed, Oct 09, 2024 at 10:21:05AM -0700, Song Liu wrote:
> Hi Namhyung,
>
> On Tue, Oct 8, 2024 at 10:21 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Wed, Sep 25, 2024 at 08:43:50PM -0700, Namhyung Kim wrote:
> > > Hello,
> > >
> > > On Wed, Sep 25, 2024 at 10:16:16PM +0800, Tengda Wu wrote:
> > > > Hello,
> > > >
> > > > Sorry for pinging again. Is there any other suggestion with this patch set?
> > > > If there is, please let me know.
> > >
> > > Sorry I was traveling last week. I think it's good now.
> > >
> > > Song, can I get your ack?
> >
> > He seems to be very busy. I'll pick this up and run some tests.
>
> Sorry for the late reply. I somehow missed the earlier email.
No worries.
>
> I have a question and a nitpick for 1/2. Otherwise, this looks good
> to me.
Thanks for your reivew!
Namhyung
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH -next v3 1/2] perf stat: Support inherit events during fork() for bperf
2024-10-09 17:18 ` Song Liu
@ 2024-10-10 0:31 ` Namhyung Kim
2024-10-10 4:53 ` Tengda Wu
0 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2024-10-10 0:31 UTC (permalink / raw)
To: Song Liu
Cc: Tengda Wu, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, kan.liang, linux-perf-users, linux-kernel, bpf
On Wed, Oct 09, 2024 at 10:18:44AM -0700, Song Liu wrote:
> On Sun, Sep 15, 2024 at 6:53 PM Tengda Wu <wutengda@huaweicloud.com> wrote:
> >
> > bperf has a nice ability to share PMUs, but it still does not support
> > inherit events during fork(), resulting in some deviations in its stat
> > results compared with perf.
> >
> > perf stat result:
> > $ ./perf stat -e cycles,instructions -- ./perf test -w sqrtloop
> >
> > Performance counter stats for './perf test -w sqrtloop':
> >
> > 2,316,038,116 cycles
> > 2,859,350,725 instructions
> >
> > 1.009603637 seconds time elapsed
> >
> > 1.004196000 seconds user
> > 0.003950000 seconds sys
> >
> > bperf stat result:
> > $ ./perf stat --bpf-counters -e cycles,instructions -- \
> > ./perf test -w sqrtloop
> >
> > Performance counter stats for './perf test -w sqrtloop':
> >
> > 18,762,093 cycles
> > 23,487,766 instructions
> >
> > 1.008913769 seconds time elapsed
> >
> > 1.003248000 seconds user
> > 0.004069000 seconds sys
> >
> > In order to support event inheritance, two new bpf programs are added
> > to monitor the fork and exit of tasks respectively. When a task is
> > created, add it to the filter map to enable counting, and reuse the
> > `accum_key` of its parent task to count together with the parent task.
> > When a task exits, remove it from the filter map to disable counting.
> >
> > After support:
> > $ ./perf stat --bpf-counters -e cycles,instructions -- \
> > ./perf test -w sqrtloop
> >
> > Performance counter stats for './perf test -w sqrtloop':
> >
> > 2,316,252,189 cycles
> > 2,859,946,547 instructions
> >
> > 1.009422314 seconds time elapsed
> >
> > 1.003597000 seconds user
> > 0.004270000 seconds sys
> >
> > Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
>
> The solution looks good to me. Question on the UI: do we always
> want the inherit behavior from PID and TGID monitoring? If not,
> maybe we should add a flag for it. (I think we do need the flag).
I think it should depend on the value of attr.inherit. Maybe we can
disable the autoload for !inherit.
>
> One nitpick below.
>
> Thanks,
> Song
>
> [...]
> >
> > +struct bperf_filter_value {
> > + __u32 accum_key;
> > + __u8 exited;
> > +};
> nit:
> Can we use a special value of accum_key to replace exited==1
> case?
I'm not sure. I guess it still needs to use the accum_key to save the
final value when exited = 1.
Thanks,
Namhyung
>
> > +
> > #endif /* __BPERF_STAT_U_H */
> > --
> > 2.34.1
> >
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH -next v3 1/2] perf stat: Support inherit events during fork() for bperf
2024-10-10 0:31 ` Namhyung Kim
@ 2024-10-10 4:53 ` Tengda Wu
2024-10-11 3:07 ` Tengda Wu
0 siblings, 1 reply; 13+ messages in thread
From: Tengda Wu @ 2024-10-10 4:53 UTC (permalink / raw)
To: Namhyung Kim, Song Liu
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, kan.liang, linux-perf-users, linux-kernel, bpf
On 2024/10/10 8:31, Namhyung Kim wrote:
> On Wed, Oct 09, 2024 at 10:18:44AM -0700, Song Liu wrote:
>> On Sun, Sep 15, 2024 at 6:53 PM Tengda Wu <wutengda@huaweicloud.com> wrote:
>>>
>>> bperf has a nice ability to share PMUs, but it still does not support
>>> inherit events during fork(), resulting in some deviations in its stat
>>> results compared with perf.
>>>
>>> perf stat result:
>>> $ ./perf stat -e cycles,instructions -- ./perf test -w sqrtloop
>>>
>>> Performance counter stats for './perf test -w sqrtloop':
>>>
>>> 2,316,038,116 cycles
>>> 2,859,350,725 instructions
>>>
>>> 1.009603637 seconds time elapsed
>>>
>>> 1.004196000 seconds user
>>> 0.003950000 seconds sys
>>>
>>> bperf stat result:
>>> $ ./perf stat --bpf-counters -e cycles,instructions -- \
>>> ./perf test -w sqrtloop
>>>
>>> Performance counter stats for './perf test -w sqrtloop':
>>>
>>> 18,762,093 cycles
>>> 23,487,766 instructions
>>>
>>> 1.008913769 seconds time elapsed
>>>
>>> 1.003248000 seconds user
>>> 0.004069000 seconds sys
>>>
>>> In order to support event inheritance, two new bpf programs are added
>>> to monitor the fork and exit of tasks respectively. When a task is
>>> created, add it to the filter map to enable counting, and reuse the
>>> `accum_key` of its parent task to count together with the parent task.
>>> When a task exits, remove it from the filter map to disable counting.
>>>
>>> After support:
>>> $ ./perf stat --bpf-counters -e cycles,instructions -- \
>>> ./perf test -w sqrtloop
>>>
>>> Performance counter stats for './perf test -w sqrtloop':
>>>
>>> 2,316,252,189 cycles
>>> 2,859,946,547 instructions
>>>
>>> 1.009422314 seconds time elapsed
>>>
>>> 1.003597000 seconds user
>>> 0.004270000 seconds sys
>>>
>>> Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
>>
>> The solution looks good to me. Question on the UI: do we always
>> want the inherit behavior from PID and TGID monitoring? If not,
>> maybe we should add a flag for it. (I think we do need the flag).
>
> I think it should depend on the value of attr.inherit. Maybe we can
> disable the autoload for !inherit.
>
Got it. The attr.inherit flag(related to --no-inherit in perf command)
is suitable for controlling inherit behavior. I will fix it. Thanks!
>>
>> One nitpick below.
>>
>> Thanks,
>> Song
>>
>> [...]
>>>
>>> +struct bperf_filter_value {
>>> + __u32 accum_key;
>>> + __u8 exited;
>>> +};
>> nit:
>> Can we use a special value of accum_key to replace exited==1
>> case?
>
> I'm not sure. I guess it still needs to use the accum_key to save the
> final value when exited = 1.
In theory, it is possible. The accum_key is currently only used to index value
in accum_readings map, so if the task is not being counted, the accum_key can
be set to an special value.
Due to accum_key is of u32 type, there are two special values to choose from: 0
or max_entries+1. I think the latter, max_entries+1, may be more suitable because
it can avoid memory waste in the accum_readings map and does not require too
many changes to bpf_counter.
Thanks for your kindly review!
Tengda
>
> Thanks,
> Namhyung
>
>>
>>> +
>>> #endif /* __BPERF_STAT_U_H */
>>> --
>>> 2.34.1
>>>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH -next v3 1/2] perf stat: Support inherit events during fork() for bperf
2024-10-10 4:53 ` Tengda Wu
@ 2024-10-11 3:07 ` Tengda Wu
2024-10-11 3:21 ` Song Liu
0 siblings, 1 reply; 13+ messages in thread
From: Tengda Wu @ 2024-10-11 3:07 UTC (permalink / raw)
To: Namhyung Kim, Song Liu
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, kan.liang, linux-perf-users, linux-kernel, bpf
On 2024/10/10 12:53, Tengda Wu wrote:
>
>
> On 2024/10/10 8:31, Namhyung Kim wrote:
>> On Wed, Oct 09, 2024 at 10:18:44AM -0700, Song Liu wrote:
>>> On Sun, Sep 15, 2024 at 6:53 PM Tengda Wu <wutengda@huaweicloud.com> wrote:
>>>>
>>>> bperf has a nice ability to share PMUs, but it still does not support
>>>> inherit events during fork(), resulting in some deviations in its stat
>>>> results compared with perf.
>>>>
>>>> perf stat result:
>>>> $ ./perf stat -e cycles,instructions -- ./perf test -w sqrtloop
>>>>
>>>> Performance counter stats for './perf test -w sqrtloop':
>>>>
>>>> 2,316,038,116 cycles
>>>> 2,859,350,725 instructions
>>>>
>>>> 1.009603637 seconds time elapsed
>>>>
>>>> 1.004196000 seconds user
>>>> 0.003950000 seconds sys
>>>>
>>>> bperf stat result:
>>>> $ ./perf stat --bpf-counters -e cycles,instructions -- \
>>>> ./perf test -w sqrtloop
>>>>
>>>> Performance counter stats for './perf test -w sqrtloop':
>>>>
>>>> 18,762,093 cycles
>>>> 23,487,766 instructions
>>>>
>>>> 1.008913769 seconds time elapsed
>>>>
>>>> 1.003248000 seconds user
>>>> 0.004069000 seconds sys
>>>>
>>>> In order to support event inheritance, two new bpf programs are added
>>>> to monitor the fork and exit of tasks respectively. When a task is
>>>> created, add it to the filter map to enable counting, and reuse the
>>>> `accum_key` of its parent task to count together with the parent task.
>>>> When a task exits, remove it from the filter map to disable counting.
>>>>
>>>> After support:
>>>> $ ./perf stat --bpf-counters -e cycles,instructions -- \
>>>> ./perf test -w sqrtloop
>>>>
>>>> Performance counter stats for './perf test -w sqrtloop':
>>>>
>>>> 2,316,252,189 cycles
>>>> 2,859,946,547 instructions
>>>>
>>>> 1.009422314 seconds time elapsed
>>>>
>>>> 1.003597000 seconds user
>>>> 0.004270000 seconds sys
>>>>
>>>> Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
>>>
>>> The solution looks good to me. Question on the UI: do we always
>>> want the inherit behavior from PID and TGID monitoring? If not,
>>> maybe we should add a flag for it. (I think we do need the flag).
>>
>> I think it should depend on the value of attr.inherit. Maybe we can
>> disable the autoload for !inherit.
>>
>
> Got it. The attr.inherit flag(related to --no-inherit in perf command)
> is suitable for controlling inherit behavior. I will fix it. Thanks!
>
>>>
>>> One nitpick below.
>>>
>>> Thanks,
>>> Song
>>>
>>> [...]
>>>>
>>>> +struct bperf_filter_value {
>>>> + __u32 accum_key;
>>>> + __u8 exited;
>>>> +};
>>> nit:
>>> Can we use a special value of accum_key to replace exited==1
>>> case?
>>
>> I'm not sure. I guess it still needs to use the accum_key to save the
>> final value when exited = 1.
>
> In theory, it is possible. The accum_key is currently only used to index value
> in accum_readings map, so if the task is not being counted, the accum_key can
> be set to an special value.
>
> Due to accum_key is of u32 type, there are two special values to choose from: 0
> or max_entries+1. I think the latter, max_entries+1, may be more suitable because
> it can avoid memory waste in the accum_readings map and does not require too
> many changes to bpf_counter.
>
Sorry, I was wrong. As Namhyung said, 'accum_readings[accum_key]' saves the
last count of the task when it exits. If accum_key is set to a special value
at this time, the count will be lost.
So exited==1 is necessary, we can not use a special value of accum_key to
replace it.
Thanks,
Tengda
>
>>
>> Thanks,
>> Namhyung
>>
>>>
>>>> +
>>>> #endif /* __BPERF_STAT_U_H */
>>>> --
>>>> 2.34.1
>>>>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH -next v3 1/2] perf stat: Support inherit events during fork() for bperf
2024-10-11 3:07 ` Tengda Wu
@ 2024-10-11 3:21 ` Song Liu
0 siblings, 0 replies; 13+ messages in thread
From: Song Liu @ 2024-10-11 3:21 UTC (permalink / raw)
To: Tengda Wu
Cc: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Ian Rogers, Adrian Hunter, kan.liang, linux-perf-users,
linux-kernel, bpf
On Thu, Oct 10, 2024 at 8:07 PM Tengda Wu <wutengda@huaweicloud.com> wrote:
[...]
> >>>>
> >>>> +struct bperf_filter_value {
> >>>> + __u32 accum_key;
> >>>> + __u8 exited;
> >>>> +};
> >>> nit:
> >>> Can we use a special value of accum_key to replace exited==1
> >>> case?
> >>
> >> I'm not sure. I guess it still needs to use the accum_key to save the
> >> final value when exited = 1.
> >
> > In theory, it is possible. The accum_key is currently only used to index value
> > in accum_readings map, so if the task is not being counted, the accum_key can
> > be set to an special value.
> >
> > Due to accum_key is of u32 type, there are two special values to choose from: 0
> > or max_entries+1. I think the latter, max_entries+1, may be more suitable because
> > it can avoid memory waste in the accum_readings map and does not require too
> > many changes to bpf_counter.
> >
>
> Sorry, I was wrong. As Namhyung said, 'accum_readings[accum_key]' saves the
> last count of the task when it exits. If accum_key is set to a special value
> at this time, the count will be lost.
>
> So exited==1 is necessary, we can not use a special value of accum_key to
> replace it.
Got it. Thanks for the explanation.
Song
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2024-10-11 3:21 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-16 1:43 [PATCH -next v3 0/2] perf stat: Support inherit events for bperf Tengda Wu
2024-09-16 1:43 ` [PATCH -next v3 1/2] perf stat: Support inherit events during fork() " Tengda Wu
2024-10-09 17:18 ` Song Liu
2024-10-10 0:31 ` Namhyung Kim
2024-10-10 4:53 ` Tengda Wu
2024-10-11 3:07 ` Tengda Wu
2024-10-11 3:21 ` Song Liu
2024-09-16 1:43 ` [PATCH -next v3 2/2] perf test: Use sqrtloop workload to test bperf event Tengda Wu
2024-09-25 14:16 ` [PATCH -next v3 0/2] perf stat: Support inherit events for bperf Tengda Wu
2024-09-26 3:43 ` Namhyung Kim
2024-10-09 5:21 ` Namhyung Kim
2024-10-09 17:21 ` Song Liu
2024-10-10 0:22 ` Namhyung Kim
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).