* [PATCHSET 0/5] perf tools: Constify BPF control data properly (v1)
@ 2024-09-02 20:05 Namhyung Kim
  2024-09-02 20:05 ` [PATCH 1/5] perf stat: Constify control data for BPF Namhyung Kim
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Namhyung Kim @ 2024-09-02 20:05 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Song Liu, bpf

Hello,

I've realized that some control data (usually for filter actions)
should be defined as 'const volatile' so that it can be passed to the
BPF core and optimized properly (e.g. via dead code elimination).

Convert the existing code that follows this pattern.
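
For reference, the pattern is roughly the following (illustrative
names, not code from the tree).  On the BPF side the knob moves from a
plain global in .bss to a 'const volatile' global in .rodata:

	/* before: lives in .bss, opaque to the verifier */
	int has_filter = 0;

	/* after: lives in .rodata, a known constant at verification time */
	const volatile int has_filter = 0;

and the loader sets it through the skeleton between open and load:

	skel = sample_bpf__open();	/* 'sample' is a hypothetical skeleton */
	skel->rodata->has_filter = 1;	/* must happen before load */
	sample_bpf__load(skel);		/* .rodata is frozen here */

Once the read-only map is frozen, the verifier treats loads from it as
constants and can drop branches guarded by the knob.  The 'volatile'
keeps clang from constant-folding the initializer at compile time, so
the value written at load time is the one that takes effect.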

Thanks,
Namhyung


Namhyung Kim (5):
  perf stat: Constify control data for BPF
  perf ftrace latency: Constify control data for BPF
  perf kwork: Constify control data for BPF
  perf lock contention: Constify control data for BPF
  perf record offcpu: Constify control data for BPF

 tools/perf/util/bpf_counter_cgroup.c          |  6 +--
 tools/perf/util/bpf_ftrace.c                  |  8 ++--
 tools/perf/util/bpf_kwork.c                   |  9 ++--
 tools/perf/util/bpf_kwork_top.c               |  7 +--
 tools/perf/util/bpf_lock_contention.c         | 45 ++++++++++---------
 tools/perf/util/bpf_off_cpu.c                 | 16 +++----
 tools/perf/util/bpf_skel/bperf_cgroup.bpf.c   |  2 +-
 tools/perf/util/bpf_skel/func_latency.bpf.c   |  7 +--
 tools/perf/util/bpf_skel/kwork_top.bpf.c      |  2 +-
 tools/perf/util/bpf_skel/kwork_trace.bpf.c    |  5 ++-
 .../perf/util/bpf_skel/lock_contention.bpf.c  | 27 +++++------
 tools/perf/util/bpf_skel/off_cpu.bpf.c        |  9 ++--
 12 files changed, 76 insertions(+), 67 deletions(-)

-- 
2.46.0.469.g59c65b2a67-goog



* [PATCH 1/5] perf stat: Constify control data for BPF
  2024-09-02 20:05 [PATCHSET 0/5] perf tools: Constify BPF control data properly (v1) Namhyung Kim
@ 2024-09-02 20:05 ` Namhyung Kim
  2024-09-02 20:05 ` [PATCH 2/5] perf ftrace latency: " Namhyung Kim
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Namhyung Kim @ 2024-09-02 20:05 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Song Liu, bpf

The control knobs set before loading BPF programs should be declared
as 'const volatile' so that they can be optimized by the BPF core.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
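Not part of the change itself, just a sketch of what the conversion
buys.  With use_cgroup_v2 frozen in .rodata at load time, a guard like
the one in bperf_cgroup.bpf.c (helper names simplified here):

	if (use_cgroup_v2)
		return get_cgroup_v2_id(cgrp);	/* hypothetical helper */
	return get_cgroup_v1_ancestor_id(cgrp, level);

is resolved during verification: the verifier sees a known constant
and prunes the arm that can never execute, instead of verifying and
keeping both paths as it must for a plain .bss variable.
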
 tools/perf/util/bpf_counter_cgroup.c        | 6 +++---
 tools/perf/util/bpf_skel/bperf_cgroup.bpf.c | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/bpf_counter_cgroup.c b/tools/perf/util/bpf_counter_cgroup.c
index ea29c372f339705d..6ff42619de12bddf 100644
--- a/tools/perf/util/bpf_counter_cgroup.c
+++ b/tools/perf/util/bpf_counter_cgroup.c
@@ -61,6 +61,9 @@ static int bperf_load_program(struct evlist *evlist)
 	skel->rodata->num_cpus = total_cpus;
 	skel->rodata->num_events = evlist->core.nr_entries / nr_cgroups;
 
+	if (cgroup_is_v2("perf_event") > 0)
+		skel->rodata->use_cgroup_v2 = 1;
+
 	BUG_ON(evlist->core.nr_entries % nr_cgroups != 0);
 
 	/* we need one copy of events per cpu for reading */
@@ -82,9 +85,6 @@ static int bperf_load_program(struct evlist *evlist)
 		goto out;
 	}
 
-	if (cgroup_is_v2("perf_event") > 0)
-		skel->bss->use_cgroup_v2 = 1;
-
 	err = -1;
 
 	cgrp_switch = evsel__new(&cgrp_switch_attr);
diff --git a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
index 6a438e0102c5a2cb..57cab7647a9ad766 100644
--- a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
+++ b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
@@ -57,9 +57,9 @@ struct cgroup___old {
 
 const volatile __u32 num_events = 1;
 const volatile __u32 num_cpus = 1;
+const volatile int use_cgroup_v2 = 0;
 
 int enabled = 0;
-int use_cgroup_v2 = 0;
 int perf_subsys_id = -1;
 
 static inline __u64 get_cgroup_v1_ancestor_id(struct cgroup *cgrp, int level)
-- 
2.46.0.469.g59c65b2a67-goog



* [PATCH 2/5] perf ftrace latency: Constify control data for BPF
  2024-09-02 20:05 [PATCHSET 0/5] perf tools: Constify BPF control data properly (v1) Namhyung Kim
  2024-09-02 20:05 ` [PATCH 1/5] perf stat: Constify control data for BPF Namhyung Kim
@ 2024-09-02 20:05 ` Namhyung Kim
  2024-09-02 20:05 ` [PATCH 3/5] perf kwork: " Namhyung Kim
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Namhyung Kim @ 2024-09-02 20:05 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Song Liu, bpf

The control knobs set before loading BPF programs should be declared
as 'const volatile' so that they can be optimized by the BPF core.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
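A note on ordering, as a sketch (skeleton calls as generated by
libbpf): .rodata values have to be written between skeleton open and
load, while .bss values such as 'enabled' can still be flipped at
runtime:

	skel = func_latency_bpf__open();
	skel->rodata->use_nsec = ftrace->use_nsec; /* before load only */
	func_latency_bpf__load(skel);              /* .rodata frozen   */
	/* ... attach ... */
	skel->bss->enabled = 1;                    /* ok after load    */

That is why the has_cpu/has_task/use_nsec assignments move up, ahead
of the func_latency_bpf__load() call.
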
 tools/perf/util/bpf_ftrace.c                | 8 ++++----
 tools/perf/util/bpf_skel/func_latency.bpf.c | 7 ++++---
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/bpf_ftrace.c b/tools/perf/util/bpf_ftrace.c
index 7a4297d8fd2ce925..06d1c4018407a265 100644
--- a/tools/perf/util/bpf_ftrace.c
+++ b/tools/perf/util/bpf_ftrace.c
@@ -40,13 +40,17 @@ int perf_ftrace__latency_prepare_bpf(struct perf_ftrace *ftrace)
 	if (ftrace->target.cpu_list) {
 		ncpus = perf_cpu_map__nr(ftrace->evlist->core.user_requested_cpus);
 		bpf_map__set_max_entries(skel->maps.cpu_filter, ncpus);
+		skel->rodata->has_cpu = 1;
 	}
 
 	if (target__has_task(&ftrace->target) || target__none(&ftrace->target)) {
 		ntasks = perf_thread_map__nr(ftrace->evlist->core.threads);
 		bpf_map__set_max_entries(skel->maps.task_filter, ntasks);
+		skel->rodata->has_task = 1;
 	}
 
+	skel->rodata->use_nsec = ftrace->use_nsec;
+
 	set_max_rlimit();
 
 	err = func_latency_bpf__load(skel);
@@ -59,7 +63,6 @@ int perf_ftrace__latency_prepare_bpf(struct perf_ftrace *ftrace)
 		u32 cpu;
 		u8 val = 1;
 
-		skel->bss->has_cpu = 1;
 		fd = bpf_map__fd(skel->maps.cpu_filter);
 
 		for (i = 0; i < ncpus; i++) {
@@ -72,7 +75,6 @@ int perf_ftrace__latency_prepare_bpf(struct perf_ftrace *ftrace)
 		u32 pid;
 		u8 val = 1;
 
-		skel->bss->has_task = 1;
 		fd = bpf_map__fd(skel->maps.task_filter);
 
 		for (i = 0; i < ntasks; i++) {
@@ -81,8 +83,6 @@ int perf_ftrace__latency_prepare_bpf(struct perf_ftrace *ftrace)
 		}
 	}
 
-	skel->bss->use_nsec = ftrace->use_nsec;
-
 	skel->links.func_begin = bpf_program__attach_kprobe(skel->progs.func_begin,
 							    false, func->name);
 	if (IS_ERR(skel->links.func_begin)) {
diff --git a/tools/perf/util/bpf_skel/func_latency.bpf.c b/tools/perf/util/bpf_skel/func_latency.bpf.c
index 9d01e3af747922ca..f613dc9cb123480c 100644
--- a/tools/perf/util/bpf_skel/func_latency.bpf.c
+++ b/tools/perf/util/bpf_skel/func_latency.bpf.c
@@ -37,9 +37,10 @@ struct {
 
 
 int enabled = 0;
-int has_cpu = 0;
-int has_task = 0;
-int use_nsec = 0;
+
+const volatile int has_cpu = 0;
+const volatile int has_task = 0;
+const volatile int use_nsec = 0;
 
 SEC("kprobe/func")
 int BPF_PROG(func_begin)
-- 
2.46.0.469.g59c65b2a67-goog



* [PATCH 3/5] perf kwork: Constify control data for BPF
  2024-09-02 20:05 [PATCHSET 0/5] perf tools: Constify BPF control data properly (v1) Namhyung Kim
  2024-09-02 20:05 ` [PATCH 1/5] perf stat: Constify control data for BPF Namhyung Kim
  2024-09-02 20:05 ` [PATCH 2/5] perf ftrace latency: " Namhyung Kim
@ 2024-09-02 20:05 ` Namhyung Kim
  2024-09-02 20:05 ` [PATCH 4/5] perf lock contention: " Namhyung Kim
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Namhyung Kim @ 2024-09-02 20:05 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Song Liu, bpf, Yang Jihong

The control knobs set before loading BPF programs should be declared
as 'const volatile' so that they can be optimized by the BPF core.

Cc: Yang Jihong <yangjihong@bytedance.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
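For context, these knobs guard the filter lookups in the BPF programs.
A simplified sketch of the shape (names approximate kwork_trace.bpf.c):

	if (has_cpu_filter) {
		__u32 cpu = bpf_get_smp_processor_id();

		if (!bpf_map_lookup_elem(&perf_kwork_cpu_filter, &cpu))
			return 0;
	}

With has_cpu_filter now a frozen .rodata constant, a session without a
CPU filter gets the whole block eliminated as dead code rather than
paying for a map lookup on every event.
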
 tools/perf/util/bpf_kwork.c                | 9 +++++----
 tools/perf/util/bpf_kwork_top.c            | 7 ++++---
 tools/perf/util/bpf_skel/kwork_top.bpf.c   | 2 +-
 tools/perf/util/bpf_skel/kwork_trace.bpf.c | 5 +++--
 4 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/bpf_kwork.c b/tools/perf/util/bpf_kwork.c
index 44f0f708a15d6a14..6c7126b7670dd0c9 100644
--- a/tools/perf/util/bpf_kwork.c
+++ b/tools/perf/util/bpf_kwork.c
@@ -176,8 +176,6 @@ static int setup_filters(struct perf_kwork *kwork)
 			bpf_map_update_elem(fd, &cpu.cpu, &val, BPF_ANY);
 		}
 		perf_cpu_map__put(map);
-
-		skel->bss->has_cpu_filter = 1;
 	}
 
 	if (kwork->profile_name != NULL) {
@@ -197,8 +195,6 @@ static int setup_filters(struct perf_kwork *kwork)
 
 		key = 0;
 		bpf_map_update_elem(fd, &key, kwork->profile_name, BPF_ANY);
-
-		skel->bss->has_name_filter = 1;
 	}
 
 	return 0;
@@ -239,6 +235,11 @@ int perf_kwork__trace_prepare_bpf(struct perf_kwork *kwork)
 			class_bpf->load_prepare(kwork);
 	}
 
+	if (kwork->cpu_list != NULL)
+		skel->rodata->has_cpu_filter = 1;
+	if (kwork->profile_name != NULL)
+		skel->rodata->has_name_filter = 1;
+
 	if (kwork_trace_bpf__load(skel)) {
 		pr_debug("Failed to load kwork trace skeleton\n");
 		goto out;
diff --git a/tools/perf/util/bpf_kwork_top.c b/tools/perf/util/bpf_kwork_top.c
index 22a3b00a1e23f929..7261cad43468d7a4 100644
--- a/tools/perf/util/bpf_kwork_top.c
+++ b/tools/perf/util/bpf_kwork_top.c
@@ -151,14 +151,12 @@ static int setup_filters(struct perf_kwork *kwork)
 			bpf_map_update_elem(fd, &cpu.cpu, &val, BPF_ANY);
 		}
 		perf_cpu_map__put(map);
-
-		skel->bss->has_cpu_filter = 1;
 	}
 
 	return 0;
 }
 
-int perf_kwork__top_prepare_bpf(struct perf_kwork *kwork __maybe_unused)
+int perf_kwork__top_prepare_bpf(struct perf_kwork *kwork)
 {
 	struct bpf_program *prog;
 	struct kwork_class *class;
@@ -193,6 +191,9 @@ int perf_kwork__top_prepare_bpf(struct perf_kwork *kwork __maybe_unused)
 			class_bpf->load_prepare();
 	}
 
+	if (kwork->cpu_list)
+		skel->rodata->has_cpu_filter = 1;
+
 	if (kwork_top_bpf__load(skel)) {
 		pr_debug("Failed to load kwork top skeleton\n");
 		goto out;
diff --git a/tools/perf/util/bpf_skel/kwork_top.bpf.c b/tools/perf/util/bpf_skel/kwork_top.bpf.c
index 84c15ccbab44ccc9..594da91965a2f6c9 100644
--- a/tools/perf/util/bpf_skel/kwork_top.bpf.c
+++ b/tools/perf/util/bpf_skel/kwork_top.bpf.c
@@ -84,7 +84,7 @@ struct {
 
 int enabled = 0;
 
-int has_cpu_filter = 0;
+const volatile int has_cpu_filter = 0;
 
 __u64 from_timestamp = 0;
 __u64 to_timestamp = 0;
diff --git a/tools/perf/util/bpf_skel/kwork_trace.bpf.c b/tools/perf/util/bpf_skel/kwork_trace.bpf.c
index 063c124e099938ed..cbd79bc4b3302d81 100644
--- a/tools/perf/util/bpf_skel/kwork_trace.bpf.c
+++ b/tools/perf/util/bpf_skel/kwork_trace.bpf.c
@@ -68,8 +68,9 @@ struct {
 } perf_kwork_name_filter SEC(".maps");
 
 int enabled = 0;
-int has_cpu_filter = 0;
-int has_name_filter = 0;
+
+const volatile int has_cpu_filter = 0;
+const volatile int has_name_filter = 0;
 
 static __always_inline int local_strncmp(const char *s1,
 					 unsigned int sz, const char *s2)
-- 
2.46.0.469.g59c65b2a67-goog



* [PATCH 4/5] perf lock contention: Constify control data for BPF
  2024-09-02 20:05 [PATCHSET 0/5] perf tools: Constify BPF control data properly (v1) Namhyung Kim
                   ` (2 preceding siblings ...)
  2024-09-02 20:05 ` [PATCH 3/5] perf kwork: " Namhyung Kim
@ 2024-09-02 20:05 ` Namhyung Kim
  2024-09-02 20:05 ` [PATCH 5/5] perf record offcpu: " Namhyung Kim
  2024-09-03 15:18 ` [PATCHSET 0/5] perf tools: Constify BPF control data properly (v1) Arnaldo Carvalho de Melo
  5 siblings, 0 replies; 7+ messages in thread
From: Namhyung Kim @ 2024-09-02 20:05 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Song Liu, bpf

The control knobs set before loading BPF programs should be declared
as 'const volatile' so that they can be optimized by the BPF core.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
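Beyond the boolean filters, aggr_mode becoming a load-time constant
means the per-mode key selection collapses during verification.  A
rough sketch, not the exact tree code:

	switch (aggr_mode) {
	case LOCK_AGGR_TASK:
		key = pid;
		break;
	case LOCK_AGGR_CALLER:
		key = stack_id;	/* hypothetical variable names */
		break;
	/* ... */
	}

Only the arm for the chosen mode survives; the rest is pruned as dead
code once the read-only map is frozen.
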
 tools/perf/util/bpf_lock_contention.c         | 45 ++++++++++---------
 .../perf/util/bpf_skel/lock_contention.bpf.c  | 27 +++++------
 2 files changed, 38 insertions(+), 34 deletions(-)

diff --git a/tools/perf/util/bpf_lock_contention.c b/tools/perf/util/bpf_lock_contention.c
index bc4e92c0c08b8b20..41a1ad08789511c3 100644
--- a/tools/perf/util/bpf_lock_contention.c
+++ b/tools/perf/util/bpf_lock_contention.c
@@ -46,14 +46,22 @@ int lock_contention_prepare(struct lock_contention *con)
 	else
 		bpf_map__set_max_entries(skel->maps.stacks, 1);
 
-	if (target__has_cpu(target))
+	if (target__has_cpu(target)) {
+		skel->rodata->has_cpu = 1;
 		ncpus = perf_cpu_map__nr(evlist->core.user_requested_cpus);
-	if (target__has_task(target))
+	}
+	if (target__has_task(target)) {
+		skel->rodata->has_task = 1;
 		ntasks = perf_thread_map__nr(evlist->core.threads);
-	if (con->filters->nr_types)
+	}
+	if (con->filters->nr_types) {
+		skel->rodata->has_type = 1;
 		ntypes = con->filters->nr_types;
-	if (con->filters->nr_cgrps)
+	}
+	if (con->filters->nr_cgrps) {
+		skel->rodata->has_cgroup = 1;
 		ncgrps = con->filters->nr_cgrps;
+	}
 
 	/* resolve lock name filters to addr */
 	if (con->filters->nr_syms) {
@@ -82,6 +90,7 @@ int lock_contention_prepare(struct lock_contention *con)
 			con->filters->addrs = addrs;
 		}
 		naddrs = con->filters->nr_addrs;
+		skel->rodata->has_addr = 1;
 	}
 
 	bpf_map__set_max_entries(skel->maps.cpu_filter, ncpus);
@@ -90,6 +99,16 @@ int lock_contention_prepare(struct lock_contention *con)
 	bpf_map__set_max_entries(skel->maps.addr_filter, naddrs);
 	bpf_map__set_max_entries(skel->maps.cgroup_filter, ncgrps);
 
+	skel->rodata->stack_skip = con->stack_skip;
+	skel->rodata->aggr_mode = con->aggr_mode;
+	skel->rodata->needs_callstack = con->save_callstack;
+	skel->rodata->lock_owner = con->owner;
+
+	if (con->aggr_mode == LOCK_AGGR_CGROUP || con->filters->nr_cgrps) {
+		if (cgroup_is_v2("perf_event"))
+			skel->rodata->use_cgroup_v2 = 1;
+	}
+
 	if (lock_contention_bpf__load(skel) < 0) {
 		pr_err("Failed to load lock-contention BPF skeleton\n");
 		return -1;
@@ -99,7 +118,6 @@ int lock_contention_prepare(struct lock_contention *con)
 		u32 cpu;
 		u8 val = 1;
 
-		skel->bss->has_cpu = 1;
 		fd = bpf_map__fd(skel->maps.cpu_filter);
 
 		for (i = 0; i < ncpus; i++) {
@@ -112,7 +130,6 @@ int lock_contention_prepare(struct lock_contention *con)
 		u32 pid;
 		u8 val = 1;
 
-		skel->bss->has_task = 1;
 		fd = bpf_map__fd(skel->maps.task_filter);
 
 		for (i = 0; i < ntasks; i++) {
@@ -125,7 +142,6 @@ int lock_contention_prepare(struct lock_contention *con)
 		u32 pid = evlist->workload.pid;
 		u8 val = 1;
 
-		skel->bss->has_task = 1;
 		fd = bpf_map__fd(skel->maps.task_filter);
 		bpf_map_update_elem(fd, &pid, &val, BPF_ANY);
 	}
@@ -133,7 +149,6 @@ int lock_contention_prepare(struct lock_contention *con)
 	if (con->filters->nr_types) {
 		u8 val = 1;
 
-		skel->bss->has_type = 1;
 		fd = bpf_map__fd(skel->maps.type_filter);
 
 		for (i = 0; i < con->filters->nr_types; i++)
@@ -143,7 +158,6 @@ int lock_contention_prepare(struct lock_contention *con)
 	if (con->filters->nr_addrs) {
 		u8 val = 1;
 
-		skel->bss->has_addr = 1;
 		fd = bpf_map__fd(skel->maps.addr_filter);
 
 		for (i = 0; i < con->filters->nr_addrs; i++)
@@ -153,25 +167,14 @@ int lock_contention_prepare(struct lock_contention *con)
 	if (con->filters->nr_cgrps) {
 		u8 val = 1;
 
-		skel->bss->has_cgroup = 1;
 		fd = bpf_map__fd(skel->maps.cgroup_filter);
 
 		for (i = 0; i < con->filters->nr_cgrps; i++)
 			bpf_map_update_elem(fd, &con->filters->cgrps[i], &val, BPF_ANY);
 	}
 
-	/* these don't work well if in the rodata section */
-	skel->bss->stack_skip = con->stack_skip;
-	skel->bss->aggr_mode = con->aggr_mode;
-	skel->bss->needs_callstack = con->save_callstack;
-	skel->bss->lock_owner = con->owner;
-
-	if (con->aggr_mode == LOCK_AGGR_CGROUP) {
-		if (cgroup_is_v2("perf_event"))
-			skel->bss->use_cgroup_v2 = 1;
-
+	if (con->aggr_mode == LOCK_AGGR_CGROUP)
 		read_all_cgroups(&con->cgroups);
-	}
 
 	bpf_program__set_autoload(skel->progs.collect_lock_syms, false);
 
diff --git a/tools/perf/util/bpf_skel/lock_contention.bpf.c b/tools/perf/util/bpf_skel/lock_contention.bpf.c
index 52a876b4269917fe..1069bda5d733887f 100644
--- a/tools/perf/util/bpf_skel/lock_contention.bpf.c
+++ b/tools/perf/util/bpf_skel/lock_contention.bpf.c
@@ -117,21 +117,22 @@ struct mm_struct___new {
 } __attribute__((preserve_access_index));
 
 /* control flags */
-int enabled;
-int has_cpu;
-int has_task;
-int has_type;
-int has_addr;
-int has_cgroup;
-int needs_callstack;
-int stack_skip;
-int lock_owner;
-
-int use_cgroup_v2;
-int perf_subsys_id = -1;
+const volatile int has_cpu;
+const volatile int has_task;
+const volatile int has_type;
+const volatile int has_addr;
+const volatile int has_cgroup;
+const volatile int needs_callstack;
+const volatile int stack_skip;
+const volatile int lock_owner;
+const volatile int use_cgroup_v2;
 
 /* determine the key of lock stat */
-int aggr_mode;
+const volatile int aggr_mode;
+
+int enabled;
+
+int perf_subsys_id = -1;
 
 __u64 end_ts;
 
-- 
2.46.0.469.g59c65b2a67-goog



* [PATCH 5/5] perf record offcpu: Constify control data for BPF
  2024-09-02 20:05 [PATCHSET 0/5] perf tools: Constify BPF control data properly (v1) Namhyung Kim
                   ` (3 preceding siblings ...)
  2024-09-02 20:05 ` [PATCH 4/5] perf lock contention: " Namhyung Kim
@ 2024-09-02 20:05 ` Namhyung Kim
  2024-09-03 15:18 ` [PATCHSET 0/5] perf tools: Constify BPF control data properly (v1) Arnaldo Carvalho de Melo
  5 siblings, 0 replies; 7+ messages in thread
From: Namhyung Kim @ 2024-09-02 20:05 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang
  Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Song Liu, bpf

The control knobs set before loading BPF programs should be declared
as 'const volatile' so that they can be optimized by the BPF core.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
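One detail worth noting (a sketch under the same assumptions as the
patch): libbpf keeps a user-space mapping of .rodata, so the values
stay readable through the skeleton after load even though they can no
longer change.  That is what allows off_cpu_start() to test the knobs
directly:

	skel->rodata->has_task = 1;	/* set before off_cpu_bpf__load() */
	skel->rodata->uses_tgid = 1;
	/* ... load ... */
	if (skel->rodata->has_task && skel->rodata->uses_tgid)
		add_workload_pid_to_task_filter();	/* hypothetical */
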
 tools/perf/util/bpf_off_cpu.c          | 16 ++++++++--------
 tools/perf/util/bpf_skel/off_cpu.bpf.c |  9 +++++----
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/bpf_off_cpu.c b/tools/perf/util/bpf_off_cpu.c
index 6af36142dc5a7fd0..a590a8ac1f9d42f0 100644
--- a/tools/perf/util/bpf_off_cpu.c
+++ b/tools/perf/util/bpf_off_cpu.c
@@ -73,14 +73,12 @@ static void off_cpu_start(void *arg)
 	struct evlist *evlist = arg;
 
 	/* update task filter for the given workload */
-	if (!skel->bss->has_cpu && !skel->bss->has_task &&
+	if (skel->rodata->has_task && skel->rodata->uses_tgid &&
 	    perf_thread_map__pid(evlist->core.threads, 0) != -1) {
 		int fd;
 		u32 pid;
 		u8 val = 1;
 
-		skel->bss->has_task = 1;
-		skel->bss->uses_tgid = 1;
 		fd = bpf_map__fd(skel->maps.task_filter);
 		pid = perf_thread_map__pid(evlist->core.threads, 0);
 		bpf_map_update_elem(fd, &pid, &val, BPF_ANY);
@@ -148,6 +146,7 @@ int off_cpu_prepare(struct evlist *evlist, struct target *target,
 	if (target->cpu_list) {
 		ncpus = perf_cpu_map__nr(evlist->core.user_requested_cpus);
 		bpf_map__set_max_entries(skel->maps.cpu_filter, ncpus);
+		skel->rodata->has_cpu = 1;
 	}
 
 	if (target->pid) {
@@ -173,11 +172,16 @@ int off_cpu_prepare(struct evlist *evlist, struct target *target,
 			ntasks = MAX_PROC;
 
 		bpf_map__set_max_entries(skel->maps.task_filter, ntasks);
+		skel->rodata->has_task = 1;
+		skel->rodata->uses_tgid = 1;
 	} else if (target__has_task(target)) {
 		ntasks = perf_thread_map__nr(evlist->core.threads);
 		bpf_map__set_max_entries(skel->maps.task_filter, ntasks);
+		skel->rodata->has_task = 1;
 	} else if (target__none(target)) {
 		bpf_map__set_max_entries(skel->maps.task_filter, MAX_PROC);
+		skel->rodata->has_task = 1;
+		skel->rodata->uses_tgid = 1;
 	}
 
 	if (evlist__first(evlist)->cgrp) {
@@ -186,6 +190,7 @@ int off_cpu_prepare(struct evlist *evlist, struct target *target,
 
 		if (!cgroup_is_v2("perf_event"))
 			skel->rodata->uses_cgroup_v1 = true;
+		skel->rodata->has_cgroup = 1;
 	}
 
 	if (opts->record_cgroup) {
@@ -208,7 +213,6 @@ int off_cpu_prepare(struct evlist *evlist, struct target *target,
 		u32 cpu;
 		u8 val = 1;
 
-		skel->bss->has_cpu = 1;
 		fd = bpf_map__fd(skel->maps.cpu_filter);
 
 		for (i = 0; i < ncpus; i++) {
@@ -220,8 +224,6 @@ int off_cpu_prepare(struct evlist *evlist, struct target *target,
 	if (target->pid) {
 		u8 val = 1;
 
-		skel->bss->has_task = 1;
-		skel->bss->uses_tgid = 1;
 		fd = bpf_map__fd(skel->maps.task_filter);
 
 		strlist__for_each_entry(pos, pid_slist) {
@@ -240,7 +242,6 @@ int off_cpu_prepare(struct evlist *evlist, struct target *target,
 		u32 pid;
 		u8 val = 1;
 
-		skel->bss->has_task = 1;
 		fd = bpf_map__fd(skel->maps.task_filter);
 
 		for (i = 0; i < ntasks; i++) {
@@ -253,7 +254,6 @@ int off_cpu_prepare(struct evlist *evlist, struct target *target,
 		struct evsel *evsel;
 		u8 val = 1;
 
-		skel->bss->has_cgroup = 1;
 		fd = bpf_map__fd(skel->maps.cgroup_filter);
 
 		evlist__for_each_entry(evlist, evsel) {
diff --git a/tools/perf/util/bpf_skel/off_cpu.bpf.c b/tools/perf/util/bpf_skel/off_cpu.bpf.c
index d877a0a9731f9f2f..c152116df72f9bc1 100644
--- a/tools/perf/util/bpf_skel/off_cpu.bpf.c
+++ b/tools/perf/util/bpf_skel/off_cpu.bpf.c
@@ -85,10 +85,11 @@ struct task_struct___old {
 } __attribute__((preserve_access_index));
 
 int enabled = 0;
-int has_cpu = 0;
-int has_task = 0;
-int has_cgroup = 0;
-int uses_tgid = 0;
+
+const volatile int has_cpu = 0;
+const volatile int has_task = 0;
+const volatile int has_cgroup = 0;
+const volatile int uses_tgid = 0;
 
 const volatile bool has_prev_state = false;
 const volatile bool needs_cgroup = false;
-- 
2.46.0.469.g59c65b2a67-goog



* Re: [PATCHSET 0/5] perf tools: Constify BPF control data properly (v1)
  2024-09-02 20:05 [PATCHSET 0/5] perf tools: Constify BPF control data properly (v1) Namhyung Kim
                   ` (4 preceding siblings ...)
  2024-09-02 20:05 ` [PATCH 5/5] perf record offcpu: " Namhyung Kim
@ 2024-09-03 15:18 ` Arnaldo Carvalho de Melo
  5 siblings, 0 replies; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-09-03 15:18 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, LKML, linux-perf-users, Song Liu, bpf

On Mon, Sep 02, 2024 at 01:05:10PM -0700, Namhyung Kim wrote:
> Hello,
> 
> I've realized that some control data (usually for filter actions)
> should be defined as 'const volatile' so that it can be passed to the
> BPF core and optimized properly (e.g. via dead code elimination).
> 
> Convert the existing code that follows this pattern.

Thanks, tested all the features using BPF, applied to perf-tools-next,

- Arnaldo
 
> Thanks,
> Namhyung
> 
> 
> Namhyung Kim (5):
>   perf stat: Constify control data for BPF
>   perf ftrace latency: Constify control data for BPF
>   perf kwork: Constify control data for BPF
>   perf lock contention: Constify control data for BPF
>   perf record offcpu: Constify control data for BPF
> 
>  tools/perf/util/bpf_counter_cgroup.c          |  6 +--
>  tools/perf/util/bpf_ftrace.c                  |  8 ++--
>  tools/perf/util/bpf_kwork.c                   |  9 ++--
>  tools/perf/util/bpf_kwork_top.c               |  7 +--
>  tools/perf/util/bpf_lock_contention.c         | 45 ++++++++++---------
>  tools/perf/util/bpf_off_cpu.c                 | 16 +++----
>  tools/perf/util/bpf_skel/bperf_cgroup.bpf.c   |  2 +-
>  tools/perf/util/bpf_skel/func_latency.bpf.c   |  7 +--
>  tools/perf/util/bpf_skel/kwork_top.bpf.c      |  2 +-
>  tools/perf/util/bpf_skel/kwork_trace.bpf.c    |  5 ++-
>  .../perf/util/bpf_skel/lock_contention.bpf.c  | 27 +++++------
>  tools/perf/util/bpf_skel/off_cpu.bpf.c        |  9 ++--
>  12 files changed, 76 insertions(+), 67 deletions(-)
> 
> -- 
> 2.46.0.469.g59c65b2a67-goog

