[PATCHES/RFC 1/5] perf bench uprobe + BPF skel

linux-perf-users.vger.kernel.org archive mirror

* [PATCHES/RFC 1/5] perf bench uprobe + BPF skel
@ 2023-07-19 20:49 Arnaldo Carvalho de Melo
  2023-07-19 20:49 ` [PATCH 1/5] perf bench uprobe: Add benchmark to test uprobe overhead Arnaldo Carvalho de Melo
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-07-19 20:49 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Masami Hiramatsu

Hi,

	This adds a 'perf bench' suite to measure the overhead of uprobes +
BPF programs. For now it has just a few simple tests, but I plan to make
it possible to specify the functions to attach the uprobe + BPF program
to, and to add other BPF operations, such as ones dealing with maps.

	This is how it looks now:

  [root@five ~]# perf bench uprobe all
  # Running uprobe/baseline benchmark...
  # Executed 1,000 usleep(1000) calls
       Total time: 1,053,963 usecs
  
   1,053.963 usecs/op
  
  # Running uprobe/empty benchmark...
  # Executed 1,000 usleep(1000) calls
       Total time: 1,056,293 usecs +2,330 to baseline
  
   1,056.293 usecs/op 2.330 usecs/op to baseline
  
  # Running uprobe/trace_printk benchmark...
  # Executed 1,000 usleep(1000) calls
       Total time: 1,056,977 usecs +3,014 to baseline +684 to previous
  
   1,056.977 usecs/op 3.014 usecs/op to baseline 0.684 usecs/op to previous
  
  [root@five ~]#
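
The "to baseline" and "to previous" figures above are plain subtraction of
the total times, and the per-op figures divide that difference by the number
of loops. A minimal sketch of the arithmetic follows; the helper names are
hypothetical and do not appear in the series:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: signed difference between two total times, in usecs. */
static int64_t delta_usecs(uint64_t total, uint64_t reference)
{
	return (int64_t)total - (int64_t)reference;
}

/* Hypothetical helper: cost per call, in usecs, of a delta spread over 'loops' calls. */
static double per_op_usecs(int64_t delta, int loops)
{
	assert(loops > 0);	/* avoid dividing by zero */
	return (double)delta / (double)loops;
}
```

With the totals from the run above, delta_usecs(1056293, 1053963) is 2330,
which over 1,000 loops is 2.330 usecs/op.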

I put it here:

  https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=perf-bench-uprobe

  git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf-bench-uprobe

Further ideas, problems?

- Arnaldo



Arnaldo Carvalho de Melo (5):
  perf bench uprobe: Add benchmark to test uprobe overhead
  perf bench uprobe: Print diff to baseline
  perf bench uprobe: Show diff to previous
  perf bench uprobe empty: Add entry attaching an empty BPF program
  perf bench uprobe trace_printk: Add entry attaching an BPF program that does a trace_printk

 tools/perf/Documentation/perf-bench.txt     |   3 +
 tools/perf/Makefile.perf                    |   1 +
 tools/perf/bench/Build                      |   1 +
 tools/perf/bench/bench.h                    |   3 +
 tools/perf/bench/uprobe.c                   | 198 ++++++++++++++++++++
 tools/perf/builtin-bench.c                  |   8 +
 tools/perf/util/bpf_skel/bench_uprobe.bpf.c |  23 +++
 7 files changed, 237 insertions(+)
 create mode 100644 tools/perf/bench/uprobe.c
 create mode 100644 tools/perf/util/bpf_skel/bench_uprobe.bpf.c

-- 
2.41.0



* [PATCH 1/5] perf bench uprobe: Add benchmark to test uprobe overhead
  2023-07-19 20:49 [PATCHES/RFC 1/5] perf bench uprobe + BPF skel Arnaldo Carvalho de Melo
@ 2023-07-19 20:49 ` Arnaldo Carvalho de Melo
  2023-07-21 14:45   ` Masami Hiramatsu
  2023-07-19 20:49 ` [PATCH 2/5] perf bench uprobe: Print diff to baseline Arnaldo Carvalho de Melo
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-07-19 20:49 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Andre Fredette,
	Dave Tucker, Derek Barbosa, Masami Hiramatsu

From: Arnaldo Carvalho de Melo <acme@redhat.com>

This just adds the initial "workload", a call to libc's usleep(1000us)
function:

  $ perf stat --null perf bench uprobe all
  # Running uprobe/baseline benchmark...
  # Executed 1000 usleep(1000) calls
       Total time: 1053533 usecs

   1053.533 usecs/op

   Performance counter stats for 'perf bench uprobe all':

         1.061042896 seconds time elapsed

         0.001079000 seconds user
         0.006499000 seconds sys

  $

More entries will be added using a BPF skel to add various uprobes to
the usleep() function.
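
For readers who want to try the measurement outside of perf, here is a
minimal standalone sketch of the same idea. It is not the code in this
patch: the helper names are hypothetical, and CLOCK_MONOTONIC is swapped in
for the patch's CLOCK_REALTIME only because it cannot jump if the wall
clock is adjusted mid-run.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1	/* expose usleep() and clock_gettime() on glibc */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: elapsed time between two timespecs, in microseconds. */
static uint64_t timespec_diff_usecs(const struct timespec *start,
				    const struct timespec *end)
{
	int64_t nsecs = (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL +
			(end->tv_nsec - start->tv_nsec);

	return (uint64_t)(nsecs / 1000);
}

/* Time 'loops' calls to usleep(1000) and return the total in microseconds. */
static uint64_t time_usleep_loop(int loops)
{
	struct timespec start, end;
	int i;

	assert(loops > 0);

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < loops; i++)
		usleep(1000);
	clock_gettime(CLOCK_MONOTONIC, &end);

	return timespec_diff_usecs(&start, &end);
}
```

Dividing the value returned by time_usleep_loop(1000) by 1000 gives a figure
comparable to the usecs/op line above.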

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andre Fredette <anfredet@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Dave Tucker <datucker@redhat.com>
Cc: Derek Barbosa <debarbos@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-bench.txt |  3 +
 tools/perf/bench/Build                  |  1 +
 tools/perf/bench/bench.h                |  1 +
 tools/perf/bench/uprobe.c               | 80 +++++++++++++++++++++++++
 tools/perf/builtin-bench.c              |  6 ++
 5 files changed, 91 insertions(+)
 create mode 100644 tools/perf/bench/uprobe.c

diff --git a/tools/perf/Documentation/perf-bench.txt b/tools/perf/Documentation/perf-bench.txt
index f04f0eaded985fc8..ca5789625cd2b8e5 100644
--- a/tools/perf/Documentation/perf-bench.txt
+++ b/tools/perf/Documentation/perf-bench.txt
@@ -67,6 +67,9 @@ SUBSYSTEM
 'internals'::
 	Benchmark internal perf functionality.
 
+'uprobe'::
+	Benchmark overhead of uprobe + BPF.
+
 'all'::
 	All benchmark subsystems.
 
diff --git a/tools/perf/bench/Build b/tools/perf/bench/Build
index 0f158dc8139bbd0d..47412d47dccfeff2 100644
--- a/tools/perf/bench/Build
+++ b/tools/perf/bench/Build
@@ -16,6 +16,7 @@ perf-y += inject-buildid.o
 perf-y += evlist-open-close.o
 perf-y += breakpoint.o
 perf-y += pmu-scan.o
+perf-y += uprobe.o
 
 perf-$(CONFIG_X86_64) += mem-memcpy-x86-64-asm.o
 perf-$(CONFIG_X86_64) += mem-memset-x86-64-asm.o
diff --git a/tools/perf/bench/bench.h b/tools/perf/bench/bench.h
index 0d2b65976212333a..201311f75c964df2 100644
--- a/tools/perf/bench/bench.h
+++ b/tools/perf/bench/bench.h
@@ -42,6 +42,7 @@ int bench_inject_build_id(int argc, const char **argv);
 int bench_evlist_open_close(int argc, const char **argv);
 int bench_breakpoint_thread(int argc, const char **argv);
 int bench_breakpoint_enable(int argc, const char **argv);
+int bench_uprobe_baseline(int argc, const char **argv);
 int bench_pmu_scan(int argc, const char **argv);
 
 #define BENCH_FORMAT_DEFAULT_STR	"default"
diff --git a/tools/perf/bench/uprobe.c b/tools/perf/bench/uprobe.c
new file mode 100644
index 0000000000000000..707174220a76701f
--- /dev/null
+++ b/tools/perf/bench/uprobe.c
@@ -0,0 +1,80 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+/*
+ * uprobe.c
+ *
+ * uprobe benchmarks
+ *
+ *  Copyright (C) 2023, Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com>
+ */
+#include "../perf.h"
+#include "../util/util.h"
+#include <subcmd/parse-options.h>
+#include "../builtin.h"
+#include "bench.h"
+#include <linux/time64.h>
+
+#include <inttypes.h>
+#include <stdio.h>
+#include <sys/time.h>
+#include <sys/types.h>
+#include <time.h>
+#include <unistd.h>
+#include <stdlib.h>
+
+#define LOOPS_DEFAULT 1000
+static int loops = LOOPS_DEFAULT;
+
+static const struct option options[] = {
+	OPT_INTEGER('l', "loop",	&loops,		"Specify number of loops"),
+	OPT_END()
+};
+
+static const char * const bench_uprobe_usage[] = {
+	"perf bench uprobe <options>",
+	NULL
+};
+
+static int bench_uprobe(int argc, const char **argv)
+{
+	const char *name = "usleep(1000)", *unit = "usec";
+	struct timespec start, end;
+	u64 diff;
+	int i;
+
+	argc = parse_options(argc, argv, options, bench_uprobe_usage, 0);
+
+	clock_gettime(CLOCK_REALTIME, &start);
+
+	for (i = 0; i < loops; i++) {
+		usleep(USEC_PER_MSEC);
+	}
+
+	clock_gettime(CLOCK_REALTIME, &end);
+
+	diff = end.tv_sec * NSEC_PER_SEC + end.tv_nsec - (start.tv_sec * NSEC_PER_SEC + start.tv_nsec);
+	diff /= NSEC_PER_USEC;
+
+	switch (bench_format) {
+	case BENCH_FORMAT_DEFAULT:
+		printf("# Executed %'d %s calls\n", loops, name);
+		printf(" %14s: %'" PRIu64 " %ss\n\n", "Total time", diff, unit);
+		printf(" %'.3f %ss/op\n", (double)diff / (double)loops, unit);
+		break;
+
+	case BENCH_FORMAT_SIMPLE:
+		printf("%" PRIu64 "\n", diff);
+		break;
+
+	default:
+		/* reaching here is something of a disaster */
+		fprintf(stderr, "Unknown format:%d\n", bench_format);
+		exit(1);
+	}
+
+	return 0;
+}
+
+int bench_uprobe_baseline(int argc, const char **argv)
+{
+	return bench_uprobe(argc, argv);
+}
diff --git a/tools/perf/builtin-bench.c b/tools/perf/builtin-bench.c
index db435b791a09b69b..09637aee83413e63 100644
--- a/tools/perf/builtin-bench.c
+++ b/tools/perf/builtin-bench.c
@@ -104,6 +104,11 @@ static struct bench breakpoint_benchmarks[] = {
 	{ NULL,	NULL, NULL },
 };
 
+static struct bench uprobe_benchmarks[] = {
+	{ "baseline",	"Baseline libc usleep(1000) call",	bench_uprobe_baseline,	},
+	{ NULL,	NULL, NULL },
+};
+
 struct collection {
 	const char	*name;
 	const char	*summary;
@@ -123,6 +128,7 @@ static struct collection collections[] = {
 #endif
 	{ "internals",	"Perf-internals benchmarks",			internals_benchmarks	},
 	{ "breakpoint",	"Breakpoint benchmarks",			breakpoint_benchmarks	},
+	{ "uprobe",	"uprobe benchmarks",				uprobe_benchmarks	},
 	{ "all",	"All benchmarks",				NULL			},
 	{ NULL,		NULL,						NULL			}
 };
-- 
2.41.0



* [PATCH 2/5] perf bench uprobe: Print diff to baseline
  2023-07-19 20:49 [PATCHES/RFC 1/5] perf bench uprobe + BPF skel Arnaldo Carvalho de Melo
  2023-07-19 20:49 ` [PATCH 1/5] perf bench uprobe: Add benchmark to test uprobe overhead Arnaldo Carvalho de Melo
@ 2023-07-19 20:49 ` Arnaldo Carvalho de Melo
  2023-07-21 14:43   ` Masami Hiramatsu
  2023-07-19 20:49 ` [PATCH 3/5] perf bench uprobe: Show diff to previous Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-07-19 20:49 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Andre Fredette,
	Dave Tucker, Derek Barbosa, Masami Hiramatsu

From: Arnaldo Carvalho de Melo <acme@redhat.com>

This is just prep work to show the diff to the unmodified workload.
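
The approach is to remember the first total seen and report every later
total relative to it. A minimal sketch of that pattern, outside of perf,
follows; the struct and function names are hypothetical, and zero is
assumed to mean "no baseline recorded yet", as in the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct baseline_delta {
	bool	has_baseline;	/* false only for the first (baseline) run */
	int64_t	to_baseline;	/* signed difference to the baseline, in usecs */
};

/* Record one run's total; the first call sets the baseline. */
static struct baseline_delta record_run(uint64_t total_usecs)
{
	static uint64_t baseline;	/* zero until the first run is recorded */
	struct baseline_delta d = { 0 };

	assert(total_usecs != 0);	/* zero is reserved for "no baseline yet" */

	if (baseline) {
		d.has_baseline = true;
		d.to_baseline = (int64_t)total_usecs - (int64_t)baseline;
	} else {
		baseline = total_usecs;
	}

	return d;
}
```

Because the baseline lives in a function-local static, this only works when
all benchmarks run in the same process, as 'perf bench uprobe all' does.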

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andre Fredette <anfredet@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Dave Tucker <datucker@redhat.com>
Cc: Derek Barbosa <debarbos@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/bench/uprobe.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/tools/perf/bench/uprobe.c b/tools/perf/bench/uprobe.c
index 707174220a76701f..60e7c43298d8cf56 100644
--- a/tools/perf/bench/uprobe.c
+++ b/tools/perf/bench/uprobe.c
@@ -34,6 +34,29 @@ static const char * const bench_uprobe_usage[] = {
 	NULL
 };
 
+static int bench_uprobe_format__default_fprintf(const char *name, const char *unit, u64 diff, FILE *fp)
+{
+	static u64 baseline;
+	s64 diff_to_baseline = diff - baseline;
+	int printed = fprintf(fp, "# Executed %'d %s calls\n", loops, name);
+
+	printed += fprintf(fp, " %14s: %'" PRIu64 " %ss", "Total time", diff, unit);
+
+	if (baseline)
+		printed += fprintf(fp, " %s%'" PRId64 " to baseline", diff_to_baseline > 0 ? "+" : "", diff_to_baseline);
+
+	printed += fprintf(fp, "\n\n %'.3f %ss/op", (double)diff / (double)loops, unit);
+
+	if (baseline)
+		printed += fprintf(fp, " %'.3f %ss/op to baseline", (double)diff_to_baseline / (double)loops, unit);
+	else
+		baseline = diff;
+
+	fputc('\n', fp);
+
+	return printed + 1;
+}
+
 static int bench_uprobe(int argc, const char **argv)
 {
 	const char *name = "usleep(1000)", *unit = "usec";
@@ -56,9 +79,7 @@ static int bench_uprobe(int argc, const char **argv)
 
 	switch (bench_format) {
 	case BENCH_FORMAT_DEFAULT:
-		printf("# Executed %'d %s calls\n", loops, name);
-		printf(" %14s: %'" PRIu64 " %ss\n\n", "Total time", diff, unit);
-		printf(" %'.3f %ss/op\n", (double)diff / (double)loops, unit);
+		bench_uprobe_format__default_fprintf(name, unit, diff, stdout);
 		break;
 
 	case BENCH_FORMAT_SIMPLE:
-- 
2.41.0



* [PATCH 3/5] perf bench uprobe: Show diff to previous
  2023-07-19 20:49 [PATCHES/RFC 1/5] perf bench uprobe + BPF skel Arnaldo Carvalho de Melo
  2023-07-19 20:49 ` [PATCH 1/5] perf bench uprobe: Add benchmark to test uprobe overhead Arnaldo Carvalho de Melo
  2023-07-19 20:49 ` [PATCH 2/5] perf bench uprobe: Print diff to baseline Arnaldo Carvalho de Melo
@ 2023-07-19 20:49 ` Arnaldo Carvalho de Melo
  2023-07-21 14:48   ` Masami Hiramatsu
  2023-07-19 20:49 ` [PATCH 4/5] perf bench uprobe empty: Add entry attaching an empty BPF program Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-07-19 20:49 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Andre Fredette,
	Dave Tucker, Derek Barbosa, Masami Hiramatsu

From: Arnaldo Carvalho de Melo <acme@redhat.com>

This will be useful to show the incremental overhead as we do more work in
the BPF programs attached to the uprobes.
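
The bookkeeping can be sketched on its own as below. This is an
illustration with hypothetical names, not the code in the diff: it keeps
the first total as the baseline, the last total as "previous", and only
reports "to previous" once the previous run is no longer the baseline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct run_deltas {
	bool	has_baseline;	/* false only for the first (baseline) run */
	bool	has_previous;	/* true once the previous run differs from the baseline */
	int64_t	to_baseline;	/* usecs */
	int64_t	to_previous;	/* usecs */
};

/* Record one run's total and return its deltas to earlier runs. */
static struct run_deltas track_run(uint64_t total_usecs)
{
	static uint64_t baseline, previous;
	struct run_deltas d = { 0 };

	assert(total_usecs != 0);	/* zero is reserved for "no baseline yet" */

	if (baseline) {
		d.has_baseline = true;
		d.to_baseline = (int64_t)total_usecs - (int64_t)baseline;

		if (previous != baseline) {
			d.has_previous = true;
			d.to_previous = (int64_t)total_usecs - (int64_t)previous;
		}
	} else {
		baseline = total_usecs;
	}

	previous = total_usecs;
	return d;
}
```

Note that "previous != baseline" compares values, so a later run whose total
happened to equal the baseline exactly would suppress the "to previous"
figure for the run after it.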

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andre Fredette <anfredet@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Dave Tucker <datucker@redhat.com>
Cc: Derek Barbosa <debarbos@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/bench/uprobe.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/tools/perf/bench/uprobe.c b/tools/perf/bench/uprobe.c
index 60e7c43298d8cf56..a90e09f791c540a9 100644
--- a/tools/perf/bench/uprobe.c
+++ b/tools/perf/bench/uprobe.c
@@ -36,24 +36,35 @@ static const char * const bench_uprobe_usage[] = {
 
 static int bench_uprobe_format__default_fprintf(const char *name, const char *unit, u64 diff, FILE *fp)
 {
-	static u64 baseline;
-	s64 diff_to_baseline = diff - baseline;
+	static u64 baseline, previous;
+	s64 diff_to_baseline = diff - baseline,
+	    diff_to_previous = diff - previous;
 	int printed = fprintf(fp, "# Executed %'d %s calls\n", loops, name);
 
 	printed += fprintf(fp, " %14s: %'" PRIu64 " %ss", "Total time", diff, unit);
 
-	if (baseline)
+	if (baseline) {
 		printed += fprintf(fp, " %s%'" PRId64 " to baseline", diff_to_baseline > 0 ? "+" : "", diff_to_baseline);
 
+		if (previous != baseline)
+			printed += fprintf(fp, " %s%'" PRId64 " to previous", diff_to_previous > 0 ? "+" : "", diff_to_previous);
+	}
+
 	printed += fprintf(fp, "\n\n %'.3f %ss/op", (double)diff / (double)loops, unit);
 
-	if (baseline)
+	if (baseline) {
 		printed += fprintf(fp, " %'.3f %ss/op to baseline", (double)diff_to_baseline / (double)loops, unit);
-	else
+
+		if (previous != baseline)
+			printed += fprintf(fp, " %'.3f %ss/op to previous", (double)diff_to_previous / (double)loops, unit);
+	} else {
 		baseline = diff;
+	}
 
 	fputc('\n', fp);
 
+	previous = diff;
+
 	return printed + 1;
 }
 
-- 
2.41.0



* [PATCH 4/5] perf bench uprobe empty: Add entry attaching an empty BPF program
  2023-07-19 20:49 [PATCHES/RFC 1/5] perf bench uprobe + BPF skel Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2023-07-19 20:49 ` [PATCH 3/5] perf bench uprobe: Show diff to previous Arnaldo Carvalho de Melo
@ 2023-07-19 20:49 ` Arnaldo Carvalho de Melo
  2023-07-19 20:49 ` [PATCH 5/5] perf bench uprobe trace_printk: Add entry attaching an BPF program that does a trace_printk Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-07-19 20:49 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Andre Fredette,
	Dave Tucker, Derek Barbosa, Masami Hiramatsu

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Using libbpf and a BPF skel:

  # perf bench uprobe all
  # Running uprobe/baseline benchmark...
  # Executed 1,000 usleep(1000) calls
       Total time: 1,055,618 usecs

   1,055.618 usecs/op
  # Running uprobe/empty benchmark...
  # Executed 1,000 usleep(1000) calls
       Total time: 1,057,146 usecs +1,528 to baseline

   1,057.146 usecs/op
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andre Fredette <anfredet@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Dave Tucker <datucker@redhat.com>
Cc: Derek Barbosa <debarbos@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Makefile.perf                    |  1 +
 tools/perf/bench/bench.h                    |  1 +
 tools/perf/bench/uprobe.c                   | 75 ++++++++++++++++++++-
 tools/perf/builtin-bench.c                  |  3 +-
 tools/perf/util/bpf_skel/bench_uprobe.bpf.c | 12 ++++
 5 files changed, 88 insertions(+), 4 deletions(-)
 create mode 100644 tools/perf/util/bpf_skel/bench_uprobe.bpf.c

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 097316ef38e6a80f..a44d16ec11ee5490 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -1057,6 +1057,7 @@ SKELETONS += $(SKEL_OUT)/bperf_leader.skel.h $(SKEL_OUT)/bperf_follower.skel.h
 SKELETONS += $(SKEL_OUT)/bperf_cgroup.skel.h $(SKEL_OUT)/func_latency.skel.h
 SKELETONS += $(SKEL_OUT)/off_cpu.skel.h $(SKEL_OUT)/lock_contention.skel.h
 SKELETONS += $(SKEL_OUT)/kwork_trace.skel.h $(SKEL_OUT)/sample_filter.skel.h
+SKELETONS += $(SKEL_OUT)/bench_uprobe.skel.h
 
 $(SKEL_TMP_OUT) $(LIBAPI_OUTPUT) $(LIBBPF_OUTPUT) $(LIBPERF_OUTPUT) $(LIBSUBCMD_OUTPUT) $(LIBSYMBOL_OUTPUT):
 	$(Q)$(MKDIR) -p $@
diff --git a/tools/perf/bench/bench.h b/tools/perf/bench/bench.h
index 201311f75c964df2..daf4850b441cf91c 100644
--- a/tools/perf/bench/bench.h
+++ b/tools/perf/bench/bench.h
@@ -43,6 +43,7 @@ int bench_evlist_open_close(int argc, const char **argv);
 int bench_breakpoint_thread(int argc, const char **argv);
 int bench_breakpoint_enable(int argc, const char **argv);
 int bench_uprobe_baseline(int argc, const char **argv);
+int bench_uprobe_empty(int argc, const char **argv);
 int bench_pmu_scan(int argc, const char **argv);
 
 #define BENCH_FORMAT_DEFAULT_STR	"default"
diff --git a/tools/perf/bench/uprobe.c b/tools/perf/bench/uprobe.c
index a90e09f791c540a9..dfb90038a4f7a06a 100644
--- a/tools/perf/bench/uprobe.c
+++ b/tools/perf/bench/uprobe.c
@@ -24,6 +24,11 @@
 #define LOOPS_DEFAULT 1000
 static int loops = LOOPS_DEFAULT;
 
+enum bench_uprobe {
+        BENCH_UPROBE__BASELINE,
+        BENCH_UPROBE__EMPTY,
+};
+
 static const struct option options[] = {
 	OPT_INTEGER('l', "loop",	&loops,		"Specify number of loops"),
 	OPT_END()
@@ -34,6 +39,59 @@ static const char * const bench_uprobe_usage[] = {
 	NULL
 };
 
+#ifdef HAVE_BPF_SKEL
+#include "bpf_skel/bench_uprobe.skel.h"
+
+struct bench_uprobe_bpf *skel;
+
+static int bench_uprobe__setup_bpf_skel(void)
+{
+	DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, uprobe_opts);
+	int err;
+
+	/* Open the BPF application */
+	skel = bench_uprobe_bpf__open();
+	if (!skel) {
+		fprintf(stderr, "Failed to open uprobes bench BPF skeleton\n");
+		return -1;
+	}
+
+	err = bench_uprobe_bpf__load(skel);
+	if (err) {
+		fprintf(stderr, "Failed to load and verify BPF skeleton\n");
+		goto cleanup;
+	}
+
+	uprobe_opts.func_name = "usleep";
+	skel->links.empty = bpf_program__attach_uprobe_opts(/*prog=*/skel->progs.empty,
+							    /*pid=*/-1,
+							    /*binary_path=*/"/lib64/libc.so.6",
+							    /*func_offset=*/0,
+							    /*opts=*/&uprobe_opts);
+	if (!skel->links.empty) {
+		err = -errno;
+		fprintf(stderr, "Failed to attach bench uprobe: %s\n", strerror(errno));
+		goto cleanup;
+	}
+
+	return err;
+cleanup:
+	bench_uprobe_bpf__destroy(skel);
+	return err;
+}
+
+static void bench_uprobe__teardown_bpf_skel(void)
+{
+	if (skel) {
+		bench_uprobe_bpf__destroy(skel);
+		skel = NULL;
+	}
+}
+#else
+static int bench_uprobe__setup_bpf_skel(void) { return 0; }
+static void bench_uprobe__teardown_bpf_skel(void) {};
+#endif
+
 static int bench_uprobe_format__default_fprintf(const char *name, const char *unit, u64 diff, FILE *fp)
 {
 	static u64 baseline, previous;
@@ -68,7 +126,7 @@ static int bench_uprobe_format__default_fprintf(const char *name, const char *un
 	return printed + 1;
 }
 
-static int bench_uprobe(int argc, const char **argv)
+static int bench_uprobe(int argc, const char **argv, enum bench_uprobe bench)
 {
 	const char *name = "usleep(1000)", *unit = "usec";
 	struct timespec start, end;
@@ -77,7 +135,10 @@ static int bench_uprobe(int argc, const char **argv)
 
 	argc = parse_options(argc, argv, options, bench_uprobe_usage, 0);
 
-	clock_gettime(CLOCK_REALTIME, &start);
+	if (bench != BENCH_UPROBE__BASELINE && bench_uprobe__setup_bpf_skel() < 0)
+		return 0;
+
+        clock_gettime(CLOCK_REALTIME, &start);
 
 	for (i = 0; i < loops; i++) {
 		usleep(USEC_PER_MSEC);
@@ -103,10 +164,18 @@ static int bench_uprobe(int argc, const char **argv)
 		exit(1);
 	}
 
+	if (bench != BENCH_UPROBE__BASELINE)
+		bench_uprobe__teardown_bpf_skel();
+
 	return 0;
 }
 
 int bench_uprobe_baseline(int argc, const char **argv)
 {
-	return bench_uprobe(argc, argv);
+	return bench_uprobe(argc, argv, BENCH_UPROBE__BASELINE);
+}
+
+int bench_uprobe_empty(int argc, const char **argv)
+{
+	return bench_uprobe(argc, argv, BENCH_UPROBE__EMPTY);
 }
diff --git a/tools/perf/builtin-bench.c b/tools/perf/builtin-bench.c
index 09637aee83413e63..1021680bbc6d4298 100644
--- a/tools/perf/builtin-bench.c
+++ b/tools/perf/builtin-bench.c
@@ -105,7 +105,8 @@ static struct bench breakpoint_benchmarks[] = {
 };
 
 static struct bench uprobe_benchmarks[] = {
-	{ "baseline",	"Baseline libc usleep(1000) call",	bench_uprobe_baseline,	},
+	{ "baseline",	"Baseline libc usleep(1000) call",				bench_uprobe_baseline,	},
+	{ "empty",	"Attach empty BPF prog to uprobe on usleep, system wide",	bench_uprobe_empty,	},
 	{ NULL,	NULL, NULL },
 };
 
diff --git a/tools/perf/util/bpf_skel/bench_uprobe.bpf.c b/tools/perf/util/bpf_skel/bench_uprobe.bpf.c
new file mode 100644
index 0000000000000000..1365dcc5dddff546
--- /dev/null
+++ b/tools/perf/util/bpf_skel/bench_uprobe.bpf.c
@@ -0,0 +1,12 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+// Copyright (c) 2023 Red Hat
+#include "vmlinux.h"
+#include <bpf/bpf_tracing.h>
+
+SEC("uprobe")
+int BPF_UPROBE(empty)
+{
+       return 0;
+}
+
+char LICENSE[] SEC("license") = "Dual BSD/GPL";
-- 
2.41.0



* [PATCH 5/5] perf bench uprobe trace_printk: Add entry attaching an BPF program that does a trace_printk
  2023-07-19 20:49 [PATCHES/RFC 1/5] perf bench uprobe + BPF skel Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2023-07-19 20:49 ` [PATCH 4/5] perf bench uprobe empty: Add entry attaching an empty BPF program Arnaldo Carvalho de Melo
@ 2023-07-19 20:49 ` Arnaldo Carvalho de Melo
  2023-07-19 22:41 ` [PATCHES/RFC 1/5] perf bench uprobe + BPF skel Ian Rogers
  2023-07-21 14:32 ` Masami Hiramatsu
  6 siblings, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-07-19 20:49 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Andre Fredette,
	Dave Tucker, Derek Barbosa, Masami Hiramatsu

From: Arnaldo Carvalho de Melo <acme@redhat.com>

  [root@five ~]# perf bench uprobe all
  # Running uprobe/baseline benchmark...
  # Executed 1,000 usleep(1000) calls
       Total time: 1,053,963 usecs

   1,053.963 usecs/op

  # Running uprobe/empty benchmark...
  # Executed 1,000 usleep(1000) calls
       Total time: 1,056,293 usecs +2,330 to baseline

   1,056.293 usecs/op 2.330 usecs/op to baseline

  # Running uprobe/trace_printk benchmark...
  # Executed 1,000 usleep(1000) calls
       Total time: 1,056,977 usecs +3,014 to baseline +684 to previous

   1,056.977 usecs/op 3.014 usecs/op to baseline 0.684 usecs/op to previous

  [root@five ~]#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andre Fredette <anfredet@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Dave Tucker <datucker@redhat.com>
Cc: Derek Barbosa <debarbos@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/bench/bench.h                    |  1 +
 tools/perf/bench/uprobe.c                   | 39 +++++++++++++++------
 tools/perf/builtin-bench.c                  |  1 +
 tools/perf/util/bpf_skel/bench_uprobe.bpf.c | 11 ++++++
 4 files changed, 41 insertions(+), 11 deletions(-)

diff --git a/tools/perf/bench/bench.h b/tools/perf/bench/bench.h
index daf4850b441cf91c..50de4773651f9914 100644
--- a/tools/perf/bench/bench.h
+++ b/tools/perf/bench/bench.h
@@ -44,6 +44,7 @@ int bench_breakpoint_thread(int argc, const char **argv);
 int bench_breakpoint_enable(int argc, const char **argv);
 int bench_uprobe_baseline(int argc, const char **argv);
 int bench_uprobe_empty(int argc, const char **argv);
+int bench_uprobe_trace_printk(int argc, const char **argv);
 int bench_pmu_scan(int argc, const char **argv);
 
 #define BENCH_FORMAT_DEFAULT_STR	"default"
diff --git a/tools/perf/bench/uprobe.c b/tools/perf/bench/uprobe.c
index dfb90038a4f7a06a..914c0817fe8ad31b 100644
--- a/tools/perf/bench/uprobe.c
+++ b/tools/perf/bench/uprobe.c
@@ -11,6 +11,7 @@
 #include <subcmd/parse-options.h>
 #include "../builtin.h"
 #include "bench.h"
+#include <linux/compiler.h>
 #include <linux/time64.h>
 
 #include <inttypes.h>
@@ -27,6 +28,7 @@ static int loops = LOOPS_DEFAULT;
 enum bench_uprobe {
         BENCH_UPROBE__BASELINE,
         BENCH_UPROBE__EMPTY,
+        BENCH_UPROBE__TRACE_PRINTK,
 };
 
 static const struct option options[] = {
@@ -42,9 +44,21 @@ static const char * const bench_uprobe_usage[] = {
 #ifdef HAVE_BPF_SKEL
 #include "bpf_skel/bench_uprobe.skel.h"
 
+#define bench_uprobe__attach_uprobe(prog) \
+	skel->links.prog = bpf_program__attach_uprobe_opts(/*prog=*/skel->progs.prog, \
+							   /*pid=*/-1, \
+							   /*binary_path=*/"/lib64/libc.so.6", \
+							   /*func_offset=*/0, \
+							   /*opts=*/&uprobe_opts); \
+	if (!skel->links.prog) { \
+		err = -errno; \
+		fprintf(stderr, "Failed to attach bench uprobe \"%s\": %s\n", #prog, strerror(errno)); \
+		goto cleanup; \
+	}
+
 struct bench_uprobe_bpf *skel;
 
-static int bench_uprobe__setup_bpf_skel(void)
+static int bench_uprobe__setup_bpf_skel(enum bench_uprobe bench)
 {
 	DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, uprobe_opts);
 	int err;
@@ -63,14 +77,12 @@ static int bench_uprobe__setup_bpf_skel(void)
 	}
 
 	uprobe_opts.func_name = "usleep";
-	skel->links.empty = bpf_program__attach_uprobe_opts(/*prog=*/skel->progs.empty,
-							    /*pid=*/-1,
-							    /*binary_path=*/"/lib64/libc.so.6",
-							    /*func_offset=*/0,
-							    /*opts=*/&uprobe_opts);
-	if (!skel->links.empty) {
-		err = -errno;
-		fprintf(stderr, "Failed to attach bench uprobe: %s\n", strerror(errno));
+	switch (bench) {
+	case BENCH_UPROBE__BASELINE:							break;
+	case BENCH_UPROBE__EMPTY:	 bench_uprobe__attach_uprobe(empty);		break;
+	case BENCH_UPROBE__TRACE_PRINTK: bench_uprobe__attach_uprobe(trace_printk);	break;
+	default:
+		fprintf(stderr, "Invalid bench: %d\n", bench);
 		goto cleanup;
 	}
 
@@ -88,7 +100,7 @@ static void bench_uprobe__teardown_bpf_skel(void)
 	}
 }
 #else
-static int bench_uprobe__setup_bpf_skel(void) { return 0; }
+static int bench_uprobe__setup_bpf_skel(enum bench_uprobe bench __maybe_unused) { return 0; }
 static void bench_uprobe__teardown_bpf_skel(void) {};
 #endif
 
@@ -135,7 +147,7 @@ static int bench_uprobe(int argc, const char **argv, enum bench_uprobe bench)
 
 	argc = parse_options(argc, argv, options, bench_uprobe_usage, 0);
 
-	if (bench != BENCH_UPROBE__BASELINE && bench_uprobe__setup_bpf_skel() < 0)
+	if (bench != BENCH_UPROBE__BASELINE && bench_uprobe__setup_bpf_skel(bench) < 0)
 		return 0;
 
         clock_gettime(CLOCK_REALTIME, &start);
@@ -179,3 +191,8 @@ int bench_uprobe_empty(int argc, const char **argv)
 {
 	return bench_uprobe(argc, argv, BENCH_UPROBE__EMPTY);
 }
+
+int bench_uprobe_trace_printk(int argc, const char **argv)
+{
+	return bench_uprobe(argc, argv, BENCH_UPROBE__TRACE_PRINTK);
+}
diff --git a/tools/perf/builtin-bench.c b/tools/perf/builtin-bench.c
index 1021680bbc6d4298..f60ccafccac25602 100644
--- a/tools/perf/builtin-bench.c
+++ b/tools/perf/builtin-bench.c
@@ -107,6 +107,7 @@ static struct bench breakpoint_benchmarks[] = {
 static struct bench uprobe_benchmarks[] = {
 	{ "baseline",	"Baseline libc usleep(1000) call",				bench_uprobe_baseline,	},
 	{ "empty",	"Attach empty BPF prog to uprobe on usleep, system wide",	bench_uprobe_empty,	},
+	{ "trace_printk", "Attach trace_printk BPF prog to uprobe on usleep syswide",	bench_uprobe_trace_printk,	},
 	{ NULL,	NULL, NULL },
 };
 
diff --git a/tools/perf/util/bpf_skel/bench_uprobe.bpf.c b/tools/perf/util/bpf_skel/bench_uprobe.bpf.c
index 1365dcc5dddff546..7046bea5da871627 100644
--- a/tools/perf/util/bpf_skel/bench_uprobe.bpf.c
+++ b/tools/perf/util/bpf_skel/bench_uprobe.bpf.c
@@ -3,10 +3,21 @@
 #include "vmlinux.h"
 #include <bpf/bpf_tracing.h>
 
+unsigned int nr_uprobes;
+
 SEC("uprobe")
 int BPF_UPROBE(empty)
 {
        return 0;
 }
 
+SEC("uprobe")
+int BPF_UPROBE(trace_printk)
+{
+	char fmt[] = "perf bench uprobe %u";
+
+        bpf_trace_printk(fmt, sizeof(fmt), ++nr_uprobes);
+	return 0;
+}
+
 char LICENSE[] SEC("license") = "Dual BSD/GPL";
-- 
2.41.0



* Re: [PATCHES/RFC 1/5] perf bench uprobe + BPF skel
  2023-07-19 20:49 [PATCHES/RFC 1/5] perf bench uprobe + BPF skel Arnaldo Carvalho de Melo
                   ` (4 preceding siblings ...)
  2023-07-19 20:49 ` [PATCH 5/5] perf bench uprobe trace_printk: Add entry attaching an BPF program that does a trace_printk Arnaldo Carvalho de Melo
@ 2023-07-19 22:41 ` Ian Rogers
  2023-07-20 13:56   ` Arnaldo Carvalho de Melo
  2023-07-21 14:32 ` Masami Hiramatsu
  6 siblings, 1 reply; 12+ messages in thread
From: Ian Rogers @ 2023-07-19 22:41 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Namhyung Kim, Ingo Molnar, Thomas Gleixner, Jiri Olsa,
	Adrian Hunter, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Masami Hiramatsu

On Wed, Jul 19, 2023 at 1:49 PM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Hi,
>
>         This adds a 'perf bench' suite to measure the overhead of uprobes +
> BPF programs. For now it has just a few simple tests, but I plan to make
> it possible to specify the functions to attach the uprobe + BPF program
> to, and to add other BPF operations, such as ones dealing with maps.
>
>         This is how it looks now:
>
>   [root@five ~]# perf bench uprobe all
>   # Running uprobe/baseline benchmark...
>   # Executed 1,000 usleep(1000) calls
>        Total time: 1,053,963 usecs
>
>    1,053.963 usecs/op
>
>   # Running uprobe/empty benchmark...
>   # Executed 1,000 usleep(1000) calls
>        Total time: 1,056,293 usecs +2,330 to baseline
>
>    1,056.293 usecs/op 2.330 usecs/op to baseline
>
>   # Running uprobe/trace_printk benchmark...
>   # Executed 1,000 usleep(1000) calls
>        Total time: 1,056,977 usecs +3,014 to baseline +684 to previous
>
>    1,056.977 usecs/op 3.014 usecs/op to baseline 0.684 usecs/op to previous
>
>   [root@five ~]
>
> I put it here:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=perf-bench-uprobe
>
>   git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf-bench-uprobe
>
> Further ideas, problems?

No problems. Perhaps it would be interesting to compare the uprobe
overhead with, say, the overhead of attaching to the nanosleep
syscall?

Thanks,
Ian

> - Arnaldo
>
>
>
> Arnaldo Carvalho de Melo (5):
>   perf bench uprobe: Add benchmark to test uprobe overhead
>   perf bench uprobe: Print diff to baseline
>   perf bench uprobe: Show diff to previous
>   perf bench uprobe empty: Add entry attaching an empty BPF program
>   perf bench uprobe trace_printk: Add entry attaching an BPF program that does a trace_printk
>
>  tools/perf/Documentation/perf-bench.txt     |   3 +
>  tools/perf/Makefile.perf                    |   1 +
>  tools/perf/bench/Build                      |   1 +
>  tools/perf/bench/bench.h                    |   3 +
>  tools/perf/bench/uprobe.c                   | 198 ++++++++++++++++++++
>  tools/perf/builtin-bench.c                  |   8 +
>  tools/perf/util/bpf_skel/bench_uprobe.bpf.c |  23 +++
>  7 files changed, 237 insertions(+)
>  create mode 100644 tools/perf/bench/uprobe.c
>  create mode 100644 tools/perf/util/bpf_skel/bench_uprobe.bpf.c
>
> --
> 2.41.0
>


* Re: [PATCHES/RFC 1/5] perf bench uprobe + BPF skel
  2023-07-19 22:41 ` [PATCHES/RFC 1/5] perf bench uprobe + BPF skel Ian Rogers
@ 2023-07-20 13:56   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-07-20 13:56 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Namhyung Kim, Ingo Molnar, Thomas Gleixner, Jiri Olsa,
	Adrian Hunter, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Masami Hiramatsu

On Wed, Jul 19, 2023 at 03:41:54PM -0700, Ian Rogers wrote:
> On Wed, Jul 19, 2023 at 1:49 PM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > Hi,
> >
> >         This adds a 'perf bench' to test the overhead of uprobes + BPF
> > programs, for now just a few simple tests, but I plan to make it
> > possible to specify the functions to attach the uprobe + BPF, other BPF
> > operations dealing with maps, etc.
> >
> >         This is how it looks like now:
> >
> >   [root@five ~]# perf bench uprobe all
> >   # Running uprobe/baseline benchmark...
> >   # Executed 1,000 usleep(1000) calls
> >        Total time: 1,053,963 usecs
> >
> >    1,053.963 usecs/op
> >
> >   # Running uprobe/empty benchmark...
> >   # Executed 1,000 usleep(1000) calls
> >        Total time: 1,056,293 usecs +2,330 to baseline
> >
> >    1,056.293 usecs/op 2.330 usecs/op to baseline
> >
> >   # Running uprobe/trace_printk benchmark...
> >   # Executed 1,000 usleep(1000) calls
> >        Total time: 1,056,977 usecs +3,014 to baseline +684 to previous
> >
> >    1,056.977 usecs/op 3.014 usecs/op to baseline 0.684 usecs/op to previous
> >
> >   [root@five ~]
> >
> > I put it here:
> >
> >   https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=perf-bench-uprobe
> >
> >   git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf-bench-uprobe
> >
> > Further ideas, problems?
> 
> No problems. Perhaps it would be interesting to measure the uprobe
> overhead compared to say the overhead attaching to the nanosleep
> syscall?

Can you rephrase your question?

The test is already comparing the overhead of attaching to the
clock_nanosleep syscall, which is what usleep() ends up calling, as
strace shows:

[root@five ~]# strace -c ~/bin/perf bench uprobe baseline
# Running 'uprobe/baseline' benchmark:
# Executed 1,000 usleep(1000) calls
     Total time: 1,077,139 usecs

 1,077.139 usecs/op
==7056==LeakSanitizer has encountered a fatal error.
==7056==HINT: For debugging, try setting environment variable LSAN_OPTIONS=verbosity=1:log_threads=1
==7056==HINT: LeakSanitizer does not work under ptrace (strace, gdb, etc)
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ------------------
 52.87    0.002973           2      1000           clock_nanosleep
 22.55    0.001268           3       370           mmap
  8.87    0.000499           4       106           read
  5.42    0.000305           4        62           munmap
  2.42    0.000136           3        38           openat
  1.69    0.000095           1        48           mprotect
  1.28    0.000072           1        57           close
  1.19    0.000067           3        18           open
  0.98    0.000055           1        40         1 newfstatat
  0.44    0.000025           0        30           pread64
  0.44    0.000025           6         4           getdents64
  0.32    0.000018          18         1           readlink
  0.28    0.000016           2         8           write
  0.23    0.000013           1         9         4 prctl
  0.21    0.000012           6         2         2 access
  0.12    0.000007           0         8           madvise
  0.11    0.000006           1         4           clock_gettime
  0.11    0.000006           1         4           prlimit64
  0.07    0.000004           1         3           rt_sigaction
  0.07    0.000004           1         4           sigaltstack
  0.07    0.000004           4         1           sched_getaffinity
  0.05    0.000003           0         6           getpid
  0.04    0.000002           0         3           rt_sigprocmask
  0.04    0.000002           1         2         1 arch_prctl
  0.04    0.000002           1         2           futex
  0.04    0.000002           2         1           set_robust_list
  0.02    0.000001           1         1           set_tid_address
  0.02    0.000001           1         1           rseq
  0.00    0.000000           0         1           brk
  0.00    0.000000           0        14           sched_yield
  0.00    0.000000           0         1           clone
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           wait4
  0.00    0.000000           0         1           gettid
------ ----------- ----------- --------- --------- ------------------
100.00    0.005623           3      1852         8 total
[root@five ~]#


* Re: [PATCHES/RFC 1/5] perf bench uprobe + BPF skel
  2023-07-19 20:49 [PATCHES/RFC 1/5] perf bench uprobe + BPF skel Arnaldo Carvalho de Melo
                   ` (5 preceding siblings ...)
  2023-07-19 22:41 ` [PATCHES/RFC 1/5] perf bench uprobe + BPF skel Ian Rogers
@ 2023-07-21 14:32 ` Masami Hiramatsu
  6 siblings, 0 replies; 12+ messages in thread
From: Masami Hiramatsu @ 2023-07-21 14:32 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Namhyung Kim, Ingo Molnar, Thomas Gleixner, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Masami Hiramatsu

On Wed, 19 Jul 2023 17:49:05 -0300
Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi,
> 
> 	This adds a 'perf bench' to test the overhead of uprobes + BPF
> programs, for now just a few simple tests, but I plan to make it
> possible to specify the functions to attach the uprobe + BPF, other BPF
> operations dealing with maps, etc.
> 
> 	This is how it looks like now:
> 
>   [root@five ~]# perf bench uprobe all
>   # Running uprobe/baseline benchmark...
>   # Executed 1,000 usleep(1000) calls
>        Total time: 1,053,963 usecs
>   
>    1,053.963 usecs/op
>   
>   # Running uprobe/empty benchmark...
>   # Executed 1,000 usleep(1000) calls
>        Total time: 1,056,293 usecs +2,330 to baseline
>   
>    1,056.293 usecs/op 2.330 usecs/op to baseline
>   
>   # Running uprobe/trace_printk benchmark...
>   # Executed 1,000 usleep(1000) calls
>        Total time: 1,056,977 usecs +3,014 to baseline +684 to previous
>   
>    1,056.977 usecs/op 3.014 usecs/op to baseline 0.684 usecs/op to previous
>   
>   [root@five ~]

Looks great! Maybe we can also add a kprobes benchmark (though the
results will depend on whether the probe is optimized or ftrace-based...)

Thank you,

> 
> I put it here:
> 
>   https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=perf-bench-uprobe
> 
>   git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf-bench-uprobe
> 
> Further ideas, problems?
> 
> - Arnaldo
> 
> 
> 
> Arnaldo Carvalho de Melo (5):
>   perf bench uprobe: Add benchmark to test uprobe overhead
>   perf bench uprobe: Print diff to baseline
>   perf bench uprobe: Show diff to previous
>   perf bench uprobe empty: Add entry attaching an empty BPF program
>   perf bench uprobe trace_printk: Add entry attaching an BPF program that does a trace_printk
> 
>  tools/perf/Documentation/perf-bench.txt     |   3 +
>  tools/perf/Makefile.perf                    |   1 +
>  tools/perf/bench/Build                      |   1 +
>  tools/perf/bench/bench.h                    |   3 +
>  tools/perf/bench/uprobe.c                   | 198 ++++++++++++++++++++
>  tools/perf/builtin-bench.c                  |   8 +
>  tools/perf/util/bpf_skel/bench_uprobe.bpf.c |  23 +++
>  7 files changed, 237 insertions(+)
>  create mode 100644 tools/perf/bench/uprobe.c
>  create mode 100644 tools/perf/util/bpf_skel/bench_uprobe.bpf.c
> 
> -- 
> 2.41.0
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>


* Re: [PATCH 2/5] perf bench uprobe: Print diff to baseline
  2023-07-19 20:49 ` [PATCH 2/5] perf bench uprobe: Print diff to baseline Arnaldo Carvalho de Melo
@ 2023-07-21 14:43   ` Masami Hiramatsu
  0 siblings, 0 replies; 12+ messages in thread
From: Masami Hiramatsu @ 2023-07-21 14:43 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Namhyung Kim, Ingo Molnar, Thomas Gleixner, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Andre Fredette,
	Dave Tucker, Derek Barbosa, Masami Hiramatsu

On Wed, 19 Jul 2023 17:49:07 -0300
Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> From: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> This is just prep work to show the diff to the unmodified workload.

Looks good to me, just one comment below.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

> 
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Andre Fredette <anfredet@redhat.com>
> Cc: Clark Williams <williams@redhat.com>
> Cc: Dave Tucker <datucker@redhat.com>
> Cc: Derek Barbosa <debarbos@redhat.com>
> Cc: Ian Rogers <irogers@google.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> ---
>  tools/perf/bench/uprobe.c | 27 ++++++++++++++++++++++++---
>  1 file changed, 24 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/bench/uprobe.c b/tools/perf/bench/uprobe.c
> index 707174220a76701f..60e7c43298d8cf56 100644
> --- a/tools/perf/bench/uprobe.c
> +++ b/tools/perf/bench/uprobe.c
> @@ -34,6 +34,29 @@ static const char * const bench_uprobe_usage[] = {
>  	NULL
>  };
>  
> +static int bench_uprobe_format__default_fprintf(const char *name, const char *unit, u64 diff, FILE *fp)
> +{
> +	static u64 baseline;
> +	s64 diff_to_baseline = diff - baseline;
> +	int printed = fprintf(fp, "# Executed %'d %s calls\n", loops, name);
> +
> +	printed += fprintf(fp, " %14s: %'" PRIu64 " %ss", "Total time", diff, unit);
> +
> +	if (baseline)
> +		printed += fprintf(fp, " %s%'" PRId64 " to baseline", diff_to_baseline > 0 ? "+" : "", diff_to_baseline);
> +
> +	printed += fprintf(fp, "\n\n %'.3f %ss/op", (double)diff / (double)loops, unit);

Just a nit: do we need to repeat the (double) cast on the denominator side too?
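
For reference on this nit: under C's usual arithmetic conversions, when one
operand of '/' is a double the other operand is converted to double before
the division, so a single cast should be enough. A minimal sketch, not the
perf code; the three function names are hypothetical and uint64_t stands in
for perf's u64:

```c
#include <stdint.h>

/* Both operands cast explicitly, as written in the patch. */
static double per_op_two_casts(uint64_t diff, int loops)
{
	return (double)diff / (double)loops;
}

/* Only the numerator is cast; 'loops' is converted to double implicitly. */
static double per_op_one_cast(uint64_t diff, int loops)
{
	return (double)diff / loops;
}

/* No cast at all: integer division, the fractional part is lost. */
static uint64_t per_op_no_cast(uint64_t diff, int loops)
{
	return diff / (uint64_t)loops;
}
```

With the baseline numbers from the cover letter (1,053,963 usecs over 1,000
loops) the first two variants return the same value, while the uncast
version truncates to 1053. The second cast is harmless, so keeping it is a
matter of style.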

> +
> +	if (baseline)
> +		printed += fprintf(fp, " %'.3f %ss/op to baseline", (double)diff_to_baseline / (double)loops, unit);
> +	else
> +		baseline = diff;
> +
> +	fputc('\n', fp);
> +
> +	return printed + 1;
> +}
> +
>  static int bench_uprobe(int argc, const char **argv)
>  {
>  	const char *name = "usleep(1000)", *unit = "usec";
> @@ -56,9 +79,7 @@ static int bench_uprobe(int argc, const char **argv)
>  
>  	switch (bench_format) {
>  	case BENCH_FORMAT_DEFAULT:
> -		printf("# Executed %'d %s calls\n", loops, name);
> -		printf(" %14s: %'" PRIu64 " %ss\n\n", "Total time", diff, unit);
> -		printf(" %'.3f %ss/op\n", (double)diff / (double)loops, unit);
> +		bench_uprobe_format__default_fprintf(name, unit, diff, stdout);
>  		break;
>  
>  	case BENCH_FORMAT_SIMPLE:
> -- 
> 2.41.0
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>


* Re: [PATCH 1/5] perf bench uprobe: Add benchmark to test uprobe overhead
  2023-07-19 20:49 ` [PATCH 1/5] perf bench uprobe: Add benchmark to test uprobe overhead Arnaldo Carvalho de Melo
@ 2023-07-21 14:45   ` Masami Hiramatsu
  0 siblings, 0 replies; 12+ messages in thread
From: Masami Hiramatsu @ 2023-07-21 14:45 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Namhyung Kim, Ingo Molnar, Thomas Gleixner, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Andre Fredette,
	Dave Tucker, Derek Barbosa, Masami Hiramatsu

On Wed, 19 Jul 2023 17:49:06 -0300
Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> From: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> This just adds the initial "workload", a call to libc's usleep(1000us)
> function:
> 
>   $ perf stat --null perf bench uprobe all
>   # Running uprobe/baseline benchmark...
>   # Executed 1000 usleep(1000) calls
>        Total time: 1053533 usecs
> 
>    1053.533 usecs/op
> 
>    Performance counter stats for 'perf bench uprobe all':
> 
>          1.061042896 seconds time elapsed
> 
>          0.001079000 seconds user
>          0.006499000 seconds sys
> 
>   $
> 
> More entries will be added using a BPF skel to add various uprobes to
> the usleep() function.

Looks good to me. 

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Thanks,
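
The workload described in the quoted commit message amounts to timing a loop
of 1 ms sleeps. A standalone sketch of that measurement follows; it is not
the perf code. time_sleep_loop() and timespec_diff_usecs() are hypothetical
names, nanosleep() and CLOCK_MONOTONIC are substitutions for the patch's
usleep() and CLOCK_REALTIME, and the NSEC_* macros are local stand-ins for
the ones in <linux/time64.h>:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>
#include <time.h>

/* Local stand-ins for the constants perf takes from <linux/time64.h>. */
#define NSEC_PER_SEC	1000000000ULL
#define NSEC_PER_USEC	1000ULL

/* Elapsed time between two timestamps, in microseconds (truncated). */
static uint64_t timespec_diff_usecs(const struct timespec *start,
				    const struct timespec *end)
{
	uint64_t s = (uint64_t)start->tv_sec * NSEC_PER_SEC + (uint64_t)start->tv_nsec;
	uint64_t e = (uint64_t)end->tv_sec * NSEC_PER_SEC + (uint64_t)end->tv_nsec;

	return (e - s) / NSEC_PER_USEC;
}

/* Sleep for 1 ms 'loops' times and return the total elapsed microseconds. */
static uint64_t time_sleep_loop(int loops)
{
	const struct timespec one_msec = { .tv_sec = 0, .tv_nsec = 1000000 };
	struct timespec start, end;
	int i;

	/* Assumption: a monotonic clock, so wall-clock adjustments cannot skew the result. */
	clock_gettime(CLOCK_MONOTONIC, &start);

	for (i = 0; i < loops; i++)
		nanosleep(&one_msec, NULL);

	clock_gettime(CLOCK_MONOTONIC, &end);

	return timespec_diff_usecs(&start, &end);
}
```

Calling time_sleep_loop(1000) and dividing the result by 1000.0 gives the
usecs/op figure; anything above 1,000 usecs/op is sleep and syscall
overhead, which is what the later uprobe entries are compared against.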

> 
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Andre Fredette <anfredet@redhat.com>
> Cc: Clark Williams <williams@redhat.com>
> Cc: Dave Tucker <datucker@redhat.com>
> Cc: Derek Barbosa <debarbos@redhat.com>
> Cc: Ian Rogers <irogers@google.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> ---
>  tools/perf/Documentation/perf-bench.txt |  3 +
>  tools/perf/bench/Build                  |  1 +
>  tools/perf/bench/bench.h                |  1 +
>  tools/perf/bench/uprobe.c               | 80 +++++++++++++++++++++++++
>  tools/perf/builtin-bench.c              |  6 ++
>  5 files changed, 91 insertions(+)
>  create mode 100644 tools/perf/bench/uprobe.c
> 
> diff --git a/tools/perf/Documentation/perf-bench.txt b/tools/perf/Documentation/perf-bench.txt
> index f04f0eaded985fc8..ca5789625cd2b8e5 100644
> --- a/tools/perf/Documentation/perf-bench.txt
> +++ b/tools/perf/Documentation/perf-bench.txt
> @@ -67,6 +67,9 @@ SUBSYSTEM
>  'internals'::
>  	Benchmark internal perf functionality.
>  
> +'uprobe'::
> +	Benchmark overhead of uprobe + BPF.
> +
>  'all'::
>  	All benchmark subsystems.
>  
> diff --git a/tools/perf/bench/Build b/tools/perf/bench/Build
> index 0f158dc8139bbd0d..47412d47dccfeff2 100644
> --- a/tools/perf/bench/Build
> +++ b/tools/perf/bench/Build
> @@ -16,6 +16,7 @@ perf-y += inject-buildid.o
>  perf-y += evlist-open-close.o
>  perf-y += breakpoint.o
>  perf-y += pmu-scan.o
> +perf-y += uprobe.o
>  
>  perf-$(CONFIG_X86_64) += mem-memcpy-x86-64-asm.o
>  perf-$(CONFIG_X86_64) += mem-memset-x86-64-asm.o
> diff --git a/tools/perf/bench/bench.h b/tools/perf/bench/bench.h
> index 0d2b65976212333a..201311f75c964df2 100644
> --- a/tools/perf/bench/bench.h
> +++ b/tools/perf/bench/bench.h
> @@ -42,6 +42,7 @@ int bench_inject_build_id(int argc, const char **argv);
>  int bench_evlist_open_close(int argc, const char **argv);
>  int bench_breakpoint_thread(int argc, const char **argv);
>  int bench_breakpoint_enable(int argc, const char **argv);
> +int bench_uprobe_baseline(int argc, const char **argv);
>  int bench_pmu_scan(int argc, const char **argv);
>  
>  #define BENCH_FORMAT_DEFAULT_STR	"default"
> diff --git a/tools/perf/bench/uprobe.c b/tools/perf/bench/uprobe.c
> new file mode 100644
> index 0000000000000000..707174220a76701f
> --- /dev/null
> +++ b/tools/perf/bench/uprobe.c
> @@ -0,0 +1,80 @@
> +// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +/*
> + * uprobe.c
> + *
> + * uprobe benchmarks
> + *
> + *  Copyright (C) 2023, Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com>
> + */
> +#include "../perf.h"
> +#include "../util/util.h"
> +#include <subcmd/parse-options.h>
> +#include "../builtin.h"
> +#include "bench.h"
> +#include <linux/time64.h>
> +
> +#include <inttypes.h>
> +#include <stdio.h>
> +#include <sys/time.h>
> +#include <sys/types.h>
> +#include <time.h>
> +#include <unistd.h>
> +#include <stdlib.h>
> +
> +#define LOOPS_DEFAULT 1000
> +static int loops = LOOPS_DEFAULT;
> +
> +static const struct option options[] = {
> +	OPT_INTEGER('l', "loop",	&loops,		"Specify number of loops"),
> +	OPT_END()
> +};
> +
> +static const char * const bench_uprobe_usage[] = {
> +	"perf bench uprobe <options>",
> +	NULL
> +};
> +
> +static int bench_uprobe(int argc, const char **argv)
> +{
> +	const char *name = "usleep(1000)", *unit = "usec";
> +	struct timespec start, end;
> +	u64 diff;
> +	int i;
> +
> +	argc = parse_options(argc, argv, options, bench_uprobe_usage, 0);
> +
> +	clock_gettime(CLOCK_REALTIME, &start);
> +
> +	for (i = 0; i < loops; i++) {
> +		usleep(USEC_PER_MSEC);
> +	}
> +
> +	clock_gettime(CLOCK_REALTIME, &end);
> +
> +	diff = end.tv_sec * NSEC_PER_SEC + end.tv_nsec - (start.tv_sec * NSEC_PER_SEC + start.tv_nsec);
> +	diff /= NSEC_PER_USEC;
> +
> +	switch (bench_format) {
> +	case BENCH_FORMAT_DEFAULT:
> +		printf("# Executed %'d %s calls\n", loops, name);
> +		printf(" %14s: %'" PRIu64 " %ss\n\n", "Total time", diff, unit);
> +		printf(" %'.3f %ss/op\n", (double)diff / (double)loops, unit);
> +		break;
> +
> +	case BENCH_FORMAT_SIMPLE:
> +		printf("%" PRIu64 "\n", diff);
> +		break;
> +
> +	default:
> +		/* reaching here is something of a disaster */
> +		fprintf(stderr, "Unknown format:%d\n", bench_format);
> +		exit(1);
> +	}
> +
> +	return 0;
> +}
> +
> +int bench_uprobe_baseline(int argc, const char **argv)
> +{
> +	return bench_uprobe(argc, argv);
> +}
> diff --git a/tools/perf/builtin-bench.c b/tools/perf/builtin-bench.c
> index db435b791a09b69b..09637aee83413e63 100644
> --- a/tools/perf/builtin-bench.c
> +++ b/tools/perf/builtin-bench.c
> @@ -104,6 +104,11 @@ static struct bench breakpoint_benchmarks[] = {
>  	{ NULL,	NULL, NULL },
>  };
>  
> +static struct bench uprobe_benchmarks[] = {
> +	{ "baseline",	"Baseline libc usleep(1000) call",	bench_uprobe_baseline,	},
> +	{ NULL,	NULL, NULL },
> +};
> +
>  struct collection {
>  	const char	*name;
>  	const char	*summary;
> @@ -123,6 +128,7 @@ static struct collection collections[] = {
>  #endif
>  	{ "internals",	"Perf-internals benchmarks",			internals_benchmarks	},
>  	{ "breakpoint",	"Breakpoint benchmarks",			breakpoint_benchmarks	},
> +	{ "uprobe",	"uprobe benchmarks",				uprobe_benchmarks	},
>  	{ "all",	"All benchmarks",				NULL			},
>  	{ NULL,		NULL,						NULL			}
>  };
> -- 
> 2.41.0
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>


* Re: [PATCH 3/5] perf bench uprobe: Show diff to previous
  2023-07-19 20:49 ` [PATCH 3/5] perf bench uprobe: Show diff to previous Arnaldo Carvalho de Melo
@ 2023-07-21 14:48   ` Masami Hiramatsu
  0 siblings, 0 replies; 12+ messages in thread
From: Masami Hiramatsu @ 2023-07-21 14:48 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Namhyung Kim, Ingo Molnar, Thomas Gleixner, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, Kate Carcia, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Andre Fredette,
	Dave Tucker, Derek Barbosa, Masami Hiramatsu

On Wed, 19 Jul 2023 17:49:08 -0300
Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> From: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> Will be useful to show the incremental overhead as we do more stuff in
> the BPF program attached to the uprobes.

Looks good to me. 

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

BTW, bench_uprobe_format__default_fprintf() looks generic enough for micro benchmarks.
Can it be shared with other benchmarks?
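
One way such sharing could look, as a rough sketch only: move the
function-local statics into a caller-owned struct so that each benchmark
collection keeps its own baseline. struct bench_delta_state and
bench_delta__fprintf() are hypothetical names, not existing perf APIs;
uint64_t/int64_t stand in for u64/s64, and the %' thousands-grouping flag
from the patch is dropped here to keep the output locale-independent:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical caller-owned state replacing the static baseline/previous. */
struct bench_delta_state {
	uint64_t baseline;	/* first result seen; 0 means "not set yet" */
	uint64_t previous;	/* result of the most recent run */
};

/*
 * Print the total and per-op time of one run, plus the deltas to the
 * baseline and to the previous run when they exist.
 * Returns the number of characters printed.
 */
static int bench_delta__fprintf(struct bench_delta_state *st, const char *name,
				const char *unit, uint64_t total, int loops, FILE *fp)
{
	int64_t to_baseline = (int64_t)(total - st->baseline);
	int64_t to_previous = (int64_t)(total - st->previous);
	int have_baseline = st->baseline != 0;
	int have_previous = have_baseline && st->previous != st->baseline;
	int printed = fprintf(fp, "# Executed %d %s calls\n", loops, name);

	printed += fprintf(fp, " %14s: %" PRIu64 " %ss", "Total time", total, unit);
	if (have_baseline)
		printed += fprintf(fp, " %s%" PRId64 " to baseline",
				   to_baseline > 0 ? "+" : "", to_baseline);
	if (have_previous)
		printed += fprintf(fp, " %s%" PRId64 " to previous",
				   to_previous > 0 ? "+" : "", to_previous);

	printed += fprintf(fp, "\n\n %.3f %ss/op", (double)total / loops, unit);
	if (have_baseline)
		printed += fprintf(fp, " %.3f %ss/op to baseline",
				   (double)to_baseline / loops, unit);
	if (have_previous)
		printed += fprintf(fp, " %.3f %ss/op to previous",
				   (double)to_previous / loops, unit);
	printed += fprintf(fp, "\n");

	/* The first run becomes the baseline; every run becomes "previous". */
	if (!have_baseline)
		st->baseline = total;
	st->previous = total;

	return printed;
}
```

A benchmark would declare one zero-initialized struct bench_delta_state and
pass it on every call. Fed the three totals from the cover letter
(1053963, 1056293, 1056977), the third call reports "+3014 to baseline"
and "+684 to previous".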

Thank you,

> 
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Andre Fredette <anfredet@redhat.com>
> Cc: Clark Williams <williams@redhat.com>
> Cc: Dave Tucker <datucker@redhat.com>
> Cc: Derek Barbosa <debarbos@redhat.com>
> Cc: Ian Rogers <irogers@google.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> ---
>  tools/perf/bench/uprobe.c | 21 ++++++++++++++++-----
>  1 file changed, 16 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/perf/bench/uprobe.c b/tools/perf/bench/uprobe.c
> index 60e7c43298d8cf56..a90e09f791c540a9 100644
> --- a/tools/perf/bench/uprobe.c
> +++ b/tools/perf/bench/uprobe.c
> @@ -36,24 +36,35 @@ static const char * const bench_uprobe_usage[] = {
>  
>  static int bench_uprobe_format__default_fprintf(const char *name, const char *unit, u64 diff, FILE *fp)
>  {
> -	static u64 baseline;
> -	s64 diff_to_baseline = diff - baseline;
> +	static u64 baseline, previous;
> +	s64 diff_to_baseline = diff - baseline,
> +	    diff_to_previous = diff - previous;
>  	int printed = fprintf(fp, "# Executed %'d %s calls\n", loops, name);
>  
>  	printed += fprintf(fp, " %14s: %'" PRIu64 " %ss", "Total time", diff, unit);
>  
> -	if (baseline)
> +	if (baseline) {
>  		printed += fprintf(fp, " %s%'" PRId64 " to baseline", diff_to_baseline > 0 ? "+" : "", diff_to_baseline);
>  
> +		if (previous != baseline)
> +			printed += fprintf(fp, " %s%'" PRId64 " to previous", diff_to_previous > 0 ? "+" : "", diff_to_previous);
> +	}
> +
>  	printed += fprintf(fp, "\n\n %'.3f %ss/op", (double)diff / (double)loops, unit);
>  
> -	if (baseline)
> +	if (baseline) {
>  		printed += fprintf(fp, " %'.3f %ss/op to baseline", (double)diff_to_baseline / (double)loops, unit);
> -	else
> +
> +		if (previous != baseline)
> +			printed += fprintf(fp, " %'.3f %ss/op to previous", (double)diff_to_previous / (double)loops, unit);
> +	} else {
>  		baseline = diff;
> +	}
>  
>  	fputc('\n', fp);
>  
> +	previous = diff;
> +
>  	return printed + 1;
>  }
>  
> -- 
> 2.41.0
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>


Thread overview: 12+ messages
2023-07-19 20:49 [PATCHES/RFC 1/5] perf bench uprobe + BPF skel Arnaldo Carvalho de Melo
2023-07-19 20:49 ` [PATCH 1/5] perf bench uprobe: Add benchmark to test uprobe overhead Arnaldo Carvalho de Melo
2023-07-21 14:45   ` Masami Hiramatsu
2023-07-19 20:49 ` [PATCH 2/5] perf bench uprobe: Print diff to baseline Arnaldo Carvalho de Melo
2023-07-21 14:43   ` Masami Hiramatsu
2023-07-19 20:49 ` [PATCH 3/5] perf bench uprobe: Show diff to previous Arnaldo Carvalho de Melo
2023-07-21 14:48   ` Masami Hiramatsu
2023-07-19 20:49 ` [PATCH 4/5] perf bench uprobe empty: Add entry attaching an empty BPF program Arnaldo Carvalho de Melo
2023-07-19 20:49 ` [PATCH 5/5] perf bench uprobe trace_printk: Add entry attaching an BPF program that does a trace_printk Arnaldo Carvalho de Melo
2023-07-19 22:41 ` [PATCHES/RFC 1/5] perf bench uprobe + BPF skel Ian Rogers
2023-07-20 13:56   ` Arnaldo Carvalho de Melo
2023-07-21 14:32 ` Masami Hiramatsu
