linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/8] perf trace: Enhanced augmentation for pointer arguments
@ 2024-08-24 16:33 Howard Chu
  2024-08-24 16:33 ` [PATCH v3 1/8] perf trace: Fix perf trace -p <PID> Howard Chu
                   ` (7 more replies)
  0 siblings, 8 replies; 27+ messages in thread
From: Howard Chu @ 2024-08-24 16:33 UTC (permalink / raw)
  To: acme
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel

Changes in v3:
- Prefer customized pretty printers to BTF general pretty printer
- Add --force-btf for debugging purpose (use BTF printer instead of the
customized ones that already exist)
- Add trap-handling cleanup to tests

Changes in v2:
- Fix perf trace workload bug.
- Rename pids_filtered to pids_filtered_out, and add pids_allowed to
avoid confusion.
- Add tests.

Forgot to add some before & afters in v1, here they are:

before:

# struct
perf $ perf trace -e epoll_wait
     0.068 (500.192 ms): Hyprland/539 epoll_wait(epfd: 3, events: 0x7ffd9f6f1730, maxevents: 32, timeout: 4294967295) = 1

# string 
perf $ perf trace -e renameat2 -- mv /tmp/f1 /tmp/f2
     0.024 ( 0.012 ms): mv/294902 renameat2(olddfd: CWD, oldname: "/tmp/f1", newdfd: CWD, newname: "")  = 0

# buffer
perf $ perf trace -e write echo "Hikawa Sayo"
Hikawa Sayo
     0.000 ( 0.011 ms): echo/928215 write(fd: 1, buf: 0x5b292f307410, count: 12)                          = 12

after:

# struct
perf $ perf trace -e epoll_wait
     0.023 (500.128 ms): Hyprland/539 epoll_wait(epfd: 3, events: {1,102459045712424,}, maxevents: 32, timeout: 4294967295) = 1

# string 
perf $ perf trace -e renameat2 -- mv /tmp/f1 /tmp/f2
     0.039 ( 0.018 ms): mv/295046 renameat2(olddfd: CWD, oldname: "/tmp/f1", newdfd: CWD, newname: "/tmp/f2") = 0

# buffer
perf $ perf trace -e write echo "Hikawa Sayo"
Hikawa Sayo
     0.000 ( 0.013 ms): echo/929159 write(fd: 1, buf: "Hikawa Sayo\10", count: 12)                        = 12

Still debugging read-like syscalls augmentation such as read, readlinkat
and gettimeofday. The support for read-like syscalls will be added in a
separated patch.




v1:

This patch series adds augmentation feature to struct pointer, string
and buffer arguments all-in-one. It also fixes 'perf trace -p <PID>'.

With this patch series, perf trace will augment struct pointers well, it
can be applied to syscalls such as clone3, epoll_wait, write, and so on.
But unfortunately, it only collects the data once, when syscall enters.
This makes syscalls that pass a pointer in order to let it get
written, not to be augmented very well, I call them the read-like
syscalls, because it reads from the kernel, using the syscall. This
patch series only augments write-like syscalls well.

Unfortunately, there are more read-like syscalls(such as read,
readlinkat, even gettimeofday) than write-like syscalls(write, pwrite64,
epoll_wait, clone3).

Here are three test scripts that I find useful:

pwrite64
```
 #include <unistd.h>
 #include <sys/syscall.h>

int main()
{
	int i1 = 1, i2 = 2, i3 = 3, i4 = 4;
	char s1[] = "DI\0NGZ\0HE\1N", s2[] = "XUEBAO";

	while (1) {
		syscall(SYS_pwrite64, i1, s1, sizeof(s1), i2);
		sleep(1);
	}

	return 0;
}
```

epoll_wait
```
 #include <unistd.h>
 #include <sys/epoll.h>
 #include <stdlib.h>
 #include <string.h>

#define MAXEVENTS 2

int main()
{
	int i1 = 1, i2 = 2, i3 = 3, i4 = 4;
	char s1[] = "DINGZHEN", s2[] = "XUEBAO";

	struct epoll_event ee = {
		.events = 114,
		.data.ptr = NULL,
	};

	struct epoll_event *events = calloc(MAXEVENTS, sizeof(struct epoll_event));
	memcpy(events, &ee, sizeof(ee));

	while (1) {
		epoll_wait(i1, events, i2, i3);
		sleep(1);
	}

	return 0;
}
```

clone3
```
 #include <unistd.h>
 #include <sys/syscall.h>
 #include <linux/sched.h>
 #include <string.h>
 #include <stdio.h>
 #include <stdlib.h>

int main()
{
	int i1 = 1, i2 = 2, i3 = 3, i4 = 4;
	char s1[] = "DINGZHEN", s2[] = "XUEBAO";

	struct clone_args cla = {
		.flags = 1,
		.pidfd = 1,
		.child_tid = 4,
		.parent_tid = 5,
		.exit_signal = 1,
		.stack = 4,
		.stack_size = 1,
		.tls = 9,
		.set_tid = 1,
		.set_tid_size = 9,
		.cgroup = 8,
	};

	while (1) {
		syscall(SYS_clone3, &cla, i1);
		sleep(1);
	}

	return 0;
}
```

Arnaldo Carvalho de Melo (1):
  perf trace: Pass the richer 'struct syscall_arg' pointer to
    trace__btf_scnprintf()

Howard Chu (7):
  perf trace: Fix perf trace -p <PID>
  perf trace: Add trace__bpf_sys_enter_beauty_map() to prepare for
    fetching data in BPF
  perf trace: Pretty print struct data
  perf trace: Pretty print buffer data
  perf trace: Collect augmented data using BPF
  perf trace: Add --force-btf for debugging
  perf trace: Add general tests for augmented syscalls

 tools/perf/builtin-trace.c                    | 217 +++++++++++++++++-
 tools/perf/tests/shell/trace_btf_general.sh   |  67 ++++++
 tools/perf/trace/beauty/perf_event_open.c     |   2 +-
 tools/perf/trace/beauty/sockaddr.c            |   2 +-
 tools/perf/trace/beauty/timespec.c            |   2 +-
 .../bpf_skel/augmented_raw_syscalls.bpf.c     | 114 ++++++++-
 tools/perf/util/evlist.c                      |   2 +-
 tools/perf/util/trace_augment.h               |   6 +
 8 files changed, 399 insertions(+), 13 deletions(-)
 create mode 100755 tools/perf/tests/shell/trace_btf_general.sh
 create mode 100644 tools/perf/util/trace_augment.h

-- 
2.45.2


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v3 1/8] perf trace: Fix perf trace -p <PID>
  2024-08-24 16:33 [PATCH v3 0/8] perf trace: Enhanced augmentation for pointer arguments Howard Chu
@ 2024-08-24 16:33 ` Howard Chu
  2024-08-24 16:33 ` [PATCH v3 2/8] perf trace: Add trace__bpf_sys_enter_beauty_map() to prepare for fetching data in BPF Howard Chu
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 27+ messages in thread
From: Howard Chu @ 2024-08-24 16:33 UTC (permalink / raw)
  To: acme
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

'perf trace -p <PID>' work on a syscall that is unaugmented, but doesn't
work on a syscall that's augmented (when it calls perf_event_output() in
BPF).

Let's take open() as an example. open() is augmented in perf trace.

Before:

  $ perf trace -e open -p 3792392
     ? (         ):  ... [continued]: open()) = -1 ENOENT (No such file or directory)
     ? (         ):  ... [continued]: open()) = -1 ENOENT (No such file or directory)

We can see there's no output.

After:

   $ perf trace -e open -p 3792392
      0.000 ( 0.123 ms): a.out/3792392 open(filename: "DINGZHEN", flags: WRONLY) = -1 ENOENT (No such file or directory)
   1000.398 ( 0.116 ms): a.out/3792392 open(filename: "DINGZHEN", flags: WRONLY) = -1 ENOENT (No such file or directory)

Reason:

bpf_perf_event_output() will fail when you specify a pid in 'perf trace' (EOPNOTSUPP).

When using 'perf trace -p 114', before perf_event_open(), we'll have PID
= 114, and CPU = -1.

This is bad for bpf-output event, because the ring buffer won't accept
output from BPF's perf_event_output(), making it fail. I'm still trying
to find out why.

If we open bpf-output for every cpu, instead of setting it to -1, like
this:

  PID = <PID>, CPU = 0
  PID = <PID>, CPU = 1
  PID = <PID>, CPU = 2
  PID = <PID>, CPU = 3

Everything works.

You can test it with this script (open.c):

  #include <unistd.h>
  #include <sys/syscall.h>

  int main()
  {
	int i1 = 1, i2 = 2, i3 = 3, i4 = 4;
	char s1[] = "DINGZHEN", s2[] = "XUEBAO";

	while (1) {
		syscall(SYS_open, s1, i1, i2);
		sleep(1);
	}

	return 0;
  }

save, compile:

  make open

perf trace:

  perf trace -e open <path-to-the-executable>

Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240815013626.935097-2-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index ef58a7764318..f14b7e6ff1dc 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1067,7 +1067,7 @@ int evlist__create_maps(struct evlist *evlist, struct target *target)
 	if (!threads)
 		return -1;
 
-	if (target__uses_dummy_map(target))
+	if (target__uses_dummy_map(target) && !evlist__has_bpf_output(evlist))
 		cpus = perf_cpu_map__new_any_cpu();
 	else
 		cpus = perf_cpu_map__new(target->cpu_list);
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 2/8] perf trace: Add trace__bpf_sys_enter_beauty_map() to prepare for fetching data in BPF
  2024-08-24 16:33 [PATCH v3 0/8] perf trace: Enhanced augmentation for pointer arguments Howard Chu
  2024-08-24 16:33 ` [PATCH v3 1/8] perf trace: Fix perf trace -p <PID> Howard Chu
@ 2024-08-24 16:33 ` Howard Chu
  2024-09-09 19:45   ` Arnaldo Carvalho de Melo
  2024-08-24 16:33 ` [PATCH v3 3/8] perf trace: Pass the richer 'struct syscall_arg' pointer to trace__btf_scnprintf() Howard Chu
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: Howard Chu @ 2024-08-24 16:33 UTC (permalink / raw)
  To: acme
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

Set up beauty_map, load it to BPF, in such format: if argument No.3 is a
struct of size 32 bytes (of syscall number 114) beauty_map[114][2] = 32;

if argument No.3 is a string (of syscall number 114) beauty_map[114][2] =
1;

if argument No.3 is a buffer, its size is indicated by argument No.4 (of
syscall number 114) beauty_map[114][2] = -4; /* -1 ~ -6, we'll read this
buffer size in BPF  */

Committer notes:

Moved syscall_arg_fmt__cache_btf_struct() from a ifdef
HAVE_LIBBPF_SUPPORT to closer to where it is used, that is ifdef'ed on
HAVE_BPF_SKEL and thus breaks the build when building with
BUILD_BPF_SKEL=0, as detected using 'make -C tools/perf build-test'.

Signed-off-by: Howard Chu <howardchu95@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240815013626.935097-4-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 106 +++++++++++++++++++++++++++++++++++++
 1 file changed, 106 insertions(+)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index d6ca541fdc78..c26eab196623 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -113,6 +113,7 @@ struct syscall_arg_fmt {
 	bool	   show_zero;
 #ifdef HAVE_LIBBPF_SUPPORT
 	const struct btf_type *type;
+	int	   type_id; /* used in btf_dump */
 #endif
 };
 
@@ -3446,6 +3447,23 @@ static int trace__set_ev_qualifier_tp_filter(struct trace *trace)
 }
 
 #ifdef HAVE_BPF_SKEL
+static int syscall_arg_fmt__cache_btf_struct(struct syscall_arg_fmt *arg_fmt, struct btf *btf, char *type)
+{
+       int id;
+
+	if (arg_fmt->type != NULL)
+		return -1;
+
+       id = btf__find_by_name(btf, type);
+       if (id < 0)
+		return -1;
+
+       arg_fmt->type    = btf__type_by_id(btf, id);
+       arg_fmt->type_id = id;
+
+       return 0;
+}
+
 static struct bpf_program *trace__find_bpf_program_by_title(struct trace *trace, const char *name)
 {
 	struct bpf_program *pos, *prog = NULL;
@@ -3521,6 +3539,83 @@ static int trace__bpf_prog_sys_exit_fd(struct trace *trace, int id)
 	return sc ? bpf_program__fd(sc->bpf_prog.sys_exit) : bpf_program__fd(trace->skel->progs.syscall_unaugmented);
 }
 
+static int trace__bpf_sys_enter_beauty_map(struct trace *trace, int key, unsigned int *beauty_array)
+{
+	struct tep_format_field *field;
+	struct syscall *sc = trace__syscall_info(trace, NULL, key);
+	const struct btf_type *bt;
+	char *struct_offset, *tmp, name[32];
+	bool can_augment = false;
+	int i, cnt;
+
+	if (sc == NULL)
+		return -1;
+
+	trace__load_vmlinux_btf(trace);
+	if (trace->btf == NULL)
+		return -1;
+
+	for (i = 0, field = sc->args; field; ++i, field = field->next) {
+		struct_offset = strstr(field->type, "struct ");
+
+		if (field->flags & TEP_FIELD_IS_POINTER && struct_offset) { /* struct */
+			struct_offset += 7;
+
+			/* for 'struct foo *', we only want 'foo' */
+			for (tmp = struct_offset, cnt = 0; *tmp != ' ' && *tmp != '\0'; ++tmp, ++cnt) {
+			}
+
+			strncpy(name, struct_offset, cnt);
+			name[cnt] = '\0';
+
+			/* cache struct's btf_type and type_id */
+			if (syscall_arg_fmt__cache_btf_struct(&sc->arg_fmt[i], trace->btf, name))
+				continue;
+
+			bt = sc->arg_fmt[i].type;
+			beauty_array[i] = bt->size;
+			can_augment = true;
+		} else if (field->flags & TEP_FIELD_IS_POINTER && /* string */
+		    strcmp(field->type, "const char *") == 0 &&
+		    (strstr(field->name, "name") ||
+		     strstr(field->name, "path") ||
+		     strstr(field->name, "file") ||
+		     strstr(field->name, "root") ||
+		     strstr(field->name, "key") ||
+		     strstr(field->name, "special") ||
+		     strstr(field->name, "type") ||
+		     strstr(field->name, "description"))) {
+			beauty_array[i] = 1;
+			can_augment = true;
+		} else if (field->flags & TEP_FIELD_IS_POINTER && /* buffer */
+			   strstr(field->type, "char *") &&
+			   (strstr(field->name, "buf") ||
+			    strstr(field->name, "val") ||
+			    strstr(field->name, "msg"))) {
+			int j;
+			struct tep_format_field *field_tmp;
+
+			/* find the size of the buffer that appears in pairs with buf */
+			for (j = 0, field_tmp = sc->args; field_tmp; ++j, field_tmp = field_tmp->next) {
+				if (!(field_tmp->flags & TEP_FIELD_IS_POINTER) && /* only integers */
+				    (strstr(field_tmp->name, "count") ||
+				     strstr(field_tmp->name, "siz") ||  /* size, bufsiz */
+				     (strstr(field_tmp->name, "len") && strcmp(field_tmp->name, "filename")))) {
+					 /* filename's got 'len' in it, we don't want that */
+					beauty_array[i] = -(j + 1);
+					can_augment = true;
+					break;
+				}
+			}
+		}
+	}
+
+	if (can_augment)
+		return 0;
+
+	return -1;
+}
+
 static struct bpf_program *trace__find_usable_bpf_prog_entry(struct trace *trace, struct syscall *sc)
 {
 	struct tep_format_field *field, *candidate_field;
@@ -3625,7 +3720,9 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
 {
 	int map_enter_fd = bpf_map__fd(trace->skel->maps.syscalls_sys_enter);
 	int map_exit_fd  = bpf_map__fd(trace->skel->maps.syscalls_sys_exit);
+	int beauty_map_fd = bpf_map__fd(trace->skel->maps.beauty_map_enter);
 	int err = 0;
+	unsigned int beauty_array[6];
 
 	for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
 		int prog_fd, key = syscalltbl__id_at_idx(trace->sctbl, i);
@@ -3644,6 +3741,15 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
 		err = bpf_map_update_elem(map_exit_fd, &key, &prog_fd, BPF_ANY);
 		if (err)
 			break;
+
+		/* use beauty_map to tell BPF how many bytes to collect, set beauty_map's value here */
+		memset(beauty_array, 0, sizeof(beauty_array));
+		err = trace__bpf_sys_enter_beauty_map(trace, key, (unsigned int *)beauty_array);
+		if (err)
+			continue;
+		err = bpf_map_update_elem(beauty_map_fd, &key, beauty_array, BPF_ANY);
+		if (err)
+			break;
 	}
 
 	/*
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 3/8] perf trace: Pass the richer 'struct syscall_arg' pointer to trace__btf_scnprintf()
  2024-08-24 16:33 [PATCH v3 0/8] perf trace: Enhanced augmentation for pointer arguments Howard Chu
  2024-08-24 16:33 ` [PATCH v3 1/8] perf trace: Fix perf trace -p <PID> Howard Chu
  2024-08-24 16:33 ` [PATCH v3 2/8] perf trace: Add trace__bpf_sys_enter_beauty_map() to prepare for fetching data in BPF Howard Chu
@ 2024-08-24 16:33 ` Howard Chu
  2024-08-24 16:33 ` [PATCH v3 4/8] perf trace: Pretty print struct data Howard Chu
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 27+ messages in thread
From: Howard Chu @ 2024-08-24 16:33 UTC (permalink / raw)
  To: acme
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Since we'll need it later in the current patch series and we can get the
syscall_arg_fmt from syscall_arg->fmt.

Based-on-a-patch-by: Howard Chu <howardchu95@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Zsd8vqCrTh5h69rp@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
---
 tools/perf/builtin-trace.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index c26eab196623..43b1f63415b4 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -990,9 +990,11 @@ static size_t btf_enum_scnprintf(const struct btf_type *type, struct btf *btf, c
 	return 0;
 }
 
-static size_t trace__btf_scnprintf(struct trace *trace, struct syscall_arg_fmt *arg_fmt, char *bf,
+static size_t trace__btf_scnprintf(struct trace *trace, struct syscall_arg *arg, char *bf,
 				   size_t size, int val, char *type)
 {
+	struct syscall_arg_fmt *arg_fmt = arg->fmt;
+
 	if (trace->btf == NULL)
 		return 0;
 
@@ -1012,7 +1014,7 @@ static size_t trace__btf_scnprintf(struct trace *trace, struct syscall_arg_fmt *
 }
 
 #else // HAVE_LIBBPF_SUPPORT
-static size_t trace__btf_scnprintf(struct trace *trace __maybe_unused, struct syscall_arg_fmt *arg_fmt __maybe_unused,
+static size_t trace__btf_scnprintf(struct trace *trace __maybe_unused, struct syscall_arg *arg __maybe_unused,
 				   char *bf __maybe_unused, size_t size __maybe_unused, int val __maybe_unused,
 				   char *type __maybe_unused)
 {
@@ -2261,7 +2263,7 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
 			if (trace->show_arg_names)
 				printed += scnprintf(bf + printed, size - printed, "%s: ", field->name);
 
-			btf_printed = trace__btf_scnprintf(trace, &sc->arg_fmt[arg.idx], bf + printed,
+			btf_printed = trace__btf_scnprintf(trace, &arg, bf + printed,
 							   size - printed, val, field->type);
 			if (btf_printed) {
 				printed += btf_printed;
@@ -2964,7 +2966,7 @@ static size_t trace__fprintf_tp_fields(struct trace *trace, struct evsel *evsel,
 		if (trace->show_arg_names)
 			printed += scnprintf(bf + printed, size - printed, "%s: ", field->name);
 
-		btf_printed = trace__btf_scnprintf(trace, arg, bf + printed, size - printed, val, field->type);
+		btf_printed = trace__btf_scnprintf(trace, &syscall_arg, bf + printed, size - printed, val, field->type);
 		if (btf_printed) {
 			printed += btf_printed;
 			continue;
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 4/8] perf trace: Pretty print struct data
  2024-08-24 16:33 [PATCH v3 0/8] perf trace: Enhanced augmentation for pointer arguments Howard Chu
                   ` (2 preceding siblings ...)
  2024-08-24 16:33 ` [PATCH v3 3/8] perf trace: Pass the richer 'struct syscall_arg' pointer to trace__btf_scnprintf() Howard Chu
@ 2024-08-24 16:33 ` Howard Chu
  2024-08-28 21:04   ` Arnaldo Carvalho de Melo
  2024-08-24 16:33 ` [PATCH v3 5/8] perf trace: Pretty print buffer data Howard Chu
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: Howard Chu @ 2024-08-24 16:33 UTC (permalink / raw)
  To: acme
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

Change the arg->augmented.args to arg->augmented.args->value to skip the
header for customized pretty printers, since we collect data in BPF
using the general augment_sys_enter(), which always adds the header.

Use btf_dump API to pretty print augmented struct pointer.

Prefer existed pretty-printer than btf general pretty-printer.

set compact = true and skip_names = true, so that no newline character
and argument name are printed.

Committer notes:

Simplified the btf_dump_snprintf callback to avoid using multiple
buffers, as discussed in the thread accessible via the Link tag below.

Signed-off-by: Howard Chu <howardchu95@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240815013626.935097-7-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c                | 65 +++++++++++++++++++++--
 tools/perf/trace/beauty/perf_event_open.c |  2 +-
 tools/perf/trace/beauty/sockaddr.c        |  2 +-
 tools/perf/trace/beauty/timespec.c        |  2 +-
 4 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 43b1f63415b4..048bcb92624c 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -990,6 +990,54 @@ static size_t btf_enum_scnprintf(const struct btf_type *type, struct btf *btf, c
 	return 0;
 }
 
+struct trace_btf_dump_snprintf_ctx {
+	char   *bf;
+	size_t printed, size;
+};
+
+static void trace__btf_dump_snprintf(void *vctx, const char *fmt, va_list args)
+{
+	struct trace_btf_dump_snprintf_ctx *ctx = vctx;
+
+	ctx->printed += vscnprintf(ctx->bf + ctx->printed, ctx->size - ctx->printed, fmt, args);
+}
+
+static size_t btf_struct_scnprintf(const struct btf_type *type, struct btf *btf, char *bf, size_t size, struct syscall_arg *arg)
+{
+	struct trace_btf_dump_snprintf_ctx ctx = {
+		.bf   = bf,
+		.size = size,
+	};
+	struct augmented_arg *augmented_arg = arg->augmented.args;
+	int type_id = arg->fmt->type_id, consumed;
+	struct btf_dump *btf_dump;
+
+	LIBBPF_OPTS(btf_dump_opts, dump_opts);
+	LIBBPF_OPTS(btf_dump_type_data_opts, dump_data_opts);
+
+	if (arg == NULL || arg->augmented.args == NULL)
+		return 0;
+
+	dump_data_opts.compact     = true;
+	dump_data_opts.skip_names  = true;
+
+	btf_dump = btf_dump__new(btf, trace__btf_dump_snprintf, &ctx, &dump_opts);
+	if (btf_dump == NULL)
+		return 0;
+
+	/* pretty print the struct data here */
+	if (btf_dump__dump_type_data(btf_dump, type_id, arg->augmented.args->value, type->size, &dump_data_opts) == 0)
+		return 0;
+
+	consumed = sizeof(*augmented_arg) + augmented_arg->size;
+	arg->augmented.args = ((void *)arg->augmented.args) + consumed;
+	arg->augmented.size -= consumed;
+
+	btf_dump__free(btf_dump);
+
+	return ctx.printed;
+}
+
 static size_t trace__btf_scnprintf(struct trace *trace, struct syscall_arg *arg, char *bf,
 				   size_t size, int val, char *type)
 {
@@ -1009,6 +1057,8 @@ static size_t trace__btf_scnprintf(struct trace *trace, struct syscall_arg *arg,
 
 	if (btf_is_enum(arg_fmt->type))
 		return btf_enum_scnprintf(arg_fmt->type, trace->btf, bf, size, val);
+	else if (btf_is_struct(arg_fmt->type))
+		return btf_struct_scnprintf(arg_fmt->type, trace->btf, bf, size, arg);
 
 	return 0;
 }
@@ -2222,6 +2272,7 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
 		.show_string_prefix = trace->show_string_prefix,
 	};
 	struct thread_trace *ttrace = thread__priv(thread);
+	void *default_scnprintf;
 
 	/*
 	 * Things like fcntl will set this in its 'cmd' formatter to pick the
@@ -2263,11 +2314,15 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
 			if (trace->show_arg_names)
 				printed += scnprintf(bf + printed, size - printed, "%s: ", field->name);
 
-			btf_printed = trace__btf_scnprintf(trace, &arg, bf + printed,
-							   size - printed, val, field->type);
-			if (btf_printed) {
-				printed += btf_printed;
-				continue;
+			default_scnprintf = sc->arg_fmt[arg.idx].scnprintf;
+
+			if (default_scnprintf == NULL || default_scnprintf == SCA_PTR) {
+				btf_printed = trace__btf_scnprintf(trace, &arg, bf + printed,
+								   size - printed, val, field->type);
+				if (btf_printed) {
+					printed += btf_printed;
+					continue;
+				}
 			}
 
 			printed += syscall_arg_fmt__scnprintf_val(&sc->arg_fmt[arg.idx],
diff --git a/tools/perf/trace/beauty/perf_event_open.c b/tools/perf/trace/beauty/perf_event_open.c
index 01ee15fe9d0c..632237128640 100644
--- a/tools/perf/trace/beauty/perf_event_open.c
+++ b/tools/perf/trace/beauty/perf_event_open.c
@@ -76,7 +76,7 @@ static size_t perf_event_attr___scnprintf(struct perf_event_attr *attr, char *bf
 
 static size_t syscall_arg__scnprintf_augmented_perf_event_attr(struct syscall_arg *arg, char *bf, size_t size)
 {
-	return perf_event_attr___scnprintf((void *)arg->augmented.args, bf, size, arg->trace->show_zeros);
+	return perf_event_attr___scnprintf((void *)arg->augmented.args->value, bf, size, arg->trace->show_zeros);
 }
 
 static size_t syscall_arg__scnprintf_perf_event_attr(char *bf, size_t size, struct syscall_arg *arg)
diff --git a/tools/perf/trace/beauty/sockaddr.c b/tools/perf/trace/beauty/sockaddr.c
index 2e0e867c0c1b..6ecebf776899 100644
--- a/tools/perf/trace/beauty/sockaddr.c
+++ b/tools/perf/trace/beauty/sockaddr.c
@@ -47,7 +47,7 @@ static size_t (*af_scnprintfs[])(struct sockaddr *sa, char *bf, size_t size) = {
 
 static size_t syscall_arg__scnprintf_augmented_sockaddr(struct syscall_arg *arg, char *bf, size_t size)
 {
-	struct sockaddr *sa = (struct sockaddr *)arg->augmented.args;
+	struct sockaddr *sa = (struct sockaddr *)arg->augmented.args->value;
 	char family[32];
 	size_t printed;
 
diff --git a/tools/perf/trace/beauty/timespec.c b/tools/perf/trace/beauty/timespec.c
index e1a61f092aad..b14ab72a2738 100644
--- a/tools/perf/trace/beauty/timespec.c
+++ b/tools/perf/trace/beauty/timespec.c
@@ -7,7 +7,7 @@
 
 static size_t syscall_arg__scnprintf_augmented_timespec(struct syscall_arg *arg, char *bf, size_t size)
 {
-	struct timespec *ts = (struct timespec *)arg->augmented.args;
+	struct timespec *ts = (struct timespec *)arg->augmented.args->value;
 
 	return scnprintf(bf, size, "{ .tv_sec: %" PRIu64 ", .tv_nsec: %" PRIu64 " }", ts->tv_sec, ts->tv_nsec);
 }
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 5/8] perf trace: Pretty print buffer data
  2024-08-24 16:33 [PATCH v3 0/8] perf trace: Enhanced augmentation for pointer arguments Howard Chu
                   ` (3 preceding siblings ...)
  2024-08-24 16:33 ` [PATCH v3 4/8] perf trace: Pretty print struct data Howard Chu
@ 2024-08-24 16:33 ` Howard Chu
  2024-09-09 16:33   ` Arnaldo Carvalho de Melo
  2024-08-24 16:33 ` [PATCH v3 6/8] perf trace: Collect augmented data using BPF Howard Chu
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: Howard Chu @ 2024-08-24 16:33 UTC (permalink / raw)
  To: acme
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

Define TRACE_AUG_MAX_BUF in trace_augment.h data, which is the maximum
buffer size we can augment. BPF will include this header too.

Print buffer in a way that's different than just printing a string, we
print all the control characters in \digits (such as \0 for null, and
\10 for newline, LF).

For character that has a bigger value than 127, we print the digits
instead of the character itself as well.

Committer notes:

Simplified the buffer scnprintf to avoid using multiple buffers as
discussed in the patch review thread.

Signed-off-by: Howard Chu <howardchu95@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240815013626.935097-8-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c      | 33 +++++++++++++++++++++++++++++++++
 tools/perf/util/trace_augment.h |  6 ++++++
 2 files changed, 39 insertions(+)
 create mode 100644 tools/perf/util/trace_augment.h

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 048bcb92624c..470d74e3f875 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -65,6 +65,7 @@
 #include "syscalltbl.h"
 #include "rb_resort.h"
 #include "../perf.h"
+#include "trace_augment.h"
 
 #include <errno.h>
 #include <inttypes.h>
@@ -852,6 +853,10 @@ static size_t syscall_arg__scnprintf_filename(char *bf, size_t size,
 
 #define SCA_FILENAME syscall_arg__scnprintf_filename
 
+static size_t syscall_arg__scnprintf_buf(char *bf, size_t size, struct syscall_arg *arg);
+
+#define SCA_BUF syscall_arg__scnprintf_buf
+
 static size_t syscall_arg__scnprintf_pipe_flags(char *bf, size_t size,
 						struct syscall_arg *arg)
 {
@@ -1745,6 +1750,32 @@ static size_t syscall_arg__scnprintf_filename(char *bf, size_t size,
 	return 0;
 }
 
+#define MAX_CONTROL_CHAR 31
+#define MAX_ASCII 127
+
+static size_t syscall_arg__scnprintf_buf(char *bf, size_t size, struct syscall_arg *arg)
+{
+	struct augmented_arg *augmented_arg = arg->augmented.args;
+	unsigned char *orig = (unsigned char *)augmented_arg->value;
+	size_t printed = 0;
+	int consumed;
+
+	if (augmented_arg == NULL)
+		return 0;
+
+	for (int j = 0; j < augmented_arg->size; ++j) {
+		bool control_char = orig[j] <= MAX_CONTROL_CHAR || orig[j] >= MAX_ASCII;
+		/* print control characters (0~31 and 127), and non-ascii characters in \(digits) */
+		printed += scnprintf(bf + printed, size - printed, control_char ? "\\%d" : "%c", (int)orig[j]);
+	}
+
+	consumed = sizeof(*augmented_arg) + augmented_arg->size;
+	arg->augmented.args = ((void *)arg->augmented.args) + consumed;
+	arg->augmented.size -= consumed;
+
+	return printed;
+}
+
 static bool trace__filter_duration(struct trace *trace, double t)
 {
 	return t < (trace->duration_filter * NSEC_PER_MSEC);
@@ -1956,6 +1987,8 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
 		    ((len >= 4 && strcmp(field->name + len - 4, "name") == 0) ||
 		     strstr(field->name, "path") != NULL))
 			arg->scnprintf = SCA_FILENAME;
+		else if (strstr(field->type, "char *") && strstr(field->name, "buf"))
+			arg->scnprintf = SCA_BUF;
 		else if ((field->flags & TEP_FIELD_IS_POINTER) || strstr(field->name, "addr"))
 			arg->scnprintf = SCA_PTR;
 		else if (strcmp(field->type, "pid_t") == 0)
diff --git a/tools/perf/util/trace_augment.h b/tools/perf/util/trace_augment.h
new file mode 100644
index 000000000000..57a3e5045937
--- /dev/null
+++ b/tools/perf/util/trace_augment.h
@@ -0,0 +1,6 @@
+#ifndef TRACE_AUGMENT_H
+#define TRACE_AUGMENT_H
+
+#define TRACE_AUG_MAX_BUF 32 /* for buffer augmentation in perf trace */
+
+#endif
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 6/8] perf trace: Collect augmented data using BPF
  2024-08-24 16:33 [PATCH v3 0/8] perf trace: Enhanced augmentation for pointer arguments Howard Chu
                   ` (4 preceding siblings ...)
  2024-08-24 16:33 ` [PATCH v3 5/8] perf trace: Pretty print buffer data Howard Chu
@ 2024-08-24 16:33 ` Howard Chu
  2024-09-04 19:52   ` Arnaldo Carvalho de Melo
  2024-08-24 16:33 ` [PATCH v3 7/8] perf trace: Add --force-btf for debugging Howard Chu
  2024-08-24 16:33 ` [PATCH v3 8/8] perf trace: Add general tests for augmented syscalls Howard Chu
  7 siblings, 1 reply; 27+ messages in thread
From: Howard Chu @ 2024-08-24 16:33 UTC (permalink / raw)
  To: acme
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

Include trace_augment.h for TRACE_AUG_MAX_BUF, so that BPF reads
TRACE_AUG_MAX_BUF bytes of buffer maximum.

Determine what type of argument and how many bytes to read from user space, us ing the
value in the beauty_map. This is the relation of parameter type and its corres ponding
value in the beauty map, and how many bytes we read eventually:

string: 1                          -> size of string (till null)
struct: size of struct             -> size of struct
buffer: -1 * (index of paired len) -> value of paired len (maximum: TRACE_AUG_ MAX_BUF)

After reading from user space, we output the augmented data using
bpf_perf_event_output().

If the struct augmenter, augment_sys_enter() failed, we fall back to
using bpf_tail_call().

I have to make the payload 6 times the size of augmented_arg, to pass the
BPF verifier.

Committer notes:

It works, but we need to wire it up to the userspace specialized pretty
printer, otherwise we get things like:

  root@number:~# perf trace -e connect ssh localhost
       0.000 ( 0.010 ms): :784442/784442 connect(fd: 3, uservaddr: {2,}, addrlen: 16)                          = 0
       0.016 ( 0.006 ms): :784442/784442 connect(fd: 3, uservaddr: {10,}, addrlen: 28)                         = 0
       0.033 ( 0.096 ms): :784442/784442 connect(fd: 3, uservaddr: {10,}, addrlen: 28)                         = 0
  root@localhost's password:     71.292 ( 0.037 ms): ssh/784442 connect(fd: 4, uservaddr: {1,{['/','v','a','r','/','r','u','n','/','.','h','e','i','m',],},}, addrlen: 110) = 0
      72.087 ( 0.013 ms): ssh/784442 connect(fd: 4, uservaddr: {1,{['/','v','a','r','/','r','u','n','/','.','h','e','i','m',],},}, addrlen: 110) = 0

  root@number:~#

When we used to have:

  root@number:~# perf trace -e connect ssh localhost
       0.000 ( 0.011 ms): ssh/786564 connect(fd: 3, uservaddr: { .family: INET, port: 22, addr: 127.0.0.1 }, addrlen: 16) = 0
       0.017 ( 0.006 ms): ssh/786564 connect(fd: 3, uservaddr: { .family: INET6, port: 22, addr: ::1 }, addrlen: 28) = 0
       0.263 ( 0.043 ms): ssh/786564 connect(fd: 3, uservaddr: { .family: INET6, port: 22, addr: ::1 }, addrlen: 28) = 0
      63.770 ( 0.044 ms): ssh/786564 connect(fd: 4, uservaddr: { .family: LOCAL, path: /var/run/.heim_org.h5l.kcm-socket }, addrlen: 110) = 0
      65.467 ( 0.042 ms): ssh/786564 connect(fd: 4, uservaddr: { .family: LOCAL, path: /var/run/.heim_org.h5l.kcm-socket }, addrlen: 110) = 0
  root@localhost's password:

That is closer to what strace produces:

  root@number:~# strace -e connect ssh localhost
  connect(3, {sa_family=AF_INET, sin_port=htons(22), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
  connect(3, {sa_family=AF_INET6, sin6_port=htons(22), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 0
  connect(3, {sa_family=AF_INET6, sin6_port=htons(22), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 0
  connect(4, {sa_family=AF_UNIX, sun_path="/var/run/.heim_org.h5l.kcm-socket"}, 110) = 0
  connect(4, {sa_family=AF_UNIX, sun_path="/var/run/.heim_org.h5l.kcm-socket"}, 110) = 0
  root@localhost's password:

Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240815013626.935097-10-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../bpf_skel/augmented_raw_syscalls.bpf.c     | 114 +++++++++++++++++-
 1 file changed, 113 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
index 0acbd74e8c76..f29a8dfca044 100644
--- a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
+++ b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
@@ -7,6 +7,8 @@
  */
 
 #include "vmlinux.h"
+#include "../trace_augment.h"
+
 #include <bpf/bpf_helpers.h>
 #include <linux/limits.h>
 
@@ -124,6 +126,25 @@ struct augmented_args_tmp {
 	__uint(max_entries, 1);
 } augmented_args_tmp SEC(".maps");
 
+struct beauty_payload_enter {
+	struct syscall_enter_args args;
+	struct augmented_arg aug_args[6];
+};
+
+struct beauty_map_enter {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__type(key, int);
+	__type(value, __u32[6]);
+	__uint(max_entries, 512);
+} beauty_map_enter SEC(".maps");
+
+struct beauty_payload_enter_map {
+	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
+	__type(key, int);
+	__type(value, struct beauty_payload_enter);
+	__uint(max_entries, 1);
+} beauty_payload_enter_map SEC(".maps");
+
 static inline struct augmented_args_payload *augmented_args_payload(void)
 {
 	int key = 0;
@@ -136,6 +157,11 @@ static inline int augmented__output(void *ctx, struct augmented_args_payload *ar
 	return bpf_perf_event_output(ctx, &__augmented_syscalls__, BPF_F_CURRENT_CPU, args, len);
 }
 
+static inline int augmented__beauty_output(void *ctx, void *data, int len)
+{
+	return bpf_perf_event_output(ctx, &__augmented_syscalls__, BPF_F_CURRENT_CPU, data, len);
+}
+
 static inline
 unsigned int augmented_arg__read_str(struct augmented_arg *augmented_arg, const void *arg, unsigned int arg_len)
 {
@@ -372,6 +398,91 @@ static bool pid_filter__has(struct pids_filtered *pids, pid_t pid)
 	return bpf_map_lookup_elem(pids, &pid) != NULL;
 }
 
+static int augment_sys_enter(void *ctx, struct syscall_enter_args *args)
+{
+	bool augmented, do_output = false;
+	int zero = 0, size, aug_size, index, output = 0,
+	    value_size = sizeof(struct augmented_arg) - offsetof(struct augmented_arg, value);
+	unsigned int nr, *beauty_map;
+	struct beauty_payload_enter *payload;
+	void *arg, *payload_offset;
+
+	/* fall back to do predefined tail call */
+	if (args == NULL)
+		return 1;
+
+	/* use syscall number to get beauty_map entry */
+	nr             = (__u32)args->syscall_nr;
+	beauty_map     = bpf_map_lookup_elem(&beauty_map_enter, &nr);
+
+	/* set up payload for output */
+	payload        = bpf_map_lookup_elem(&beauty_payload_enter_map, &zero);
+	payload_offset = (void *)&payload->aug_args;
+
+	if (beauty_map == NULL || payload == NULL)
+		return 1;
+
+	/* copy the sys_enter header, which has the syscall_nr */
+	__builtin_memcpy(&payload->args, args, sizeof(struct syscall_enter_args));
+
+	/*
+	 * Determine what type of argument and how many bytes to read from user space, using the
+	 * value in the beauty_map. This is the relation of parameter type and its corresponding
+	 * value in the beauty map, and how many bytes we read eventually:
+	 *
+	 * string: 1			      -> size of string
+	 * struct: size of struct	      -> size of struct
+	 * buffer: -1 * (index of paired len) -> value of paired len (maximum: TRACE_AUG_MAX_BUF)
+	 */
+	for (int i = 0; i < 6; i++) {
+		arg = (void *)args->args[i];
+		augmented = false;
+		size = beauty_map[i];
+		aug_size = size; /* size of the augmented data read from user space */
+
+		if (size == 0 || arg == NULL)
+			continue;
+
+		if (size == 1) { /* string */
+			aug_size = bpf_probe_read_user_str(((struct augmented_arg *)payload_offset)->value, value_size, arg);
+			/* minimum of 0 to pass the verifier */
+			if (aug_size < 0)
+				aug_size = 0;
+
+			augmented = true;
+		} else if (size > 0 && size <= value_size) { /* struct */
+			if (!bpf_probe_read_user(((struct augmented_arg *)payload_offset)->value, size, arg))
+				augmented = true;
+		} else if (size < 0 && size >= -6) { /* buffer */
+			index = -(size + 1);
+			aug_size = args->args[index];
+
+			if (aug_size > TRACE_AUG_MAX_BUF)
+				aug_size = TRACE_AUG_MAX_BUF;
+
+			if (aug_size > 0) {
+				if (!bpf_probe_read_user(((struct augmented_arg *)payload_offset)->value, aug_size, arg))
+					augmented = true;
+			}
+		}
+
+		/* write data to payload */
+		if (augmented) {
+			int written = offsetof(struct augmented_arg, value) + aug_size;
+
+			((struct augmented_arg *)payload_offset)->size = aug_size;
+			output += written;
+			payload_offset += written;
+			do_output = true;
+		}
+	}
+
+	if (!do_output)
+		return 1;
+
+	return augmented__beauty_output(ctx, payload, sizeof(struct syscall_enter_args) + output);
+}
+
 SEC("tp/raw_syscalls/sys_enter")
 int sys_enter(struct syscall_enter_args *args)
 {
@@ -400,7 +511,8 @@ int sys_enter(struct syscall_enter_args *args)
 	 * "!raw_syscalls:unaugmented" that will just return 1 to return the
 	 * unaugmented tracepoint payload.
 	 */
-	bpf_tail_call(args, &syscalls_sys_enter, augmented_args->args.syscall_nr);
+	if (augment_sys_enter(args, &augmented_args->args))
+		bpf_tail_call(args, &syscalls_sys_enter, augmented_args->args.syscall_nr);
 
 	// If not found on the PROG_ARRAY syscalls map, then we're filtering it:
 	return 0;
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 7/8] perf trace: Add --force-btf for debugging
  2024-08-24 16:33 [PATCH v3 0/8] perf trace: Enhanced augmentation for pointer arguments Howard Chu
                   ` (5 preceding siblings ...)
  2024-08-24 16:33 ` [PATCH v3 6/8] perf trace: Collect augmented data using BPF Howard Chu
@ 2024-08-24 16:33 ` Howard Chu
  2024-08-24 16:33 ` [PATCH v3 8/8] perf trace: Add general tests for augmented syscalls Howard Chu
  7 siblings, 0 replies; 27+ messages in thread
From: Howard Chu @ 2024-08-24 16:33 UTC (permalink / raw)
  To: acme
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

If --force-btf is enabled, prefer btf_dump general pretty printer to
perf trace's customized pretty printers.

Mostly for debug purposes.

Signed-off-by: Howard Chu <howardchu95@gmail.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 470d74e3f875..115f8dffb272 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -205,6 +205,7 @@ struct trace {
 	bool			show_string_prefix;
 	bool			force;
 	bool			vfs_getname;
+	bool			force_btf;
 	int			trace_pgfaults;
 	char			*perfconfig_events;
 	struct {
@@ -2349,7 +2350,9 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
 
 			default_scnprintf = sc->arg_fmt[arg.idx].scnprintf;
 
-			if (default_scnprintf == NULL || default_scnprintf == SCA_PTR) {
+			if (trace->force_btf ||
+			    (default_scnprintf == NULL ||
+			     (default_scnprintf == SCA_PTR && strstr(field->type, "struct")))) {
 				btf_printed = trace__btf_scnprintf(trace, &arg, bf + printed,
 								   size - printed, val, field->type);
 				if (btf_printed) {
@@ -5153,6 +5156,8 @@ int cmd_trace(int argc, const char **argv)
 	OPT_INTEGER('D', "delay", &trace.opts.target.initial_delay,
 		     "ms to wait before starting measurement after program "
 		     "start"),
+	OPT_BOOLEAN(0, "force-btf", &trace.force_btf, "Prefer btf_dump general pretty printer"
+		       "to customized ones"),
 	OPTS_EVSWITCH(&trace.evswitch),
 	OPT_END()
 	};
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 8/8] perf trace: Add general tests for augmented syscalls
  2024-08-24 16:33 [PATCH v3 0/8] perf trace: Enhanced augmentation for pointer arguments Howard Chu
                   ` (6 preceding siblings ...)
  2024-08-24 16:33 ` [PATCH v3 7/8] perf trace: Add --force-btf for debugging Howard Chu
@ 2024-08-24 16:33 ` Howard Chu
  2024-09-09 22:11   ` Arnaldo Carvalho de Melo
  7 siblings, 1 reply; 27+ messages in thread
From: Howard Chu @ 2024-08-24 16:33 UTC (permalink / raw)
  To: acme
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel

In this test, augmentation for:

* string
* buffer
* struct

Is tested.

Please use this command:

perf test "perf trace general tests"

Signed-off-by: Howard Chu <howardchu95@gmail.com>
Suggested-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/shell/trace_btf_general.sh | 67 +++++++++++++++++++++
 1 file changed, 67 insertions(+)
 create mode 100755 tools/perf/tests/shell/trace_btf_general.sh

diff --git a/tools/perf/tests/shell/trace_btf_general.sh b/tools/perf/tests/shell/trace_btf_general.sh
new file mode 100755
index 000000000000..0b4ea8462390
--- /dev/null
+++ b/tools/perf/tests/shell/trace_btf_general.sh
@@ -0,0 +1,67 @@
+#!/bin/sh
+# perf trace general tests
+# SPDX-License-Identifier: GPL-2.0
+
+err=0
+set -e
+
+. "$(dirname $0)"/lib/probe.sh
+skip_if_no_perf_trace || exit 2
+
+file1=$(mktemp /tmp/file1_XXXXX)
+file2=$(echo $file1 | sed 's/file1/file2/g')
+
+buffer="this is a buffer for testing"
+
+trap cleanup EXIT TERM INT HUP
+
+trace_test_string() {
+  echo "Testing perf trace's string augmentation"
+  if ! perf trace -e renameat* --max-events=1 -- mv ${file1} ${file2} 2>&1 | grep -q -E "renameat[2]*.*oldname: \"${file1}\".*newname: \"${file2}\".*"
+  then
+    echo "String augmentation test failed"
+    err=1
+  fi
+}
+
+trace_test_buffer() {
+  echo "Testing perf trace's buffer augmentation"
+  if ! perf trace -e write --max-events=1 -- echo "${buffer}" 2>&1 | grep -q -E ".*write.*buf: ${buffer}.*"
+  then
+    echo "Buffer augmentation test failed"
+    err=1
+  fi
+}
+
+trace_test_struct_btf() {
+  echo "Testing perf trace's struct augmentation"
+  if ! perf trace -e clock_nanosleep --force-btf --max-events=1 -- sleep 1 2>&1 | grep -q -E ".*clock_nanosleep\(rqtp: \{1,\}, rmtp: \{1,\}\).* = 0"
+  then
+    echo "BTF struct augmentation test failed"
+    err=1
+  fi
+}
+
+cleanup() {
+	rm -rf ${file1} ${file2}
+}
+
+trap_cleanup() {
+	echo "Unexpected signal in ${FUNCNAME[1]}"
+	cleanup
+	exit 1
+}
+
+trace_test_string
+
+if [ $err = 0 ]; then
+  trace_test_buffer
+fi
+
+if [ $err = 0 ]; then
+  trace_test_struct_btf
+fi
+
+cleanup
+
+exit $err
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 4/8] perf trace: Pretty print struct data
  2024-08-24 16:33 ` [PATCH v3 4/8] perf trace: Pretty print struct data Howard Chu
@ 2024-08-28 21:04   ` Arnaldo Carvalho de Melo
  2024-08-30  0:16     ` Howard Chu
  0 siblings, 1 reply; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-08-28 21:04 UTC (permalink / raw)
  To: Howard Chu
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

On Sun, Aug 25, 2024 at 12:33:18AM +0800, Howard Chu wrote:
> Change the arg->augmented.args to arg->augmented.args->value to skip the
> header for customized pretty printers, since we collect data in BPF
> using the general augment_sys_enter(), which always adds the header.
> 
> Use btf_dump API to pretty print augmented struct pointer.
> 
> Prefer existed pretty-printer than btf general pretty-printer.
> 
> set compact = true and skip_names = true, so that no newline character
> and argument name are printed.
> 
> Committer notes:
> 
> Simplified the btf_dump_snprintf callback to avoid using multiple
> buffers, as discussed in the thread accessible via the Link tag below.
> 
> Signed-off-by: Howard Chu <howardchu95@gmail.com>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Ian Rogers <irogers@google.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kan Liang <kan.liang@linux.intel.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Link: https://lore.kernel.org/r/20240815013626.935097-7-howardchu95@gmail.com
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> ---
>  tools/perf/builtin-trace.c                | 65 +++++++++++++++++++++--
>  tools/perf/trace/beauty/perf_event_open.c |  2 +-
>  tools/perf/trace/beauty/sockaddr.c        |  2 +-
>  tools/perf/trace/beauty/timespec.c        |  2 +-
>  4 files changed, 63 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index 43b1f63415b4..048bcb92624c 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -990,6 +990,54 @@ static size_t btf_enum_scnprintf(const struct btf_type *type, struct btf *btf, c
>  	return 0;
>  }
>  
> +struct trace_btf_dump_snprintf_ctx {
> +	char   *bf;
> +	size_t printed, size;
> +};
> +
> +static void trace__btf_dump_snprintf(void *vctx, const char *fmt, va_list args)
> +{
> +	struct trace_btf_dump_snprintf_ctx *ctx = vctx;
> +
> +	ctx->printed += vscnprintf(ctx->bf + ctx->printed, ctx->size - ctx->printed, fmt, args);
> +}
> +
> +static size_t btf_struct_scnprintf(const struct btf_type *type, struct btf *btf, char *bf, size_t size, struct syscall_arg *arg)
> +{
> +	struct trace_btf_dump_snprintf_ctx ctx = {
> +		.bf   = bf,
> +		.size = size,
> +	};
> +	struct augmented_arg *augmented_arg = arg->augmented.args;
> +	int type_id = arg->fmt->type_id, consumed;
> +	struct btf_dump *btf_dump;
> +
> +	LIBBPF_OPTS(btf_dump_opts, dump_opts);
> +	LIBBPF_OPTS(btf_dump_type_data_opts, dump_data_opts);
> +
> +	if (arg == NULL || arg->augmented.args == NULL)
> +		return 0;
> +
> +	dump_data_opts.compact     = true;
> +	dump_data_opts.skip_names  = true;
> +
> +	btf_dump = btf_dump__new(btf, trace__btf_dump_snprintf, &ctx, &dump_opts);
> +	if (btf_dump == NULL)
> +		return 0;
> +
> +	/* pretty print the struct data here */
> +	if (btf_dump__dump_type_data(btf_dump, type_id, arg->augmented.args->value, type->size, &dump_data_opts) == 0)
> +		return 0;
> +
> +	consumed = sizeof(*augmented_arg) + augmented_arg->size;
> +	arg->augmented.args = ((void *)arg->augmented.args) + consumed;
> +	arg->augmented.size -= consumed;
> +
> +	btf_dump__free(btf_dump);
> +
> +	return ctx.printed;
> +}
> +
>  static size_t trace__btf_scnprintf(struct trace *trace, struct syscall_arg *arg, char *bf,
>  				   size_t size, int val, char *type)
>  {
> @@ -1009,6 +1057,8 @@ static size_t trace__btf_scnprintf(struct trace *trace, struct syscall_arg *arg,
>  
>  	if (btf_is_enum(arg_fmt->type))
>  		return btf_enum_scnprintf(arg_fmt->type, trace->btf, bf, size, val);
> +	else if (btf_is_struct(arg_fmt->type))
> +		return btf_struct_scnprintf(arg_fmt->type, trace->btf, bf, size, arg);
>  
>  	return 0;
>  }
> @@ -2222,6 +2272,7 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
>  		.show_string_prefix = trace->show_string_prefix,
>  	};
>  	struct thread_trace *ttrace = thread__priv(thread);
> +	void *default_scnprintf;
>  
>  	/*
>  	 * Things like fcntl will set this in its 'cmd' formatter to pick the
> @@ -2263,11 +2314,15 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
>  			if (trace->show_arg_names)
>  				printed += scnprintf(bf + printed, size - printed, "%s: ", field->name);
>  
> -			btf_printed = trace__btf_scnprintf(trace, &arg, bf + printed,
> -							   size - printed, val, field->type);
> -			if (btf_printed) {
> -				printed += btf_printed;
> -				continue;
> +			default_scnprintf = sc->arg_fmt[arg.idx].scnprintf;
> +
> +			if (default_scnprintf == NULL || default_scnprintf == SCA_PTR) {
> +				btf_printed = trace__btf_scnprintf(trace, &arg, bf + printed,
> +								   size - printed, val, field->type);
> +				if (btf_printed) {
> +					printed += btf_printed;
> +					continue;
> +				}

Ok, we agreed on this one, and you noted that that in this cset comment,
good.

Next time make a note after the cset commit log message and before the
actual patch, something like:

vN: prefer pre-existing userspace scnprintf if explicitely specified,
only fallbacking to the generic BTF one when none is specified.

>  			}
>  
>  			printed += syscall_arg_fmt__scnprintf_val(&sc->arg_fmt[arg.idx],
> diff --git a/tools/perf/trace/beauty/perf_event_open.c b/tools/perf/trace/beauty/perf_event_open.c
> index 01ee15fe9d0c..632237128640 100644
> --- a/tools/perf/trace/beauty/perf_event_open.c
> +++ b/tools/perf/trace/beauty/perf_event_open.c
> @@ -76,7 +76,7 @@ static size_t perf_event_attr___scnprintf(struct perf_event_attr *attr, char *bf

But this part will work if we use the old collectors in the BPF skel?

I.e. isn't this a change in the protocol of the BPF colector with the
userpace augmented snprintf routines?

If I remember we discussed that first you make this change in the
protocol, test it with the pre-existing BPF collector, it works. Ok, now
we have a new protocol and we then use it in the generic BTF-based BPF
collector. This way that option of using the BTF-based collector or the
preexisting BPF collector works.

I'll try to do this.

- Arnaldo
  
>  static size_t syscall_arg__scnprintf_augmented_perf_event_attr(struct syscall_arg *arg, char *bf, size_t size)
>  {
> -	return perf_event_attr___scnprintf((void *)arg->augmented.args, bf, size, arg->trace->show_zeros);
> +	return perf_event_attr___scnprintf((void *)arg->augmented.args->value, bf, size, arg->trace->show_zeros);
>  }
>  
>  static size_t syscall_arg__scnprintf_perf_event_attr(char *bf, size_t size, struct syscall_arg *arg)
> diff --git a/tools/perf/trace/beauty/sockaddr.c b/tools/perf/trace/beauty/sockaddr.c
> index 2e0e867c0c1b..6ecebf776899 100644
> --- a/tools/perf/trace/beauty/sockaddr.c
> +++ b/tools/perf/trace/beauty/sockaddr.c
> @@ -47,7 +47,7 @@ static size_t (*af_scnprintfs[])(struct sockaddr *sa, char *bf, size_t size) = {
>  
>  static size_t syscall_arg__scnprintf_augmented_sockaddr(struct syscall_arg *arg, char *bf, size_t size)
>  {
> -	struct sockaddr *sa = (struct sockaddr *)arg->augmented.args;
> +	struct sockaddr *sa = (struct sockaddr *)arg->augmented.args->value;
>  	char family[32];
>  	size_t printed;
>  
> diff --git a/tools/perf/trace/beauty/timespec.c b/tools/perf/trace/beauty/timespec.c
> index e1a61f092aad..b14ab72a2738 100644
> --- a/tools/perf/trace/beauty/timespec.c
> +++ b/tools/perf/trace/beauty/timespec.c
> @@ -7,7 +7,7 @@
>  
>  static size_t syscall_arg__scnprintf_augmented_timespec(struct syscall_arg *arg, char *bf, size_t size)
>  {
> -	struct timespec *ts = (struct timespec *)arg->augmented.args;
> +	struct timespec *ts = (struct timespec *)arg->augmented.args->value;
>  
>  	return scnprintf(bf, size, "{ .tv_sec: %" PRIu64 ", .tv_nsec: %" PRIu64 " }", ts->tv_sec, ts->tv_nsec);
>  }
> -- 
> 2.45.2

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 4/8] perf trace: Pretty print struct data
  2024-08-28 21:04   ` Arnaldo Carvalho de Melo
@ 2024-08-30  0:16     ` Howard Chu
  2024-09-09 14:40       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 27+ messages in thread
From: Howard Chu @ 2024-08-30  0:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

Hello Arnaldo,

On Thu, Aug 29, 2024 at 5:05 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Sun, Aug 25, 2024 at 12:33:18AM +0800, Howard Chu wrote:
> > Change the arg->augmented.args to arg->augmented.args->value to skip the
> > header for customized pretty printers, since we collect data in BPF
> > using the general augment_sys_enter(), which always adds the header.
> >
> > Use btf_dump API to pretty print augmented struct pointer.
> >
> > Prefer existed pretty-printer than btf general pretty-printer.
> >
> > set compact = true and skip_names = true, so that no newline character
> > and argument name are printed.
> >
> > Committer notes:
> >
> > Simplified the btf_dump_snprintf callback to avoid using multiple
> > buffers, as discussed in the thread accessible via the Link tag below.
> >
> > Signed-off-by: Howard Chu <howardchu95@gmail.com>
> > Cc: Adrian Hunter <adrian.hunter@intel.com>
> > Cc: Ian Rogers <irogers@google.com>
> > Cc: Jiri Olsa <jolsa@kernel.org>
> > Cc: Kan Liang <kan.liang@linux.intel.com>
> > Cc: Namhyung Kim <namhyung@kernel.org>
> > Link: https://lore.kernel.org/r/20240815013626.935097-7-howardchu95@gmail.com
> > Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > ---
> >  tools/perf/builtin-trace.c                | 65 +++++++++++++++++++++--
> >  tools/perf/trace/beauty/perf_event_open.c |  2 +-
> >  tools/perf/trace/beauty/sockaddr.c        |  2 +-
> >  tools/perf/trace/beauty/timespec.c        |  2 +-
> >  4 files changed, 63 insertions(+), 8 deletions(-)
> >
> > diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> > index 43b1f63415b4..048bcb92624c 100644
> > --- a/tools/perf/builtin-trace.c
> > +++ b/tools/perf/builtin-trace.c
> > @@ -990,6 +990,54 @@ static size_t btf_enum_scnprintf(const struct btf_type *type, struct btf *btf, c
> >       return 0;
> >  }
> >
> > +struct trace_btf_dump_snprintf_ctx {
> > +     char   *bf;
> > +     size_t printed, size;
> > +};
> > +
> > +static void trace__btf_dump_snprintf(void *vctx, const char *fmt, va_list args)
> > +{
> > +     struct trace_btf_dump_snprintf_ctx *ctx = vctx;
> > +
> > +     ctx->printed += vscnprintf(ctx->bf + ctx->printed, ctx->size - ctx->printed, fmt, args);
> > +}
> > +
> > +static size_t btf_struct_scnprintf(const struct btf_type *type, struct btf *btf, char *bf, size_t size, struct syscall_arg *arg)
> > +{
> > +     struct trace_btf_dump_snprintf_ctx ctx = {
> > +             .bf   = bf,
> > +             .size = size,
> > +     };
> > +     struct augmented_arg *augmented_arg = arg->augmented.args;
> > +     int type_id = arg->fmt->type_id, consumed;
> > +     struct btf_dump *btf_dump;
> > +
> > +     LIBBPF_OPTS(btf_dump_opts, dump_opts);
> > +     LIBBPF_OPTS(btf_dump_type_data_opts, dump_data_opts);
> > +
> > +     if (arg == NULL || arg->augmented.args == NULL)
> > +             return 0;
> > +
> > +     dump_data_opts.compact     = true;
> > +     dump_data_opts.skip_names  = true;
> > +
> > +     btf_dump = btf_dump__new(btf, trace__btf_dump_snprintf, &ctx, &dump_opts);
> > +     if (btf_dump == NULL)
> > +             return 0;
> > +
> > +     /* pretty print the struct data here */
> > +     if (btf_dump__dump_type_data(btf_dump, type_id, arg->augmented.args->value, type->size, &dump_data_opts) == 0)
> > +             return 0;
> > +
> > +     consumed = sizeof(*augmented_arg) + augmented_arg->size;
> > +     arg->augmented.args = ((void *)arg->augmented.args) + consumed;
> > +     arg->augmented.size -= consumed;
> > +
> > +     btf_dump__free(btf_dump);
> > +
> > +     return ctx.printed;
> > +}
> > +
> >  static size_t trace__btf_scnprintf(struct trace *trace, struct syscall_arg *arg, char *bf,
> >                                  size_t size, int val, char *type)
> >  {
> > @@ -1009,6 +1057,8 @@ static size_t trace__btf_scnprintf(struct trace *trace, struct syscall_arg *arg,
> >
> >       if (btf_is_enum(arg_fmt->type))
> >               return btf_enum_scnprintf(arg_fmt->type, trace->btf, bf, size, val);
> > +     else if (btf_is_struct(arg_fmt->type))
> > +             return btf_struct_scnprintf(arg_fmt->type, trace->btf, bf, size, arg);
> >
> >       return 0;
> >  }
> > @@ -2222,6 +2272,7 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
> >               .show_string_prefix = trace->show_string_prefix,
> >       };
> >       struct thread_trace *ttrace = thread__priv(thread);
> > +     void *default_scnprintf;
> >
> >       /*
> >        * Things like fcntl will set this in its 'cmd' formatter to pick the
> > @@ -2263,11 +2314,15 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
> >                       if (trace->show_arg_names)
> >                               printed += scnprintf(bf + printed, size - printed, "%s: ", field->name);
> >
> > -                     btf_printed = trace__btf_scnprintf(trace, &arg, bf + printed,
> > -                                                        size - printed, val, field->type);
> > -                     if (btf_printed) {
> > -                             printed += btf_printed;
> > -                             continue;
> > +                     default_scnprintf = sc->arg_fmt[arg.idx].scnprintf;
> > +
> > +                     if (default_scnprintf == NULL || default_scnprintf == SCA_PTR) {
> > +                             btf_printed = trace__btf_scnprintf(trace, &arg, bf + printed,
> > +                                                                size - printed, val, field->type);
> > +                             if (btf_printed) {
> > +                                     printed += btf_printed;
> > +                                     continue;
> > +                             }
>
> Ok, we agreed on this one, and you noted that that in this cset comment,
> good.
>
> Next time make a note after the cset commit log message and before the
> actual patch, something like:
>
> vN: prefer pre-existing userspace scnprintf if explicitely specified,
> only fallbacking to the generic BTF one when none is specified.
>
> >                       }
> >
> >                       printed += syscall_arg_fmt__scnprintf_val(&sc->arg_fmt[arg.idx],
> > diff --git a/tools/perf/trace/beauty/perf_event_open.c b/tools/perf/trace/beauty/perf_event_open.c
> > index 01ee15fe9d0c..632237128640 100644
> > --- a/tools/perf/trace/beauty/perf_event_open.c
> > +++ b/tools/perf/trace/beauty/perf_event_open.c
> > @@ -76,7 +76,7 @@ static size_t perf_event_attr___scnprintf(struct perf_event_attr *attr, char *bf
>
> But this part will work if we use the old collectors in the BPF skel?
>

You are right, it won't. The one scenario is that when a vmlinux file
is presented, the general collector (the new one, based on BTF) will
fallback to the old collectors, and now we have new
consumers(ptr->value). This will break.

> I.e. isn't this a change in the protocol of the BPF colector with the
> userpace augmented snprintf routines?
>
> If I remember we discussed that first you make this change in the
> protocol, test it with the pre-existing BPF collector, it works. Ok, now
> we have a new protocol and we then use it in the generic BTF-based BPF
> collector. This way that option of using the BTF-based collector or the
> preexisting BPF collector works.
>
> I'll try to do this.

Thank you so much.

Thanks,
Howard
>
> - Arnaldo
>
> >  static size_t syscall_arg__scnprintf_augmented_perf_event_attr(struct syscall_arg *arg, char *bf, size_t size)
> >  {
> > -     return perf_event_attr___scnprintf((void *)arg->augmented.args, bf, size, arg->trace->show_zeros);
> > +     return perf_event_attr___scnprintf((void *)arg->augmented.args->value, bf, size, arg->trace->show_zeros);
> >  }
> >
> >  static size_t syscall_arg__scnprintf_perf_event_attr(char *bf, size_t size, struct syscall_arg *arg)
> > diff --git a/tools/perf/trace/beauty/sockaddr.c b/tools/perf/trace/beauty/sockaddr.c
> > index 2e0e867c0c1b..6ecebf776899 100644
> > --- a/tools/perf/trace/beauty/sockaddr.c
> > +++ b/tools/perf/trace/beauty/sockaddr.c
> > @@ -47,7 +47,7 @@ static size_t (*af_scnprintfs[])(struct sockaddr *sa, char *bf, size_t size) = {
> >
> >  static size_t syscall_arg__scnprintf_augmented_sockaddr(struct syscall_arg *arg, char *bf, size_t size)
> >  {
> > -     struct sockaddr *sa = (struct sockaddr *)arg->augmented.args;
> > +     struct sockaddr *sa = (struct sockaddr *)arg->augmented.args->value;
> >       char family[32];
> >       size_t printed;
> >
> > diff --git a/tools/perf/trace/beauty/timespec.c b/tools/perf/trace/beauty/timespec.c
> > index e1a61f092aad..b14ab72a2738 100644
> > --- a/tools/perf/trace/beauty/timespec.c
> > +++ b/tools/perf/trace/beauty/timespec.c
> > @@ -7,7 +7,7 @@
> >
> >  static size_t syscall_arg__scnprintf_augmented_timespec(struct syscall_arg *arg, char *bf, size_t size)
> >  {
> > -     struct timespec *ts = (struct timespec *)arg->augmented.args;
> > +     struct timespec *ts = (struct timespec *)arg->augmented.args->value;
> >
> >       return scnprintf(bf, size, "{ .tv_sec: %" PRIu64 ", .tv_nsec: %" PRIu64 " }", ts->tv_sec, ts->tv_nsec);
> >  }
> > --
> > 2.45.2

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 6/8] perf trace: Collect augmented data using BPF
  2024-08-24 16:33 ` [PATCH v3 6/8] perf trace: Collect augmented data using BPF Howard Chu
@ 2024-09-04 19:52   ` Arnaldo Carvalho de Melo
  2024-09-04 21:11     ` Howard Chu
  0 siblings, 1 reply; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-09-04 19:52 UTC (permalink / raw)
  To: Howard Chu
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

On Sun, Aug 25, 2024 at 12:33:20AM +0800, Howard Chu wrote:
> Include trace_augment.h for TRACE_AUG_MAX_BUF, so that BPF reads
> TRACE_AUG_MAX_BUF bytes of buffer maximum.
> 
> Determine what type of argument and how many bytes to read from user space, us ing the
> value in the beauty_map. This is the relation of parameter type and its corres ponding
> value in the beauty map, and how many bytes we read eventually:
> 
> string: 1                          -> size of string (till null)
> struct: size of struct             -> size of struct
> buffer: -1 * (index of paired len) -> value of paired len (maximum: TRACE_AUG_ MAX_BUF)
> 
> After reading from user space, we output the augmented data using
> bpf_perf_event_output().
> 
> If the struct augmenter, augment_sys_enter() failed, we fall back to
> using bpf_tail_call().
> 
> I have to make the payload 6 times the size of augmented_arg, to pass the
> BPF verifier.
> 
> Committer notes:
> 
> It works, but we need to wire it up to the userspace specialized pretty
> printer, otherwise we get things like:
> 
>   root@number:~# perf trace -e connect ssh localhost
>        0.000 ( 0.010 ms): :784442/784442 connect(fd: 3, uservaddr: {2,}, addrlen: 16)                          = 0
>        0.016 ( 0.006 ms): :784442/784442 connect(fd: 3, uservaddr: {10,}, addrlen: 28)                         = 0
>        0.033 ( 0.096 ms): :784442/784442 connect(fd: 3, uservaddr: {10,}, addrlen: 28)                         = 0
>   root@localhost's password:     71.292 ( 0.037 ms): ssh/784442 connect(fd: 4, uservaddr: {1,{['/','v','a','r','/','r','u','n','/','.','h','e','i','m',],},}, addrlen: 110) = 0
>       72.087 ( 0.013 ms): ssh/784442 connect(fd: 4, uservaddr: {1,{['/','v','a','r','/','r','u','n','/','.','h','e','i','m',],},}, addrlen: 110) = 0
> 
>   root@number:~#
> 
> When we used to have:
> 
>   root@number:~# perf trace -e connect ssh localhost
>        0.000 ( 0.011 ms): ssh/786564 connect(fd: 3, uservaddr: { .family: INET, port: 22, addr: 127.0.0.1 }, addrlen: 16) = 0
>        0.017 ( 0.006 ms): ssh/786564 connect(fd: 3, uservaddr: { .family: INET6, port: 22, addr: ::1 }, addrlen: 28) = 0
>        0.263 ( 0.043 ms): ssh/786564 connect(fd: 3, uservaddr: { .family: INET6, port: 22, addr: ::1 }, addrlen: 28) = 0
>       63.770 ( 0.044 ms): ssh/786564 connect(fd: 4, uservaddr: { .family: LOCAL, path: /var/run/.heim_org.h5l.kcm-socket }, addrlen: 110) = 0
>       65.467 ( 0.042 ms): ssh/786564 connect(fd: 4, uservaddr: { .family: LOCAL, path: /var/run/.heim_org.h5l.kcm-socket }, addrlen: 110) = 0
>   root@localhost's password:
> 
> That is closer to what strace produces:
> 
>   root@number:~# strace -e connect ssh localhost
>   connect(3, {sa_family=AF_INET, sin_port=htons(22), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
>   connect(3, {sa_family=AF_INET6, sin6_port=htons(22), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 0
>   connect(3, {sa_family=AF_INET6, sin6_port=htons(22), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 0
>   connect(4, {sa_family=AF_UNIX, sun_path="/var/run/.heim_org.h5l.kcm-socket"}, 110) = 0
>   connect(4, {sa_family=AF_UNIX, sun_path="/var/run/.heim_org.h5l.kcm-socket"}, 110) = 0
>   root@localhost's password:
> 
> Signed-off-by: Howard Chu <howardchu95@gmail.com>
> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Ian Rogers <irogers@google.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kan Liang <kan.liang@linux.intel.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Link: https://lore.kernel.org/r/20240815013626.935097-10-howardchu95@gmail.com
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> ---
>  .../bpf_skel/augmented_raw_syscalls.bpf.c     | 114 +++++++++++++++++-
>  1 file changed, 113 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> index 0acbd74e8c76..f29a8dfca044 100644
> --- a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> +++ b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> @@ -7,6 +7,8 @@
>   */
>  
>  #include "vmlinux.h"
> +#include "../trace_augment.h"
> +
>  #include <bpf/bpf_helpers.h>
>  #include <linux/limits.h>
>  
> @@ -124,6 +126,25 @@ struct augmented_args_tmp {
>  	__uint(max_entries, 1);
>  } augmented_args_tmp SEC(".maps");
>  
> +struct beauty_payload_enter {
> +	struct syscall_enter_args args;
> +	struct augmented_arg aug_args[6];
> +};

⬢[acme@toolbox perf-tools-next]$ pahole -C augmented_arg /tmp/build/perf-tools-next/util/bpf_skel/.tmp/augmented_raw_syscalls.bpf.o 
struct augmented_arg {
	unsigned int               size;                 /*     0     4 */
	int                        err;                  /*     4     4 */
	char                       value[4096];          /*     8  4096 */

	/* size: 4104, cachelines: 65, members: 3 */
	/* last cacheline: 8 bytes */
};

⬢[acme@toolbox perf-tools-next]$

⬢[acme@toolbox perf-tools-next]$ pahole -C syscall_enter_args /tmp/build/perf-tools-next/util/bpf_skel/.tmp/augmented_raw_syscalls.bpf.o 
struct syscall_enter_args {
	unsigned long long         common_tp_fields;     /*     0     8 */
	long                       syscall_nr;           /*     8     8 */
	unsigned long              args[6];              /*    16    48 */

	/* size: 64, cachelines: 1, members: 3 */
};

So the entry has the theoretical max limit for the augmented payload
which would be 6 * sizeof(struct augmented_arg) + the common header for
tracepoints (unaugmented), a lot of space, but...

> +static int augment_sys_enter(void *ctx, struct syscall_enter_args *args)
> +{
> +	bool augmented, do_output = false;
> +	int zero = 0, size, aug_size, index, output = 0,
> +	    value_size = sizeof(struct augmented_arg) - offsetof(struct augmented_arg, value);
> +	unsigned int nr, *beauty_map;
> +	struct beauty_payload_enter *payload;
> +	void *arg, *payload_offset;
> +
> +	/* fall back to do predefined tail call */
> +	if (args == NULL)
> +		return 1;
> +
> +	/* use syscall number to get beauty_map entry */
> +	nr             = (__u32)args->syscall_nr;
> +	beauty_map     = bpf_map_lookup_elem(&beauty_map_enter, &nr);

If we don't set up the entry for this syscall, then there are no
pointers to collect, we return early and the tracepoint will not filter
it and there it goes, up unmodified, if not already filtered by a
tracepoint filter, ok, I'll add these comments here.

> +	/* set up payload for output */
> +	payload        = bpf_map_lookup_elem(&beauty_payload_enter_map, &zero);
> +	payload_offset = (void *)&payload->aug_args;
> > +	if (beauty_map == NULL || payload == NULL)
> +		return 1;

We can save the lookup for payload if the one for beauty_map returned
NULL, I'll do this fixup.

> +	/* copy the sys_enter header, which has the syscall_nr */
> +	__builtin_memcpy(&payload->args, args, sizeof(struct syscall_enter_args));


> +	/*
> +	 * Determine what type of argument and how many bytes to read from user space, using the
> +	 * value in the beauty_map. This is the relation of parameter type and its corresponding
> +	 * value in the beauty map, and how many bytes we read eventually:
> +	 *
> +	 * string: 1			      -> size of string
> +	 * struct: size of struct	      -> size of struct
> +	 * buffer: -1 * (index of paired len) -> value of paired len (maximum: TRACE_AUG_MAX_BUF)

'paired_len'? Ok, 1, size of string or struct


Will continue the detailed analysis later, as I need to change the
existing BPF collector, before applying this, to match the protocol here
(that will always have the size of each argument encoded in the ring
buffer, easier, but uses more space).

- Arnaldo

> +	 */
> +	for (int i = 0; i < 6; i++) {
> +		arg = (void *)args->args[i];
> +		augmented = false;
> +		size = beauty_map[i];
> +		aug_size = size; /* size of the augmented data read from user space */
> +
> +		if (size == 0 || arg == NULL)
> +			continue;
> +
> +		if (size == 1) { /* string */
> +			aug_size = bpf_probe_read_user_str(((struct augmented_arg *)payload_offset)->value, value_size, arg);
> +			/* minimum of 0 to pass the verifier */
> +			if (aug_size < 0)
> +				aug_size = 0;
> +
> +			augmented = true;
> +		} else if (size > 0 && size <= value_size) { /* struct */
> +			if (!bpf_probe_read_user(((struct augmented_arg *)payload_offset)->value, size, arg))
> +				augmented = true;
> +		} else if (size < 0 && size >= -6) { /* buffer */
> +			index = -(size + 1);
> +			aug_size = args->args[index];
> +
> +			if (aug_size > TRACE_AUG_MAX_BUF)
> +				aug_size = TRACE_AUG_MAX_BUF;
> +
> +			if (aug_size > 0) {
> +				if (!bpf_probe_read_user(((struct augmented_arg *)payload_offset)->value, aug_size, arg))
> +					augmented = true;
> +			}
> +		}
> +
> +		/* write data to payload */
> +		if (augmented) {
> +			int written = offsetof(struct augmented_arg, value) + aug_size;
> +
> +			((struct augmented_arg *)payload_offset)->size = aug_size;
> +			output += written;
> +			payload_offset += written;
> +			do_output = true;
> +		}
> +	}
> +
> +	if (!do_output)
> +		return 1;
> +
> +	return augmented__beauty_output(ctx, payload, sizeof(struct syscall_enter_args) + output);

We end up pushing up to the ring buffer all the way to 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 6/8] perf trace: Collect augmented data using BPF
  2024-09-04 19:52   ` Arnaldo Carvalho de Melo
@ 2024-09-04 21:11     ` Howard Chu
  2024-09-11 14:23       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 27+ messages in thread
From: Howard Chu @ 2024-09-04 21:11 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

Hello Arnaldo,

On Wed, Sep 4, 2024 at 12:53 PM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Sun, Aug 25, 2024 at 12:33:20AM +0800, Howard Chu wrote:
> > Include trace_augment.h for TRACE_AUG_MAX_BUF, so that BPF reads
> > TRACE_AUG_MAX_BUF bytes of buffer maximum.
> >
> > Determine what type of argument and how many bytes to read from user space, us ing the
> > value in the beauty_map. This is the relation of parameter type and its corres ponding
> > value in the beauty map, and how many bytes we read eventually:
> >
> > string: 1                          -> size of string (till null)
> > struct: size of struct             -> size of struct
> > buffer: -1 * (index of paired len) -> value of paired len (maximum: TRACE_AUG_ MAX_BUF)
> >
> > After reading from user space, we output the augmented data using
> > bpf_perf_event_output().
> >
> > If the struct augmenter, augment_sys_enter() failed, we fall back to
> > using bpf_tail_call().
> >
> > I have to make the payload 6 times the size of augmented_arg, to pass the
> > BPF verifier.
> >
> > Committer notes:
> >
> > It works, but we need to wire it up to the userspace specialized pretty
> > printer, otherwise we get things like:
> >
> >   root@number:~# perf trace -e connect ssh localhost
> >        0.000 ( 0.010 ms): :784442/784442 connect(fd: 3, uservaddr: {2,}, addrlen: 16)                          = 0
> >        0.016 ( 0.006 ms): :784442/784442 connect(fd: 3, uservaddr: {10,}, addrlen: 28)                         = 0
> >        0.033 ( 0.096 ms): :784442/784442 connect(fd: 3, uservaddr: {10,}, addrlen: 28)                         = 0
> >   root@localhost's password:     71.292 ( 0.037 ms): ssh/784442 connect(fd: 4, uservaddr: {1,{['/','v','a','r','/','r','u','n','/','.','h','e','i','m',],},}, addrlen: 110) = 0
> >       72.087 ( 0.013 ms): ssh/784442 connect(fd: 4, uservaddr: {1,{['/','v','a','r','/','r','u','n','/','.','h','e','i','m',],},}, addrlen: 110) = 0
> >
> >   root@number:~#
> >
> > When we used to have:
> >
> >   root@number:~# perf trace -e connect ssh localhost
> >        0.000 ( 0.011 ms): ssh/786564 connect(fd: 3, uservaddr: { .family: INET, port: 22, addr: 127.0.0.1 }, addrlen: 16) = 0
> >        0.017 ( 0.006 ms): ssh/786564 connect(fd: 3, uservaddr: { .family: INET6, port: 22, addr: ::1 }, addrlen: 28) = 0
> >        0.263 ( 0.043 ms): ssh/786564 connect(fd: 3, uservaddr: { .family: INET6, port: 22, addr: ::1 }, addrlen: 28) = 0
> >       63.770 ( 0.044 ms): ssh/786564 connect(fd: 4, uservaddr: { .family: LOCAL, path: /var/run/.heim_org.h5l.kcm-socket }, addrlen: 110) = 0
> >       65.467 ( 0.042 ms): ssh/786564 connect(fd: 4, uservaddr: { .family: LOCAL, path: /var/run/.heim_org.h5l.kcm-socket }, addrlen: 110) = 0
> >   root@localhost's password:
> >
> > That is closer to what strace produces:
> >
> >   root@number:~# strace -e connect ssh localhost
> >   connect(3, {sa_family=AF_INET, sin_port=htons(22), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
> >   connect(3, {sa_family=AF_INET6, sin6_port=htons(22), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 0
> >   connect(3, {sa_family=AF_INET6, sin6_port=htons(22), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 0
> >   connect(4, {sa_family=AF_UNIX, sun_path="/var/run/.heim_org.h5l.kcm-socket"}, 110) = 0
> >   connect(4, {sa_family=AF_UNIX, sun_path="/var/run/.heim_org.h5l.kcm-socket"}, 110) = 0
> >   root@localhost's password:
> >
> > Signed-off-by: Howard Chu <howardchu95@gmail.com>
> > Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Cc: Adrian Hunter <adrian.hunter@intel.com>
> > Cc: Ian Rogers <irogers@google.com>
> > Cc: Jiri Olsa <jolsa@kernel.org>
> > Cc: Kan Liang <kan.liang@linux.intel.com>
> > Cc: Namhyung Kim <namhyung@kernel.org>
> > Link: https://lore.kernel.org/r/20240815013626.935097-10-howardchu95@gmail.com
> > Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > ---
> >  .../bpf_skel/augmented_raw_syscalls.bpf.c     | 114 +++++++++++++++++-
> >  1 file changed, 113 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > index 0acbd74e8c76..f29a8dfca044 100644
> > --- a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > +++ b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > @@ -7,6 +7,8 @@
> >   */
> >
> >  #include "vmlinux.h"
> > +#include "../trace_augment.h"
> > +
> >  #include <bpf/bpf_helpers.h>
> >  #include <linux/limits.h>
> >
> > @@ -124,6 +126,25 @@ struct augmented_args_tmp {
> >       __uint(max_entries, 1);
> >  } augmented_args_tmp SEC(".maps");
> >
> > +struct beauty_payload_enter {
> > +     struct syscall_enter_args args;
> > +     struct augmented_arg aug_args[6];
> > +};
>
> ⬢[acme@toolbox perf-tools-next]$ pahole -C augmented_arg /tmp/build/perf-tools-next/util/bpf_skel/.tmp/augmented_raw_syscalls.bpf.o
> struct augmented_arg {
>         unsigned int               size;                 /*     0     4 */
>         int                        err;                  /*     4     4 */
>         char                       value[4096];          /*     8  4096 */
>
>         /* size: 4104, cachelines: 65, members: 3 */
>         /* last cacheline: 8 bytes */
> };
>
> ⬢[acme@toolbox perf-tools-next]$
>
> ⬢[acme@toolbox perf-tools-next]$ pahole -C syscall_enter_args /tmp/build/perf-tools-next/util/bpf_skel/.tmp/augmented_raw_syscalls.bpf.o
> struct syscall_enter_args {
>         unsigned long long         common_tp_fields;     /*     0     8 */
>         long                       syscall_nr;           /*     8     8 */
>         unsigned long              args[6];              /*    16    48 */
>
>         /* size: 64, cachelines: 1, members: 3 */
> };
>
> So the entry has the theoretical max limit for the augmented payload
> which would be 6 * sizeof(struct augmented_arg) + the common header for
> tracepoints (unaugmented), a lot of space, but...

Yes, if I don't use this much space, and try to do a flexible payload
length, I won't be able to pass the BPF verifier. I will try to fix
this wasting problem.

>
> > +static int augment_sys_enter(void *ctx, struct syscall_enter_args *args)
> > +{
> > +     bool augmented, do_output = false;
> > +     int zero = 0, size, aug_size, index, output = 0,
> > +         value_size = sizeof(struct augmented_arg) - offsetof(struct augmented_arg, value);
> > +     unsigned int nr, *beauty_map;
> > +     struct beauty_payload_enter *payload;
> > +     void *arg, *payload_offset;
> > +
> > +     /* fall back to do predefined tail call */
> > +     if (args == NULL)
> > +             return 1;
> > +
> > +     /* use syscall number to get beauty_map entry */
> > +     nr             = (__u32)args->syscall_nr;
> > +     beauty_map     = bpf_map_lookup_elem(&beauty_map_enter, &nr);
>
> If we don't set up the entry for this syscall, then there are no
> pointers to collect, we return early and the tracepoint will not filter
> it and there it goes, up unmodified, if not already filtered by a
> tracepoint filter, ok, I'll add these comments here.

Thank you.

>
> > +     /* set up payload for output */
> > +     payload        = bpf_map_lookup_elem(&beauty_payload_enter_map, &zero);
> > +     payload_offset = (void *)&payload->aug_args;
> > > +   if (beauty_map == NULL || payload == NULL)
> > +             return 1;
>
> We can save the lookup for payload if the one for beauty_map returned
> NULL, I'll do this fixup.

Thanks that makes sense.

>
> > +     /* copy the sys_enter header, which has the syscall_nr */
> > +     __builtin_memcpy(&payload->args, args, sizeof(struct syscall_enter_args));
>
>
> > +     /*
> > +      * Determine what type of argument and how many bytes to read from user space, using the
> > +      * value in the beauty_map. This is the relation of parameter type and its corresponding
> > +      * value in the beauty map, and how many bytes we read eventually:
> > +      *
> > +      * string: 1                          -> size of string
> > +      * struct: size of struct             -> size of struct
> > +      * buffer: -1 * (index of paired len) -> value of paired len (maximum: TRACE_AUG_MAX_BUF)
>
> 'paired_len'? Ok, 1, size of string or struct

I mean the buffer always comes with a size indicator. For instance
read syscall's 'buf' and 'count', in this case 'count' will pair with
'buf'. Sorry for the confusion. See below:

read: (unsigned int fd, char *buf, size_t count)

>
>
> Will continue the detailed analysis later, as I need to change the
> existing BPF collector, before applying this, to match the protocol here
> (that will always have the size of each argument encoded in the ring
> buffer, easier, but uses more space).

Thank you for adding the protocol data size fix.

>
> - Arnaldo
>
> > +      */
> > +     for (int i = 0; i < 6; i++) {
> > +             arg = (void *)args->args[i];
> > +             augmented = false;
> > +             size = beauty_map[i];
> > +             aug_size = size; /* size of the augmented data read from user space */
> > +
> > +             if (size == 0 || arg == NULL)
> > +                     continue;
> > +
> > +             if (size == 1) { /* string */
> > +                     aug_size = bpf_probe_read_user_str(((struct augmented_arg *)payload_offset)->value, value_size, arg);
> > +                     /* minimum of 0 to pass the verifier */
> > +                     if (aug_size < 0)
> > +                             aug_size = 0;
> > +
> > +                     augmented = true;
> > +             } else if (size > 0 && size <= value_size) { /* struct */
> > +                     if (!bpf_probe_read_user(((struct augmented_arg *)payload_offset)->value, size, arg))
> > +                             augmented = true;
> > +             } else if (size < 0 && size >= -6) { /* buffer */
> > +                     index = -(size + 1);
> > +                     aug_size = args->args[index];
> > +
> > +                     if (aug_size > TRACE_AUG_MAX_BUF)
> > +                             aug_size = TRACE_AUG_MAX_BUF;
> > +
> > +                     if (aug_size > 0) {
> > +                             if (!bpf_probe_read_user(((struct augmented_arg *)payload_offset)->value, aug_size, arg))
> > +                                     augmented = true;
> > +                     }
> > +             }
> > +
> > +             /* write data to payload */
> > +             if (augmented) {
> > +                     int written = offsetof(struct augmented_arg, value) + aug_size;
> > +
> > +                     ((struct augmented_arg *)payload_offset)->size = aug_size;
> > +                     output += written;
> > +                     payload_offset += written;
> > +                     do_output = true;
> > +             }
> > +     }
> > +
> > +     if (!do_output)
> > +             return 1;
> > +
> > +     return augmented__beauty_output(ctx, payload, sizeof(struct syscall_enter_args) + output);
>
> We end up pushing up to the ring buffer all the way to

Thanks,
Howard

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 4/8] perf trace: Pretty print struct data
  2024-08-30  0:16     ` Howard Chu
@ 2024-09-09 14:40       ` Arnaldo Carvalho de Melo
  2024-09-09 14:46         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-09-09 14:40 UTC (permalink / raw)
  To: Howard Chu
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

On Fri, Aug 30, 2024 at 08:16:38AM +0800, Howard Chu wrote:
> Hello Arnaldo,
> 
> On Thu, Aug 29, 2024 at 5:05 AM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > On Sun, Aug 25, 2024 at 12:33:18AM +0800, Howard Chu wrote:
> > > Change the arg->augmented.args to arg->augmented.args->value to skip the
> > > header for customized pretty printers, since we collect data in BPF
> > > using the general augment_sys_enter(), which always adds the header.
> > >
> > > Use btf_dump API to pretty print augmented struct pointer.
> > >
> > > Prefer existed pretty-printer than btf general pretty-printer.
> > >
> > > set compact = true and skip_names = true, so that no newline character
> > > and argument name are printed.
> > >
> > > Committer notes:
> > >
> > > Simplified the btf_dump_snprintf callback to avoid using multiple
> > > buffers, as discussed in the thread accessible via the Link tag below.
> > >
> > > Signed-off-by: Howard Chu <howardchu95@gmail.com>
> > > Cc: Adrian Hunter <adrian.hunter@intel.com>
> > > Cc: Ian Rogers <irogers@google.com>
> > > Cc: Jiri Olsa <jolsa@kernel.org>
> > > Cc: Kan Liang <kan.liang@linux.intel.com>
> > > Cc: Namhyung Kim <namhyung@kernel.org>
> > > Link: https://lore.kernel.org/r/20240815013626.935097-7-howardchu95@gmail.com
> > > Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > ---
> > >  tools/perf/builtin-trace.c                | 65 +++++++++++++++++++++--
> > >  tools/perf/trace/beauty/perf_event_open.c |  2 +-
> > >  tools/perf/trace/beauty/sockaddr.c        |  2 +-
> > >  tools/perf/trace/beauty/timespec.c        |  2 +-
> > >  4 files changed, 63 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> > > index 43b1f63415b4..048bcb92624c 100644
> > > --- a/tools/perf/builtin-trace.c
> > > +++ b/tools/perf/builtin-trace.c
> > > @@ -990,6 +990,54 @@ static size_t btf_enum_scnprintf(const struct btf_type *type, struct btf *btf, c
> > >       return 0;
> > >  }
> > >
> > > +struct trace_btf_dump_snprintf_ctx {
> > > +     char   *bf;
> > > +     size_t printed, size;
> > > +};
> > > +
> > > +static void trace__btf_dump_snprintf(void *vctx, const char *fmt, va_list args)
> > > +{
> > > +     struct trace_btf_dump_snprintf_ctx *ctx = vctx;
> > > +
> > > +     ctx->printed += vscnprintf(ctx->bf + ctx->printed, ctx->size - ctx->printed, fmt, args);
> > > +}
> > > +
> > > +static size_t btf_struct_scnprintf(const struct btf_type *type, struct btf *btf, char *bf, size_t size, struct syscall_arg *arg)
> > > +{
> > > +     struct trace_btf_dump_snprintf_ctx ctx = {
> > > +             .bf   = bf,
> > > +             .size = size,
> > > +     };
> > > +     struct augmented_arg *augmented_arg = arg->augmented.args;
> > > +     int type_id = arg->fmt->type_id, consumed;
> > > +     struct btf_dump *btf_dump;
> > > +
> > > +     LIBBPF_OPTS(btf_dump_opts, dump_opts);
> > > +     LIBBPF_OPTS(btf_dump_type_data_opts, dump_data_opts);
> > > +
> > > +     if (arg == NULL || arg->augmented.args == NULL)
> > > +             return 0;
> > > +
> > > +     dump_data_opts.compact     = true;
> > > +     dump_data_opts.skip_names  = true;
> > > +
> > > +     btf_dump = btf_dump__new(btf, trace__btf_dump_snprintf, &ctx, &dump_opts);
> > > +     if (btf_dump == NULL)
> > > +             return 0;
> > > +
> > > +     /* pretty print the struct data here */
> > > +     if (btf_dump__dump_type_data(btf_dump, type_id, arg->augmented.args->value, type->size, &dump_data_opts) == 0)
> > > +             return 0;
> > > +
> > > +     consumed = sizeof(*augmented_arg) + augmented_arg->size;
> > > +     arg->augmented.args = ((void *)arg->augmented.args) + consumed;
> > > +     arg->augmented.size -= consumed;
> > > +
> > > +     btf_dump__free(btf_dump);
> > > +
> > > +     return ctx.printed;
> > > +}
> > > +
> > >  static size_t trace__btf_scnprintf(struct trace *trace, struct syscall_arg *arg, char *bf,
> > >                                  size_t size, int val, char *type)
> > >  {
> > > @@ -1009,6 +1057,8 @@ static size_t trace__btf_scnprintf(struct trace *trace, struct syscall_arg *arg,
> > >
> > >       if (btf_is_enum(arg_fmt->type))
> > >               return btf_enum_scnprintf(arg_fmt->type, trace->btf, bf, size, val);
> > > +     else if (btf_is_struct(arg_fmt->type))
> > > +             return btf_struct_scnprintf(arg_fmt->type, trace->btf, bf, size, arg);
> > >
> > >       return 0;
> > >  }
> > > @@ -2222,6 +2272,7 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
> > >               .show_string_prefix = trace->show_string_prefix,
> > >       };
> > >       struct thread_trace *ttrace = thread__priv(thread);
> > > +     void *default_scnprintf;
> > >
> > >       /*
> > >        * Things like fcntl will set this in its 'cmd' formatter to pick the
> > > @@ -2263,11 +2314,15 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
> > >                       if (trace->show_arg_names)
> > >                               printed += scnprintf(bf + printed, size - printed, "%s: ", field->name);
> > >
> > > -                     btf_printed = trace__btf_scnprintf(trace, &arg, bf + printed,
> > > -                                                        size - printed, val, field->type);
> > > -                     if (btf_printed) {
> > > -                             printed += btf_printed;
> > > -                             continue;
> > > +                     default_scnprintf = sc->arg_fmt[arg.idx].scnprintf;
> > > +
> > > +                     if (default_scnprintf == NULL || default_scnprintf == SCA_PTR) {
> > > +                             btf_printed = trace__btf_scnprintf(trace, &arg, bf + printed,
> > > +                                                                size - printed, val, field->type);
> > > +                             if (btf_printed) {
> > > +                                     printed += btf_printed;
> > > +                                     continue;
> > > +                             }
> >
> > Ok, we agreed on this one, and you noted that that in this cset comment,
> > good.
> >
> > Next time make a note after the cset commit log message and before the
> > actual patch, something like:
> >
> > vN: prefer pre-existing userspace scnprintf if explicitely specified,
> > only fallbacking to the generic BTF one when none is specified.
> >
> > >                       }
> > >
> > >                       printed += syscall_arg_fmt__scnprintf_val(&sc->arg_fmt[arg.idx],
> > > diff --git a/tools/perf/trace/beauty/perf_event_open.c b/tools/perf/trace/beauty/perf_event_open.c
> > > index 01ee15fe9d0c..632237128640 100644
> > > --- a/tools/perf/trace/beauty/perf_event_open.c
> > > +++ b/tools/perf/trace/beauty/perf_event_open.c
> > > @@ -76,7 +76,7 @@ static size_t perf_event_attr___scnprintf(struct perf_event_attr *attr, char *bf
> >
> > But this part will work if we use the old collectors in the BPF skel?
> >
> 
> You are right, it won't. The one scenario is that when a vmlinux file
> is presented, the general collector (the new one, based on BTF) will
> fallback to the old collectors, and now we have new
> consumers(ptr->value). This will break.
> 
> > I.e. isn't this a change in the protocol of the BPF colector with the
> > userpace augmented snprintf routines?
> >
> > If I remember we discussed that first you make this change in the
> > protocol, test it with the pre-existing BPF collector, it works. Ok, now
> > we have a new protocol and we then use it in the generic BTF-based BPF
> > collector. This way that option of using the BTF-based collector or the
> > preexisting BPF collector works.
> >
> > I'll try to do this.
> 
> Thank you so much.

Now that I have:

6b22c2b502a1c21b (x1/perf-tools-next, perf-tools-next/tmp.perf-tools-next, acme/tmp.perf-tools-next) perf trace: Use a common encoding for augmented arguments, with size + error + payload
5d9cd24924f57066 perf trace augmented_syscalls.bpf: Move the renameat aumenter to renameat2, temporarily

And also:

7bedcbaefdf5d4f7 perf trace: Pass the richer 'struct syscall_arg' pointer to trace__btf_scnprintf()
8df1d8c6cbd6825b perf trace: Fix perf trace -p <PID>

I can move to this patch, but then it doesn't build as it uses things
that will only be available when we get to adding the new BPF components
in a later patch:

builtin-trace.c: In function ‘trace__init_syscalls_bpf_prog_array_maps’:
builtin-trace.c:3725:58: error: ‘struct <anonymous>’ has no member named ‘beauty_map_enter’
 3725 |         int beauty_map_fd = bpf_map__fd(trace->skel->maps.beauty_map_enter);
      |                                                          ^
make[3]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:105: /tmp/build/perf-tools-next/builtin-trace.o] Error 1
make[3]: *** Waiting for unfinished jobs....

It will only come when we add:

Subject: [PATCH v3 6/8] perf trace: Collect augmented data using BPF

Where we have:

+struct beauty_map_enter {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__type(key, int);
+	__type(value, __u32[6]);
+	__uint(max_entries, 512);
+} beauty_map_enter SEC(".maps");


So I'll try to combine the addition of the map with the code that uses
it in the builtin-trace.c codebase.

- Arnaldo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 4/8] perf trace: Pretty print struct data
  2024-09-09 14:40       ` Arnaldo Carvalho de Melo
@ 2024-09-09 14:46         ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-09-09 14:46 UTC (permalink / raw)
  To: Howard Chu
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

On Mon, Sep 09, 2024 at 11:40:14AM -0300, Arnaldo Carvalho de Melo wrote:
> On Fri, Aug 30, 2024 at 08:16:38AM +0800, Howard Chu wrote:
> > On Thu, Aug 29, 2024 at 5:05 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > I.e. isn't this a change in the protocol of the BPF colector with the
> > > userpace augmented snprintf routines?

> > > If I remember we discussed that first you make this change in the
> > > protocol, test it with the pre-existing BPF collector, it works. Ok, now
> > > we have a new protocol and we then use it in the generic BTF-based BPF
> > > collector. This way that option of using the BTF-based collector or the
> > > preexisting BPF collector works.

> > > I'll try to do this.

> > Thank you so much.

> Now that I have:

> 6b22c2b502a1c21b (x1/perf-tools-next, perf-tools-next/tmp.perf-tools-next, acme/tmp.perf-tools-next) perf trace: Use a common encoding for augmented arguments, with size + error + payload
> 5d9cd24924f57066 perf trace augmented_syscalls.bpf: Move the renameat aumenter to renameat2, temporarily

> And also:

> 7bedcbaefdf5d4f7 perf trace: Pass the richer 'struct syscall_arg' pointer to trace__btf_scnprintf()
> 8df1d8c6cbd6825b perf trace: Fix perf trace -p <PID>

> I can move to this patch, but then it doesn't build as it uses things
> that will only be available when we get to adding the new BPF components
> in a later patch:
> 
> builtin-trace.c: In function ‘trace__init_syscalls_bpf_prog_array_maps’:
> builtin-trace.c:3725:58: error: ‘struct <anonymous>’ has no member named ‘beauty_map_enter’
>  3725 |         int beauty_map_fd = bpf_map__fd(trace->skel->maps.beauty_map_enter);
>       |                                                          ^
> make[3]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:105: /tmp/build/perf-tools-next/builtin-trace.o] Error 1
> make[3]: *** Waiting for unfinished jobs....
> 
> It will only come when we add:
> 
> Subject: [PATCH v3 6/8] perf trace: Collect augmented data using BPF
> 
> Where we have:
> 
> +struct beauty_map_enter {
> +	__uint(type, BPF_MAP_TYPE_HASH);
> +	__type(key, int);
> +	__type(value, __u32[6]);
> +	__uint(max_entries, 512);
> +} beauty_map_enter SEC(".maps");
> 
> 
> So I'll try to combine the addition of the map with the code that uses
> it in the builtin-trace.c codebase.
> 

diff --git a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
index 9c7d2f8552945695..4ebce67637435192 100644
--- a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
+++ b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
@@ -124,6 +124,13 @@ struct augmented_args_tmp {
 	__uint(max_entries, 1);
 } augmented_args_tmp SEC(".maps");
 
+struct beauty_map_enter {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__type(key, int);
+	__type(value, __u32[6]);
+	__uint(max_entries, 512);
+} beauty_map_enter SEC(".maps");
+
 static inline struct augmented_args_payload *augmented_args_payload(void)
 {
 	int key = 0;
This did the trick:

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 5/8] perf trace: Pretty print buffer data
  2024-08-24 16:33 ` [PATCH v3 5/8] perf trace: Pretty print buffer data Howard Chu
@ 2024-09-09 16:33   ` Arnaldo Carvalho de Melo
  2024-09-09 16:45     ` Arnaldo Carvalho de Melo
  2024-09-09 17:14     ` Howard Chu
  0 siblings, 2 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-09-09 16:33 UTC (permalink / raw)
  To: Howard Chu
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

On Sun, Aug 25, 2024 at 12:33:19AM +0800, Howard Chu wrote:
> Define TRACE_AUG_MAX_BUF in trace_augment.h data, which is the maximum
> buffer size we can augment. BPF will include this header too.
> 
> Print buffer in a way that's different than just printing a string, we
> print all the control characters in \digits (such as \0 for null, and
> \10 for newline, LF).
> 
> For character that has a bigger value than 127, we print the digits
> instead of the character itself as well.
> 
> Committer notes:
> 
> Simplified the buffer scnprintf to avoid using multiple buffers as
> discussed in the patch review thread.
> 
> Signed-off-by: Howard Chu <howardchu95@gmail.com>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Ian Rogers <irogers@google.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kan Liang <kan.liang@linux.intel.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Link: https://lore.kernel.org/r/20240815013626.935097-8-howardchu95@gmail.com
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> ---
>  tools/perf/builtin-trace.c      | 33 +++++++++++++++++++++++++++++++++
>  tools/perf/util/trace_augment.h |  6 ++++++
>  2 files changed, 39 insertions(+)
>  create mode 100644 tools/perf/util/trace_augment.h
> 
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index 048bcb92624c..470d74e3f875 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -65,6 +65,7 @@
>  #include "syscalltbl.h"
>  #include "rb_resort.h"
>  #include "../perf.h"
> +#include "trace_augment.h"
>  
>  #include <errno.h>
>  #include <inttypes.h>
> @@ -852,6 +853,10 @@ static size_t syscall_arg__scnprintf_filename(char *bf, size_t size,
>  
>  #define SCA_FILENAME syscall_arg__scnprintf_filename
>  
> +static size_t syscall_arg__scnprintf_buf(char *bf, size_t size, struct syscall_arg *arg);
> +
> +#define SCA_BUF syscall_arg__scnprintf_buf
> +
>  static size_t syscall_arg__scnprintf_pipe_flags(char *bf, size_t size,
>  						struct syscall_arg *arg)
>  {
> @@ -1745,6 +1750,32 @@ static size_t syscall_arg__scnprintf_filename(char *bf, size_t size,
>  	return 0;
>  }
>  
> +#define MAX_CONTROL_CHAR 31
> +#define MAX_ASCII 127
> +
> +static size_t syscall_arg__scnprintf_buf(char *bf, size_t size, struct syscall_arg *arg)
> +{
> +	struct augmented_arg *augmented_arg = arg->augmented.args;
> +	unsigned char *orig = (unsigned char *)augmented_arg->value;
> +	size_t printed = 0;
> +	int consumed;
> +
> +	if (augmented_arg == NULL)
> +		return 0;
> +
> +	for (int j = 0; j < augmented_arg->size; ++j) {
> +		bool control_char = orig[j] <= MAX_CONTROL_CHAR || orig[j] >= MAX_ASCII;
> +		/* print control characters (0~31 and 127), and non-ascii characters in \(digits) */
> +		printed += scnprintf(bf + printed, size - printed, control_char ? "\\%d" : "%c", (int)orig[j]);
> +	}
> +
> +	consumed = sizeof(*augmented_arg) + augmented_arg->size;
> +	arg->augmented.args = ((void *)arg->augmented.args) + consumed;
> +	arg->augmented.size -= consumed;
> +
> +	return printed;
> +}
> +
>  static bool trace__filter_duration(struct trace *trace, double t)
>  {
>  	return t < (trace->duration_filter * NSEC_PER_MSEC);
> @@ -1956,6 +1987,8 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
>  		    ((len >= 4 && strcmp(field->name + len - 4, "name") == 0) ||
>  		     strstr(field->name, "path") != NULL))
>  			arg->scnprintf = SCA_FILENAME;
> +		else if (strstr(field->type, "char *") && strstr(field->name, "buf"))
> +			arg->scnprintf = SCA_BUF;

You can't really do this for things like 'read' as we would be printing
whatever is in the buffer when we enter the syscall, right? As we can
see testing after applying the following patch:

root@number:~# perf trace -e read,write cat /etc/passwd > /dev/null
     0.000 ( 0.004 ms): cat/291442 read(fd: 3, buf: \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0, count: 832) = 832
     0.231 ( 0.004 ms): cat/291442 read(fd: 3, buf: , count: 131072)                                     = 3224
     0.236 ( 0.001 ms): cat/291442 write(fd: 1, buf: root:x:0:0:Super User:/root:/bin, count: 3224)      = 3224
     0.239 ( 0.001 ms): cat/291442 read(fd: 3, buf: root:x:0:0:Super User:/root:/bin, count: 131072)     = 0
root@number:~#

So we can't really do it at this point, we have to do it, for now, by
doing it on that syscall table initialization, for instance, for the
'write' syscall:

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 5f0877e891c2047d..1bcb45e737d830bf 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1379,6 +1379,8 @@ static const struct syscall_fmt syscall_fmts[] = {
 	  .arg = { [2] = { .scnprintf = SCA_WAITID_OPTIONS, /* options */ }, }, },
 	{ .name	    = "waitid",	    .errpid = true,
 	  .arg = { [3] = { .scnprintf = SCA_WAITID_OPTIONS, /* options */ }, }, },
+	{ .name	    = "write",	    .errpid = true,
+	  .arg = { [1] = { .scnprintf = SCA_BUF, /* buf */ }, }, },
 };
 
 static int syscall_fmt__cmp(const void *name, const void *fmtp)
@@ -1987,8 +1989,6 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
 		    ((len >= 4 && strcmp(field->name + len - 4, "name") == 0) ||
 		     strstr(field->name, "path") != NULL))
 			arg->scnprintf = SCA_FILENAME;
-		else if (strstr(field->type, "char *") && strstr(field->name, "buf"))
-			arg->scnprintf = SCA_BUF;
 		else if ((field->flags & TEP_FIELD_IS_POINTER) || strstr(field->name, "addr"))
 			arg->scnprintf = SCA_PTR;
 		else if (strcmp(field->type, "pid_t") == 0)

With that we get:

root@number:~# perf trace -e read,write cat /etc/passwd > /dev/null
     0.000 ( 0.005 ms): cat/296870 read(fd: 3, buf: 0x7ffe9cb8df98, count: 832)                          = 832
     0.268 ( 0.004 ms): cat/296870 read(fd: 3, buf: 0x7fa7d700a000, count: 131072)                       = 3224
     0.273 ( 0.002 ms): cat/296870 write(fd: 1, buf: root:x:0:0:Super User:/root:/bin, count: 3224)      = 
     0.276 ( 0.001 ms): cat/296870 read(fd: 3, buf: 0x7fa7d700a000, count: 131072)                       = 0
root@number:~#

After the following patch is applied.

- Arnaldo

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 5/8] perf trace: Pretty print buffer data
  2024-09-09 16:33   ` Arnaldo Carvalho de Melo
@ 2024-09-09 16:45     ` Arnaldo Carvalho de Melo
  2024-09-09 16:50       ` Arnaldo Carvalho de Melo
  2024-09-09 17:17       ` Howard Chu
  2024-09-09 17:14     ` Howard Chu
  1 sibling, 2 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-09-09 16:45 UTC (permalink / raw)
  To: Howard Chu
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

On Mon, Sep 09, 2024 at 01:33:17PM -0300, Arnaldo Carvalho de Melo wrote:
> >  static bool trace__filter_duration(struct trace *trace, double t)
> >  {
> >  	return t < (trace->duration_filter * NSEC_PER_MSEC);
> > @@ -1956,6 +1987,8 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
> >  		    ((len >= 4 && strcmp(field->name + len - 4, "name") == 0) ||
> >  		     strstr(field->name, "path") != NULL))
> >  			arg->scnprintf = SCA_FILENAME;
> > +		else if (strstr(field->type, "char *") && strstr(field->name, "buf"))
> > +			arg->scnprintf = SCA_BUF;
> 
> You can't really do this for things like 'read' as we would be printing
> whatever is in the buffer when we enter the syscall, right? As we can
> see testing after applying the following patch:

This is also valid for the struct dumper, where I'll have to add some
indication in the syscall_fmt table when the pointer should be read in
the BPF augmenter, and thus we shouldn't bother to get it in the
sys_enter if it is, say, fstat().

- Arnaldo
 
> root@number:~# perf trace -e read,write cat /etc/passwd > /dev/null
>      0.000 ( 0.004 ms): cat/291442 read(fd: 3, buf: \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0, count: 832) = 832
>      0.231 ( 0.004 ms): cat/291442 read(fd: 3, buf: , count: 131072)                                     = 3224
>      0.236 ( 0.001 ms): cat/291442 write(fd: 1, buf: root:x:0:0:Super User:/root:/bin, count: 3224)      = 3224
>      0.239 ( 0.001 ms): cat/291442 read(fd: 3, buf: root:x:0:0:Super User:/root:/bin, count: 131072)     = 0
> root@number:~#
> 
> So we can't really do it at this point, we have to do it, for now, by
> doing it on that syscall table initialization, for instance, for the
> 'write' syscall:
> 
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index 5f0877e891c2047d..1bcb45e737d830bf 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -1379,6 +1379,8 @@ static const struct syscall_fmt syscall_fmts[] = {
>  	  .arg = { [2] = { .scnprintf = SCA_WAITID_OPTIONS, /* options */ }, }, },
>  	{ .name	    = "waitid",	    .errpid = true,
>  	  .arg = { [3] = { .scnprintf = SCA_WAITID_OPTIONS, /* options */ }, }, },
> +	{ .name	    = "write",	    .errpid = true,
> +	  .arg = { [1] = { .scnprintf = SCA_BUF, /* buf */ }, }, },
>  };
>  
>  static int syscall_fmt__cmp(const void *name, const void *fmtp)
> @@ -1987,8 +1989,6 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
>  		    ((len >= 4 && strcmp(field->name + len - 4, "name") == 0) ||
>  		     strstr(field->name, "path") != NULL))
>  			arg->scnprintf = SCA_FILENAME;
> -		else if (strstr(field->type, "char *") && strstr(field->name, "buf"))
> -			arg->scnprintf = SCA_BUF;
>  		else if ((field->flags & TEP_FIELD_IS_POINTER) || strstr(field->name, "addr"))
>  			arg->scnprintf = SCA_PTR;
>  		else if (strcmp(field->type, "pid_t") == 0)
> 
> With that we get:
> 
> root@number:~# perf trace -e read,write cat /etc/passwd > /dev/null
>      0.000 ( 0.005 ms): cat/296870 read(fd: 3, buf: 0x7ffe9cb8df98, count: 832)                          = 832
>      0.268 ( 0.004 ms): cat/296870 read(fd: 3, buf: 0x7fa7d700a000, count: 131072)                       = 3224
>      0.273 ( 0.002 ms): cat/296870 write(fd: 1, buf: root:x:0:0:Super User:/root:/bin, count: 3224)      = 
>      0.276 ( 0.001 ms): cat/296870 read(fd: 3, buf: 0x7fa7d700a000, count: 131072)                       = 0
> root@number:~#
> 
> After the following patch is applied.
> 
> - Arnaldo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 5/8] perf trace: Pretty print buffer data
  2024-09-09 16:45     ` Arnaldo Carvalho de Melo
@ 2024-09-09 16:50       ` Arnaldo Carvalho de Melo
  2024-09-09 17:17       ` Howard Chu
  1 sibling, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-09-09 16:50 UTC (permalink / raw)
  To: Howard Chu
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

On Mon, Sep 09, 2024 at 01:45:41PM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, Sep 09, 2024 at 01:33:17PM -0300, Arnaldo Carvalho de Melo wrote:
> > >  static bool trace__filter_duration(struct trace *trace, double t)
> > >  {
> > >  	return t < (trace->duration_filter * NSEC_PER_MSEC);
> > > @@ -1956,6 +1987,8 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
> > >  		    ((len >= 4 && strcmp(field->name + len - 4, "name") == 0) ||
> > >  		     strstr(field->name, "path") != NULL))
> > >  			arg->scnprintf = SCA_FILENAME;
> > > +		else if (strstr(field->type, "char *") && strstr(field->name, "buf"))
> > > +			arg->scnprintf = SCA_BUF;

> > You can't really do this for things like 'read' as we would be printing
> > whatever is in the buffer when we enter the syscall, right? As we can
> > see testing after applying the following patch:
 
> This is also valid for the struct dumper, where I'll have to add some
> indication in the syscall_fmt table when the pointer should be read in
> the BPF augmenter, and thus we shouldn't bother to get it in the
> sys_enter if it is, say, fstat().

BTW, what I have so far, without addressing this last comment about the
struct pretty printing/collection is at the tmp.perf-tools-next branch:

⬢[acme@toolbox perf-tools-next]$ git log --oneline perf-tools-next/perf-tools-next..
14053de1be2bf3b1 (HEAD -> perf-tools-next, x1/perf-tools-next, perf-tools-next/tmp.perf-tools-next, acme/tmp.perf-tools-next) perf trace: Collect augmented data using BPF
3f060c2fe93b8298 perf trace: Pretty print buffer data
01ca0102b80c7b5c perf trace: Pretty print struct data
bb0819cf7392b797 perf trace: Add trace__bpf_sys_enter_beauty_map() to prepare for fetching data in BPF
6b22c2b502a1c21b perf trace: Use a common encoding for augmented arguments, with size + error + payload
5d9cd24924f57066 perf trace augmented_syscalls.bpf: Move the renameat aumenter to renameat2, temporarily
⬢[acme@toolbox perf-tools-next]$

⬢[acme@toolbox perf-tools-next]$ git remote -v | grep ^perf-tools-next
perf-tools-next	https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git (fetch)
perf-tools-next	https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git (push)
⬢[acme@toolbox perf-tools-next]$
 
- Arnaldo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 5/8] perf trace: Pretty print buffer data
  2024-09-09 16:33   ` Arnaldo Carvalho de Melo
  2024-09-09 16:45     ` Arnaldo Carvalho de Melo
@ 2024-09-09 17:14     ` Howard Chu
  1 sibling, 0 replies; 27+ messages in thread
From: Howard Chu @ 2024-09-09 17:14 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

Hello Arnaldo,

On Mon, Sep 9, 2024 at 9:33 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> On Sun, Aug 25, 2024 at 12:33:19AM +0800, Howard Chu wrote:
> > Define TRACE_AUG_MAX_BUF in trace_augment.h data, which is the maximum
> > buffer size we can augment. BPF will include this header too.
> >
> > Print buffer in a way that's different than just printing a string, we
> > print all the control characters in \digits (such as \0 for null, and
> > \10 for newline, LF).
> >
> > For character that has a bigger value than 127, we print the digits
> > instead of the character itself as well.
> >
> > Committer notes:
> >
> > Simplified the buffer scnprintf to avoid using multiple buffers as
> > discussed in the patch review thread.
> >
> > Signed-off-by: Howard Chu <howardchu95@gmail.com>
> > Cc: Adrian Hunter <adrian.hunter@intel.com>
> > Cc: Ian Rogers <irogers@google.com>
> > Cc: Jiri Olsa <jolsa@kernel.org>
> > Cc: Kan Liang <kan.liang@linux.intel.com>
> > Cc: Namhyung Kim <namhyung@kernel.org>
> > Link: https://lore.kernel.org/r/20240815013626.935097-8-howardchu95@gmail.com
> > Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > ---
> >  tools/perf/builtin-trace.c      | 33 +++++++++++++++++++++++++++++++++
> >  tools/perf/util/trace_augment.h |  6 ++++++
> >  2 files changed, 39 insertions(+)
> >  create mode 100644 tools/perf/util/trace_augment.h
> >
> > diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> > index 048bcb92624c..470d74e3f875 100644
> > --- a/tools/perf/builtin-trace.c
> > +++ b/tools/perf/builtin-trace.c
> > @@ -65,6 +65,7 @@
> >  #include "syscalltbl.h"
> >  #include "rb_resort.h"
> >  #include "../perf.h"
> > +#include "trace_augment.h"
> >
> >  #include <errno.h>
> >  #include <inttypes.h>
> > @@ -852,6 +853,10 @@ static size_t syscall_arg__scnprintf_filename(char *bf, size_t size,
> >
> >  #define SCA_FILENAME syscall_arg__scnprintf_filename
> >
> > +static size_t syscall_arg__scnprintf_buf(char *bf, size_t size, struct syscall_arg *arg);
> > +
> > +#define SCA_BUF syscall_arg__scnprintf_buf
> > +
> >  static size_t syscall_arg__scnprintf_pipe_flags(char *bf, size_t size,
> >                                               struct syscall_arg *arg)
> >  {
> > @@ -1745,6 +1750,32 @@ static size_t syscall_arg__scnprintf_filename(char *bf, size_t size,
> >       return 0;
> >  }
> >
> > +#define MAX_CONTROL_CHAR 31
> > +#define MAX_ASCII 127
> > +
> > +static size_t syscall_arg__scnprintf_buf(char *bf, size_t size, struct syscall_arg *arg)
> > +{
> > +     struct augmented_arg *augmented_arg = arg->augmented.args;
> > +     unsigned char *orig = (unsigned char *)augmented_arg->value;
> > +     size_t printed = 0;
> > +     int consumed;
> > +
> > +     if (augmented_arg == NULL)
> > +             return 0;
> > +
> > +     for (int j = 0; j < augmented_arg->size; ++j) {
> > +             bool control_char = orig[j] <= MAX_CONTROL_CHAR || orig[j] >= MAX_ASCII;
> > +             /* print control characters (0~31 and 127), and non-ascii characters in \(digits) */
> > +             printed += scnprintf(bf + printed, size - printed, control_char ? "\\%d" : "%c", (int)orig[j]);
> > +     }
> > +
> > +     consumed = sizeof(*augmented_arg) + augmented_arg->size;
> > +     arg->augmented.args = ((void *)arg->augmented.args) + consumed;
> > +     arg->augmented.size -= consumed;
> > +
> > +     return printed;
> > +}
> > +
> >  static bool trace__filter_duration(struct trace *trace, double t)
> >  {
> >       return t < (trace->duration_filter * NSEC_PER_MSEC);
> > @@ -1956,6 +1987,8 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
> >                   ((len >= 4 && strcmp(field->name + len - 4, "name") == 0) ||
> >                    strstr(field->name, "path") != NULL))
> >                       arg->scnprintf = SCA_FILENAME;
> > +             else if (strstr(field->type, "char *") && strstr(field->name, "buf"))
> > +                     arg->scnprintf = SCA_BUF;
>
> You can't really do this for things like 'read' as we would be printing
> whatever is in the buffer when we enter the syscall, right? As we can

No, we can't do it just now. Same with 'read family' such as
readlinkat, and getrandom.

> see testing after applying the following patch:
>
> root@number:~# perf trace -e read,write cat /etc/passwd > /dev/null
>      0.000 ( 0.004 ms): cat/291442 read(fd: 3, buf: \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0, count: 832) = 832
>      0.231 ( 0.004 ms): cat/291442 read(fd: 3, buf: , count: 131072)                                     = 3224
>      0.236 ( 0.001 ms): cat/291442 write(fd: 1, buf: root:x:0:0:Super User:/root:/bin, count: 3224)      = 3224
>      0.239 ( 0.001 ms): cat/291442 read(fd: 3, buf: root:x:0:0:Super User:/root:/bin, count: 131072)     = 0
> root@number:~#
>
> So we can't really do it at this point, we have to do it, for now, by
> doing it on that syscall table initialization, for instance, for the
> 'write' syscall:
>
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index 5f0877e891c2047d..1bcb45e737d830bf 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -1379,6 +1379,8 @@ static const struct syscall_fmt syscall_fmts[] = {
>           .arg = { [2] = { .scnprintf = SCA_WAITID_OPTIONS, /* options */ }, }, },
>         { .name     = "waitid",     .errpid = true,
>           .arg = { [3] = { .scnprintf = SCA_WAITID_OPTIONS, /* options */ }, }, },
> +       { .name     = "write",      .errpid = true,
> +         .arg = { [1] = { .scnprintf = SCA_BUF, /* buf */ }, }, },
>  };
>
>  static int syscall_fmt__cmp(const void *name, const void *fmtp)
> @@ -1987,8 +1989,6 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
>                     ((len >= 4 && strcmp(field->name + len - 4, "name") == 0) ||
>                      strstr(field->name, "path") != NULL))
>                         arg->scnprintf = SCA_FILENAME;
> -               else if (strstr(field->type, "char *") && strstr(field->name, "buf"))
> -                       arg->scnprintf = SCA_BUF;
>                 else if ((field->flags & TEP_FIELD_IS_POINTER) || strstr(field->name, "addr"))
>                         arg->scnprintf = SCA_PTR;
>                 else if (strcmp(field->type, "pid_t") == 0)
>
> With that we get:
>
> root@number:~# perf trace -e read,write cat /etc/passwd > /dev/null
>      0.000 ( 0.005 ms): cat/296870 read(fd: 3, buf: 0x7ffe9cb8df98, count: 832)                          = 832
>      0.268 ( 0.004 ms): cat/296870 read(fd: 3, buf: 0x7fa7d700a000, count: 131072)                       = 3224
>      0.273 ( 0.002 ms): cat/296870 write(fd: 1, buf: root:x:0:0:Super User:/root:/bin, count: 3224)      =
>      0.276 ( 0.001 ms): cat/296870 read(fd: 3, buf: 0x7fa7d700a000, count: 131072)                       = 0
> root@number:~#
>
> After the following patch is applied.

Thank you so much!
>
> - Arnaldo

Thanks,
Howard

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 5/8] perf trace: Pretty print buffer data
  2024-09-09 16:45     ` Arnaldo Carvalho de Melo
  2024-09-09 16:50       ` Arnaldo Carvalho de Melo
@ 2024-09-09 17:17       ` Howard Chu
  2024-09-09 19:19         ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 27+ messages in thread
From: Howard Chu @ 2024-09-09 17:17 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

On Mon, Sep 9, 2024 at 9:45 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> On Mon, Sep 09, 2024 at 01:33:17PM -0300, Arnaldo Carvalho de Melo wrote:
> > >  static bool trace__filter_duration(struct trace *trace, double t)
> > >  {
> > >     return t < (trace->duration_filter * NSEC_PER_MSEC);
> > > @@ -1956,6 +1987,8 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
> > >                 ((len >= 4 && strcmp(field->name + len - 4, "name") == 0) ||
> > >                  strstr(field->name, "path") != NULL))
> > >                     arg->scnprintf = SCA_FILENAME;
> > > +           else if (strstr(field->type, "char *") && strstr(field->name, "buf"))
> > > +                   arg->scnprintf = SCA_BUF;
> >
> > You can't really do this for things like 'read' as we would be printing
> > whatever is in the buffer when we enter the syscall, right? As we can
> > see testing after applying the following patch:
>
> This is also valid for the struct dumper, where I'll have to add some
> indication in the syscall_fmt table when the pointer should be read in
> the BPF augmenter, and thus we shouldn't bother to get it in the
> sys_enter if it is, say, fstat().

Yes you are right, problems with read buffers happen in structs too.
I'll get my read family patch out soon to ease your pain...

Thanks,
Howard
>
> - Arnaldo
>
> > root@number:~# perf trace -e read,write cat /etc/passwd > /dev/null
> >      0.000 ( 0.004 ms): cat/291442 read(fd: 3, buf: \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0, count: 832) = 832
> >      0.231 ( 0.004 ms): cat/291442 read(fd: 3, buf: , count: 131072)                                     = 3224
> >      0.236 ( 0.001 ms): cat/291442 write(fd: 1, buf: root:x:0:0:Super User:/root:/bin, count: 3224)      = 3224
> >      0.239 ( 0.001 ms): cat/291442 read(fd: 3, buf: root:x:0:0:Super User:/root:/bin, count: 131072)     = 0
> > root@number:~#
> >
> > So we can't really do it at this point, we have to do it, for now, by
> > doing it on that syscall table initialization, for instance, for the
> > 'write' syscall:
> >
> > diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> > index 5f0877e891c2047d..1bcb45e737d830bf 100644
> > --- a/tools/perf/builtin-trace.c
> > +++ b/tools/perf/builtin-trace.c
> > @@ -1379,6 +1379,8 @@ static const struct syscall_fmt syscall_fmts[] = {
> >         .arg = { [2] = { .scnprintf = SCA_WAITID_OPTIONS, /* options */ }, }, },
> >       { .name     = "waitid",     .errpid = true,
> >         .arg = { [3] = { .scnprintf = SCA_WAITID_OPTIONS, /* options */ }, }, },
> > +     { .name     = "write",      .errpid = true,
> > +       .arg = { [1] = { .scnprintf = SCA_BUF, /* buf */ }, }, },
> >  };
> >
> >  static int syscall_fmt__cmp(const void *name, const void *fmtp)
> > @@ -1987,8 +1989,6 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
> >                   ((len >= 4 && strcmp(field->name + len - 4, "name") == 0) ||
> >                    strstr(field->name, "path") != NULL))
> >                       arg->scnprintf = SCA_FILENAME;
> > -             else if (strstr(field->type, "char *") && strstr(field->name, "buf"))
> > -                     arg->scnprintf = SCA_BUF;
> >               else if ((field->flags & TEP_FIELD_IS_POINTER) || strstr(field->name, "addr"))
> >                       arg->scnprintf = SCA_PTR;
> >               else if (strcmp(field->type, "pid_t") == 0)
> >
> > With that we get:
> >
> > root@number:~# perf trace -e read,write cat /etc/passwd > /dev/null
> >      0.000 ( 0.005 ms): cat/296870 read(fd: 3, buf: 0x7ffe9cb8df98, count: 832)                          = 832
> >      0.268 ( 0.004 ms): cat/296870 read(fd: 3, buf: 0x7fa7d700a000, count: 131072)                       = 3224
> >      0.273 ( 0.002 ms): cat/296870 write(fd: 1, buf: root:x:0:0:Super User:/root:/bin, count: 3224)      =
> >      0.276 ( 0.001 ms): cat/296870 read(fd: 3, buf: 0x7fa7d700a000, count: 131072)                       = 0
> > root@number:~#
> >
> > After the following patch is applied.
> >
> > - Arnaldo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 5/8] perf trace: Pretty print buffer data
  2024-09-09 17:17       ` Howard Chu
@ 2024-09-09 19:19         ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-09-09 19:19 UTC (permalink / raw)
  To: Howard Chu
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

On Mon, Sep 09, 2024 at 10:17:34AM -0700, Howard Chu wrote:
> On Mon, Sep 9, 2024 at 9:45 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> >
> > On Mon, Sep 09, 2024 at 01:33:17PM -0300, Arnaldo Carvalho de Melo wrote:
> > > >  static bool trace__filter_duration(struct trace *trace, double t)
> > > >  {
> > > >     return t < (trace->duration_filter * NSEC_PER_MSEC);
> > > > @@ -1956,6 +1987,8 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
> > > >                 ((len >= 4 && strcmp(field->name + len - 4, "name") == 0) ||
> > > >                  strstr(field->name, "path") != NULL))
> > > >                     arg->scnprintf = SCA_FILENAME;
> > > > +           else if (strstr(field->type, "char *") && strstr(field->name, "buf"))
> > > > +                   arg->scnprintf = SCA_BUF;
> > >
> > > You can't really do this for things like 'read' as we would be printing
> > > whatever is in the buffer when we enter the syscall, right? As we can
> > > see testing after applying the following patch:
> >
> > This is also valid for the struct dumper, where I'll have to add some
> > indication in the syscall_fmt table when the pointer should be read in
> > the BPF augmenter, and thus we shouldn't bother to get it in the
> > sys_enter if it is, say, fstat().
> 
> Yes you are right, problems with read buffers happen in structs too.
> I'll get my read family patch out soon to ease your pain...

And then we have things like prctl, arch_prctl where it depends on the
command to see if the arg is to go from user to kernel or the other way
around :-)

But lets leave that for later...
 
> Thanks,
> Howard
> >
> > - Arnaldo
> >
> > > root@number:~# perf trace -e read,write cat /etc/passwd > /dev/null
> > >      0.000 ( 0.004 ms): cat/291442 read(fd: 3, buf: \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0, count: 832) = 832
> > >      0.231 ( 0.004 ms): cat/291442 read(fd: 3, buf: , count: 131072)                                     = 3224
> > >      0.236 ( 0.001 ms): cat/291442 write(fd: 1, buf: root:x:0:0:Super User:/root:/bin, count: 3224)      = 3224
> > >      0.239 ( 0.001 ms): cat/291442 read(fd: 3, buf: root:x:0:0:Super User:/root:/bin, count: 131072)     = 0
> > > root@number:~#
> > >
> > > So we can't really do it at this point, we have to do it, for now, by
> > > doing it on that syscall table initialization, for instance, for the
> > > 'write' syscall:
> > >
> > > diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> > > index 5f0877e891c2047d..1bcb45e737d830bf 100644
> > > --- a/tools/perf/builtin-trace.c
> > > +++ b/tools/perf/builtin-trace.c
> > > @@ -1379,6 +1379,8 @@ static const struct syscall_fmt syscall_fmts[] = {
> > >         .arg = { [2] = { .scnprintf = SCA_WAITID_OPTIONS, /* options */ }, }, },
> > >       { .name     = "waitid",     .errpid = true,
> > >         .arg = { [3] = { .scnprintf = SCA_WAITID_OPTIONS, /* options */ }, }, },
> > > +     { .name     = "write",      .errpid = true,
> > > +       .arg = { [1] = { .scnprintf = SCA_BUF, /* buf */ }, }, },
> > >  };
> > >
> > >  static int syscall_fmt__cmp(const void *name, const void *fmtp)
> > > @@ -1987,8 +1989,6 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
> > >                   ((len >= 4 && strcmp(field->name + len - 4, "name") == 0) ||
> > >                    strstr(field->name, "path") != NULL))
> > >                       arg->scnprintf = SCA_FILENAME;
> > > -             else if (strstr(field->type, "char *") && strstr(field->name, "buf"))
> > > -                     arg->scnprintf = SCA_BUF;
> > >               else if ((field->flags & TEP_FIELD_IS_POINTER) || strstr(field->name, "addr"))
> > >                       arg->scnprintf = SCA_PTR;
> > >               else if (strcmp(field->type, "pid_t") == 0)
> > >
> > > With that we get:
> > >
> > > root@number:~# perf trace -e read,write cat /etc/passwd > /dev/null
> > >      0.000 ( 0.005 ms): cat/296870 read(fd: 3, buf: 0x7ffe9cb8df98, count: 832)                          = 832
> > >      0.268 ( 0.004 ms): cat/296870 read(fd: 3, buf: 0x7fa7d700a000, count: 131072)                       = 3224
> > >      0.273 ( 0.002 ms): cat/296870 write(fd: 1, buf: root:x:0:0:Super User:/root:/bin, count: 3224)      =
> > >      0.276 ( 0.001 ms): cat/296870 read(fd: 3, buf: 0x7fa7d700a000, count: 131072)                       = 0
> > > root@number:~#
> > >
> > > After the following patch is applied.
> > >
> > > - Arnaldo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/8] perf trace: Add trace__bpf_sys_enter_beauty_map() to prepare for fetching data in BPF
  2024-08-24 16:33 ` [PATCH v3 2/8] perf trace: Add trace__bpf_sys_enter_beauty_map() to prepare for fetching data in BPF Howard Chu
@ 2024-09-09 19:45   ` Arnaldo Carvalho de Melo
  2024-09-09 20:14     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-09-09 19:45 UTC (permalink / raw)
  To: Howard Chu
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

On Sun, Aug 25, 2024 at 12:33:16AM +0800, Howard Chu wrote:
> Set up beauty_map, load it to BPF, in such format: if argument No.3 is a
> struct of size 32 bytes (of syscall number 114) beauty_map[114][2] = 32;
> 
> if argument No.3 is a string (of syscall number 114) beauty_map[114][2] =
> 1;
> 
> if argument No.3 is a buffer, its size is indicated by argument No.4 (of
> syscall number 114) beauty_map[114][2] = -4; /* -1 ~ -6, we'll read this
> buffer size in BPF  */
> 
> Committer notes:
> 
> Moved syscall_arg_fmt__cache_btf_struct() from a ifdef
> HAVE_LIBBPF_SUPPORT to closer to where it is used, that is ifdef'ed on
> HAVE_BPF_SKEL and thus breaks the build when building with
> BUILD_BPF_SKEL=0, as detected using 'make -C tools/perf build-test'.

Here we have to have this:

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index ba7b1338123dc5f1..588305b26a064edf 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -3572,6 +3572,10 @@ static int trace__bpf_sys_enter_beauty_map(struct trace *trace, int key, unsigne
        for (i = 0, field = sc->args; field; ++i, field = field->next) {
                struct_offset = strstr(field->type, "struct ");
 
+               // XXX We're only collecting pointer payloads _from_ user space
+               if (!sc->arg_fmt[i].from_user)
+                       continue;
+
                if (field->flags & TEP_FIELD_IS_POINTER && struct_offset) { /* struct */
                        struct_offset += 7;

I added some patches before this one that add that .from_user field and
marks the syscall args that flow from user to kernel space, we'll
probably need to tune that a bit, but then its "better" not to collect
old contents of buffers in syscalls that flows from kernel to userspace
than to lose the opportunity to collect data flowing from user to kernel
space.

I.e. its better to not show something useful than to show something
misleading/utterly bogus.

- Arnaldo

> Signed-off-by: Howard Chu <howardchu95@gmail.com>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Ian Rogers <irogers@google.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kan Liang <kan.liang@linux.intel.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Link: https://lore.kernel.org/r/20240815013626.935097-4-howardchu95@gmail.com
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> ---
>  tools/perf/builtin-trace.c | 106 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 106 insertions(+)
> 
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index d6ca541fdc78..c26eab196623 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -113,6 +113,7 @@ struct syscall_arg_fmt {
>  	bool	   show_zero;
>  #ifdef HAVE_LIBBPF_SUPPORT
>  	const struct btf_type *type;
> +	int	   type_id; /* used in btf_dump */
>  #endif
>  };
>  
> @@ -3446,6 +3447,23 @@ static int trace__set_ev_qualifier_tp_filter(struct trace *trace)
>  }
>  
>  #ifdef HAVE_BPF_SKEL
> +static int syscall_arg_fmt__cache_btf_struct(struct syscall_arg_fmt *arg_fmt, struct btf *btf, char *type)
> +{
> +       int id;
> +
> +	if (arg_fmt->type != NULL)
> +		return -1;
> +
> +       id = btf__find_by_name(btf, type);
> +       if (id < 0)
> +		return -1;
> +
> +       arg_fmt->type    = btf__type_by_id(btf, id);
> +       arg_fmt->type_id = id;
> +
> +       return 0;
> +}
> +
>  static struct bpf_program *trace__find_bpf_program_by_title(struct trace *trace, const char *name)
>  {
>  	struct bpf_program *pos, *prog = NULL;
> @@ -3521,6 +3539,83 @@ static int trace__bpf_prog_sys_exit_fd(struct trace *trace, int id)
>  	return sc ? bpf_program__fd(sc->bpf_prog.sys_exit) : bpf_program__fd(trace->skel->progs.syscall_unaugmented);
>  }
>  
> +static int trace__bpf_sys_enter_beauty_map(struct trace *trace, int key, unsigned int *beauty_array)
> +{
> +	struct tep_format_field *field;
> +	struct syscall *sc = trace__syscall_info(trace, NULL, key);
> +	const struct btf_type *bt;
> +	char *struct_offset, *tmp, name[32];
> +	bool can_augment = false;
> +	int i, cnt;
> +
> +	if (sc == NULL)
> +		return -1;
> +
> +	trace__load_vmlinux_btf(trace);
> +	if (trace->btf == NULL)
> +		return -1;
> +
> +	for (i = 0, field = sc->args; field; ++i, field = field->next) {
> +		struct_offset = strstr(field->type, "struct ");
> +
> +		if (field->flags & TEP_FIELD_IS_POINTER && struct_offset) { /* struct */
> +			struct_offset += 7;
> +
> +			/* for 'struct foo *', we only want 'foo' */
> +			for (tmp = struct_offset, cnt = 0; *tmp != ' ' && *tmp != '\0'; ++tmp, ++cnt) {
> +			}
> +
> +			strncpy(name, struct_offset, cnt);
> +			name[cnt] = '\0';
> +
> +			/* cache struct's btf_type and type_id */
> +			if (syscall_arg_fmt__cache_btf_struct(&sc->arg_fmt[i], trace->btf, name))
> +				continue;
> +
> +			bt = sc->arg_fmt[i].type;
> +			beauty_array[i] = bt->size;
> +			can_augment = true;
> +		} else if (field->flags & TEP_FIELD_IS_POINTER && /* string */
> +		    strcmp(field->type, "const char *") == 0 &&
> +		    (strstr(field->name, "name") ||
> +		     strstr(field->name, "path") ||
> +		     strstr(field->name, "file") ||
> +		     strstr(field->name, "root") ||
> +		     strstr(field->name, "key") ||
> +		     strstr(field->name, "special") ||
> +		     strstr(field->name, "type") ||
> +		     strstr(field->name, "description"))) {
> +			beauty_array[i] = 1;
> +			can_augment = true;
> +		} else if (field->flags & TEP_FIELD_IS_POINTER && /* buffer */
> +			   strstr(field->type, "char *") &&
> +			   (strstr(field->name, "buf") ||
> +			    strstr(field->name, "val") ||
> +			    strstr(field->name, "msg"))) {
> +			int j;
> +			struct tep_format_field *field_tmp;
> +
> +			/* find the size of the buffer that appears in pairs with buf */
> +			for (j = 0, field_tmp = sc->args; field_tmp; ++j, field_tmp = field_tmp->next) {
> +				if (!(field_tmp->flags & TEP_FIELD_IS_POINTER) && /* only integers */
> +				    (strstr(field_tmp->name, "count") ||
> +				     strstr(field_tmp->name, "siz") ||  /* size, bufsiz */
> +				     (strstr(field_tmp->name, "len") && strcmp(field_tmp->name, "filename")))) {
> +					 /* filename's got 'len' in it, we don't want that */
> +					beauty_array[i] = -(j + 1);
> +					can_augment = true;
> +					break;
> +				}
> +			}
> +		}
> +	}
> +
> +	if (can_augment)
> +		return 0;
> +
> +	return -1;
> +}
> +
>  static struct bpf_program *trace__find_usable_bpf_prog_entry(struct trace *trace, struct syscall *sc)
>  {
>  	struct tep_format_field *field, *candidate_field;
> @@ -3625,7 +3720,9 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
>  {
>  	int map_enter_fd = bpf_map__fd(trace->skel->maps.syscalls_sys_enter);
>  	int map_exit_fd  = bpf_map__fd(trace->skel->maps.syscalls_sys_exit);
> +	int beauty_map_fd = bpf_map__fd(trace->skel->maps.beauty_map_enter);
>  	int err = 0;
> +	unsigned int beauty_array[6];
>  
>  	for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
>  		int prog_fd, key = syscalltbl__id_at_idx(trace->sctbl, i);
> @@ -3644,6 +3741,15 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
>  		err = bpf_map_update_elem(map_exit_fd, &key, &prog_fd, BPF_ANY);
>  		if (err)
>  			break;
> +
> +		/* use beauty_map to tell BPF how many bytes to collect, set beauty_map's value here */
> +		memset(beauty_array, 0, sizeof(beauty_array));
> +		err = trace__bpf_sys_enter_beauty_map(trace, key, (unsigned int *)beauty_array);
> +		if (err)
> +			continue;
> +		err = bpf_map_update_elem(beauty_map_fd, &key, beauty_array, BPF_ANY);
> +		if (err)
> +			break;
>  	}
>  
>  	/*
> -- 
> 2.45.2

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/8] perf trace: Add trace__bpf_sys_enter_beauty_map() to prepare for fetching data in BPF
  2024-09-09 19:45   ` Arnaldo Carvalho de Melo
@ 2024-09-09 20:14     ` Arnaldo Carvalho de Melo
  2024-09-10  4:59       ` Howard Chu
  0 siblings, 1 reply; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-09-09 20:14 UTC (permalink / raw)
  To: Howard Chu
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

On Mon, Sep 09, 2024 at 04:45:47PM -0300, Arnaldo Carvalho de Melo wrote:
> On Sun, Aug 25, 2024 at 12:33:16AM +0800, Howard Chu wrote:
> > Set up beauty_map, load it to BPF, in such format: if argument No.3 is a
> > struct of size 32 bytes (of syscall number 114) beauty_map[114][2] = 32;
> > 
> > if argument No.3 is a string (of syscall number 114) beauty_map[114][2] =
> > 1;
> > 
> > if argument No.3 is a buffer, its size is indicated by argument No.4 (of
> > syscall number 114) beauty_map[114][2] = -4; /* -1 ~ -6, we'll read this
> > buffer size in BPF  */
> > 
> > Committer notes:
> > 
> > Moved syscall_arg_fmt__cache_btf_struct() from a ifdef
> > HAVE_LIBBPF_SUPPORT to closer to where it is used, that is ifdef'ed on
> > HAVE_BPF_SKEL and thus breaks the build when building with
> > BUILD_BPF_SKEL=0, as detected using 'make -C tools/perf build-test'.
> 
> Here we have to have this:

My pahole hat was not working, we have this:

SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
                size_t, count)
{
        return ksys_write(fd, buf, count);
}

ssize_t ksys_pread64(unsigned int fd, char __user *buf, size_t count,
                     loff_t pos)


See that __user?

That should be available in BTF as a tag and then we can use it... For
now I'll continue adding a flag to the syscall_arg_fmt, as I have that
much time before flying to Europe, but that is an avenue we _have_ to
investigate and use that info.

It will not cover all cases, and I think we'll even have to improve BTF
in that regard with some sort of __user_maybe thing for args that get
thing from userspace according to some other argument (you was overly
clever on the heuristic to find the size of buffers, more on that at
some point, but I'm running out of time).

- Arnaldo
 
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index ba7b1338123dc5f1..588305b26a064edf 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -3572,6 +3572,10 @@ static int trace__bpf_sys_enter_beauty_map(struct trace *trace, int key, unsigne
>         for (i = 0, field = sc->args; field; ++i, field = field->next) {
>                 struct_offset = strstr(field->type, "struct ");
>  
> +               // XXX We're only collecting pointer payloads _from_ user space
> +               if (!sc->arg_fmt[i].from_user)
> +                       continue;
> +
>                 if (field->flags & TEP_FIELD_IS_POINTER && struct_offset) { /* struct */
>                         struct_offset += 7;
> 
> I added some patches before this one that add that .from_user field and
> marks the syscall args that flow from user to kernel space, we'll
> probably need to tune that a bit, but then its "better" not to collect
> old contents of buffers in syscalls that flows from kernel to userspace
> than to lose the opportunity to collect data flowing from user to kernel
> space.
> 
> I.e. its better to not show something useful than to show something
> misleading/utterly bogus.
> 
> - Arnaldo
> 
> > Signed-off-by: Howard Chu <howardchu95@gmail.com>
> > Cc: Adrian Hunter <adrian.hunter@intel.com>
> > Cc: Ian Rogers <irogers@google.com>
> > Cc: Jiri Olsa <jolsa@kernel.org>
> > Cc: Kan Liang <kan.liang@linux.intel.com>
> > Cc: Namhyung Kim <namhyung@kernel.org>
> > Link: https://lore.kernel.org/r/20240815013626.935097-4-howardchu95@gmail.com
> > Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > ---
> >  tools/perf/builtin-trace.c | 106 +++++++++++++++++++++++++++++++++++++
> >  1 file changed, 106 insertions(+)
> > 
> > diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> > index d6ca541fdc78..c26eab196623 100644
> > --- a/tools/perf/builtin-trace.c
> > +++ b/tools/perf/builtin-trace.c
> > @@ -113,6 +113,7 @@ struct syscall_arg_fmt {
> >  	bool	   show_zero;
> >  #ifdef HAVE_LIBBPF_SUPPORT
> >  	const struct btf_type *type;
> > +	int	   type_id; /* used in btf_dump */
> >  #endif
> >  };
> >  
> > @@ -3446,6 +3447,23 @@ static int trace__set_ev_qualifier_tp_filter(struct trace *trace)
> >  }
> >  
> >  #ifdef HAVE_BPF_SKEL
> > +static int syscall_arg_fmt__cache_btf_struct(struct syscall_arg_fmt *arg_fmt, struct btf *btf, char *type)
> > +{
> > +       int id;
> > +
> > +	if (arg_fmt->type != NULL)
> > +		return -1;
> > +
> > +       id = btf__find_by_name(btf, type);
> > +       if (id < 0)
> > +		return -1;
> > +
> > +       arg_fmt->type    = btf__type_by_id(btf, id);
> > +       arg_fmt->type_id = id;
> > +
> > +       return 0;
> > +}
> > +
> >  static struct bpf_program *trace__find_bpf_program_by_title(struct trace *trace, const char *name)
> >  {
> >  	struct bpf_program *pos, *prog = NULL;
> > @@ -3521,6 +3539,83 @@ static int trace__bpf_prog_sys_exit_fd(struct trace *trace, int id)
> >  	return sc ? bpf_program__fd(sc->bpf_prog.sys_exit) : bpf_program__fd(trace->skel->progs.syscall_unaugmented);
> >  }
> >  
> > +static int trace__bpf_sys_enter_beauty_map(struct trace *trace, int key, unsigned int *beauty_array)
> > +{
> > +	struct tep_format_field *field;
> > +	struct syscall *sc = trace__syscall_info(trace, NULL, key);
> > +	const struct btf_type *bt;
> > +	char *struct_offset, *tmp, name[32];
> > +	bool can_augment = false;
> > +	int i, cnt;
> > +
> > +	if (sc == NULL)
> > +		return -1;
> > +
> > +	trace__load_vmlinux_btf(trace);
> > +	if (trace->btf == NULL)
> > +		return -1;
> > +
> > +	for (i = 0, field = sc->args; field; ++i, field = field->next) {
> > +		struct_offset = strstr(field->type, "struct ");
> > +
> > +		if (field->flags & TEP_FIELD_IS_POINTER && struct_offset) { /* struct */
> > +			struct_offset += 7;
> > +
> > +			/* for 'struct foo *', we only want 'foo' */
> > +			for (tmp = struct_offset, cnt = 0; *tmp != ' ' && *tmp != '\0'; ++tmp, ++cnt) {
> > +			}
> > +
> > +			strncpy(name, struct_offset, cnt);
> > +			name[cnt] = '\0';
> > +
> > +			/* cache struct's btf_type and type_id */
> > +			if (syscall_arg_fmt__cache_btf_struct(&sc->arg_fmt[i], trace->btf, name))
> > +				continue;
> > +
> > +			bt = sc->arg_fmt[i].type;
> > +			beauty_array[i] = bt->size;
> > +			can_augment = true;
> > +		} else if (field->flags & TEP_FIELD_IS_POINTER && /* string */
> > +		    strcmp(field->type, "const char *") == 0 &&
> > +		    (strstr(field->name, "name") ||
> > +		     strstr(field->name, "path") ||
> > +		     strstr(field->name, "file") ||
> > +		     strstr(field->name, "root") ||
> > +		     strstr(field->name, "key") ||
> > +		     strstr(field->name, "special") ||
> > +		     strstr(field->name, "type") ||
> > +		     strstr(field->name, "description"))) {
> > +			beauty_array[i] = 1;
> > +			can_augment = true;
> > +		} else if (field->flags & TEP_FIELD_IS_POINTER && /* buffer */
> > +			   strstr(field->type, "char *") &&
> > +			   (strstr(field->name, "buf") ||
> > +			    strstr(field->name, "val") ||
> > +			    strstr(field->name, "msg"))) {
> > +			int j;
> > +			struct tep_format_field *field_tmp;
> > +
> > +			/* find the size of the buffer that appears in pairs with buf */
> > +			for (j = 0, field_tmp = sc->args; field_tmp; ++j, field_tmp = field_tmp->next) {
> > +				if (!(field_tmp->flags & TEP_FIELD_IS_POINTER) && /* only integers */
> > +				    (strstr(field_tmp->name, "count") ||
> > +				     strstr(field_tmp->name, "siz") ||  /* size, bufsiz */
> > +				     (strstr(field_tmp->name, "len") && strcmp(field_tmp->name, "filename")))) {
> > +					 /* filename's got 'len' in it, we don't want that */
> > +					beauty_array[i] = -(j + 1);
> > +					can_augment = true;
> > +					break;
> > +				}
> > +			}
> > +		}
> > +	}
> > +
> > +	if (can_augment)
> > +		return 0;
> > +
> > +	return -1;
> > +}
> > +
> >  static struct bpf_program *trace__find_usable_bpf_prog_entry(struct trace *trace, struct syscall *sc)
> >  {
> >  	struct tep_format_field *field, *candidate_field;
> > @@ -3625,7 +3720,9 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
> >  {
> >  	int map_enter_fd = bpf_map__fd(trace->skel->maps.syscalls_sys_enter);
> >  	int map_exit_fd  = bpf_map__fd(trace->skel->maps.syscalls_sys_exit);
> > +	int beauty_map_fd = bpf_map__fd(trace->skel->maps.beauty_map_enter);
> >  	int err = 0;
> > +	unsigned int beauty_array[6];
> >  
> >  	for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
> >  		int prog_fd, key = syscalltbl__id_at_idx(trace->sctbl, i);
> > @@ -3644,6 +3741,15 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
> >  		err = bpf_map_update_elem(map_exit_fd, &key, &prog_fd, BPF_ANY);
> >  		if (err)
> >  			break;
> > +
> > +		/* use beauty_map to tell BPF how many bytes to collect, set beauty_map's value here */
> > +		memset(beauty_array, 0, sizeof(beauty_array));
> > +		err = trace__bpf_sys_enter_beauty_map(trace, key, (unsigned int *)beauty_array);
> > +		if (err)
> > +			continue;
> > +		err = bpf_map_update_elem(beauty_map_fd, &key, beauty_array, BPF_ANY);
> > +		if (err)
> > +			break;
> >  	}
> >  
> >  	/*
> > -- 
> > 2.45.2

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 8/8] perf trace: Add general tests for augmented syscalls
  2024-08-24 16:33 ` [PATCH v3 8/8] perf trace: Add general tests for augmented syscalls Howard Chu
@ 2024-09-09 22:11   ` Arnaldo Carvalho de Melo
  2024-09-10  5:00     ` Howard Chu
  0 siblings, 1 reply; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-09-09 22:11 UTC (permalink / raw)
  To: Howard Chu
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel

On Sun, Aug 25, 2024 at 12:33:22AM +0800, Howard Chu wrote:
> In this test, augmentation for:
> 
> * string
> * buffer
> * struct
> 
> Is tested.

Please install ShellCheck to test it:

  GEN     perf-archive
  GEN     perf-iostat
  TEST    /tmp/build/perf-tools-next/tests/shell/trace_btf_general.sh.shellcheck_log
  CC      /tmp/build/perf-tools-next/util/header.o

In tests/shell/trace_btf_general.sh line 50:
	echo "Unexpected signal in ${FUNCNAME[1]}"
                                   ^------------^ SC3028 (warning): In POSIX sh, FUNCNAME is undefined.
                                   ^------------^ SC3054 (warning): In POSIX sh, array references are undefined.

For more information:
  https://www.shellcheck.net/wiki/SC3028 -- In POSIX sh, FUNCNAME is undefined.
  https://www.shellcheck.net/wiki/SC3054 -- In POSIX sh, array references are...
make[4]: *** [tests/Build:91: /tmp/build/perf-tools-next/tests/shell/trace_btf_general.sh.shellcheck_log] Error 1
make[3]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:158: tests] Error 2
make[2]: *** [Makefile.perf:777: /tmp/build/perf-tools-next/perf-test-in.o] Error 2
make[2]: *** Waiting for unfinished jobs....
  LD      /tmp/build/perf-tools-next/util/perf-util-in.o
  LD      /tmp/build/perf-tools-next/perf-util-in.o
make[1]: *** [Makefile.perf:292: sub-make] Error 2
make: *** [Makefile:119: install-bin] Error 2
make: Leaving directory '/home/acme/git/perf-tools-next/tools/perf'
⬢[acme@toolbox perf-tools-next]$
 
> Please use this command:
> 
> perf test "perf trace general tests"
> 
> Signed-off-by: Howard Chu <howardchu95@gmail.com>
> Suggested-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/tests/shell/trace_btf_general.sh | 67 +++++++++++++++++++++
>  1 file changed, 67 insertions(+)
>  create mode 100755 tools/perf/tests/shell/trace_btf_general.sh
> 
> diff --git a/tools/perf/tests/shell/trace_btf_general.sh b/tools/perf/tests/shell/trace_btf_general.sh
> new file mode 100755
> index 000000000000..0b4ea8462390
> --- /dev/null
> +++ b/tools/perf/tests/shell/trace_btf_general.sh
> @@ -0,0 +1,67 @@
> +#!/bin/sh
> +# perf trace general tests
> +# SPDX-License-Identifier: GPL-2.0
> +
> +err=0
> +set -e
> +
> +. "$(dirname $0)"/lib/probe.sh
> +skip_if_no_perf_trace || exit 2
> +
> +file1=$(mktemp /tmp/file1_XXXXX)
> +file2=$(echo $file1 | sed 's/file1/file2/g')
> +
> +buffer="this is a buffer for testing"
> +
> +trap cleanup EXIT TERM INT HUP
> +
> +trace_test_string() {
> +  echo "Testing perf trace's string augmentation"
> +  if ! perf trace -e renameat* --max-events=1 -- mv ${file1} ${file2} 2>&1 | grep -q -E "renameat[2]*.*oldname: \"${file1}\".*newname: \"${file2}\".*"
> +  then
> +    echo "String augmentation test failed"
> +    err=1
> +  fi
> +}
> +
> +trace_test_buffer() {
> +  echo "Testing perf trace's buffer augmentation"
> +  if ! perf trace -e write --max-events=1 -- echo "${buffer}" 2>&1 | grep -q -E ".*write.*buf: ${buffer}.*"
> +  then
> +    echo "Buffer augmentation test failed"
> +    err=1
> +  fi
> +}
> +
> +trace_test_struct_btf() {
> +  echo "Testing perf trace's struct augmentation"
> +  if ! perf trace -e clock_nanosleep --force-btf --max-events=1 -- sleep 1 2>&1 | grep -q -E ".*clock_nanosleep\(rqtp: \{1,\}, rmtp: \{1,\}\).* = 0"
> +  then
> +    echo "BTF struct augmentation test failed"
> +    err=1
> +  fi
> +}
> +
> +cleanup() {
> +	rm -rf ${file1} ${file2}
> +}
> +
> +trap_cleanup() {
> +	echo "Unexpected signal in ${FUNCNAME[1]}"
> +	cleanup
> +	exit 1
> +}
> +
> +trace_test_string
> +
> +if [ $err = 0 ]; then
> +  trace_test_buffer
> +fi
> +
> +if [ $err = 0 ]; then
> +  trace_test_struct_btf
> +fi
> +
> +cleanup
> +
> +exit $err
> -- 
> 2.45.2

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/8] perf trace: Add trace__bpf_sys_enter_beauty_map() to prepare for fetching data in BPF
  2024-09-09 20:14     ` Arnaldo Carvalho de Melo
@ 2024-09-10  4:59       ` Howard Chu
  0 siblings, 0 replies; 27+ messages in thread
From: Howard Chu @ 2024-09-10  4:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

Hello Arnaldo,

On Mon, Sep 9, 2024 at 1:14 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> On Mon, Sep 09, 2024 at 04:45:47PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Sun, Aug 25, 2024 at 12:33:16AM +0800, Howard Chu wrote:
> > > Set up beauty_map, load it to BPF, in such format: if argument No.3 is a
> > > struct of size 32 bytes (of syscall number 114) beauty_map[114][2] = 32;
> > >
> > > if argument No.3 is a string (of syscall number 114) beauty_map[114][2] =
> > > 1;
> > >
> > > if argument No.3 is a buffer, its size is indicated by argument No.4 (of
> > > syscall number 114) beauty_map[114][2] = -4; /* -1 ~ -6, we'll read this
> > > buffer size in BPF  */
> > >
> > > Committer notes:
> > >
> > > Moved syscall_arg_fmt__cache_btf_struct() from a ifdef
> > > HAVE_LIBBPF_SUPPORT to closer to where it is used, that is ifdef'ed on
> > > HAVE_BPF_SKEL and thus breaks the build when building with
> > > BUILD_BPF_SKEL=0, as detected using 'make -C tools/perf build-test'.
> >
> > Here we have to have this:
>
> My pahole hat was not working, we have this:
>
> SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
>                 size_t, count)
> {
>         return ksys_write(fd, buf, count);
> }
>
> ssize_t ksys_pread64(unsigned int fd, char __user *buf, size_t count,
>                      loff_t pos)
>
>
> See that __user?

Yeah __user is a good tag. Can you please tell me what you mean here?
If I'm understanding correctly, you mean __user is a tag that
indicates the flow of buffer data is from user to kernel. But wouldn't
'const char' and 'char' work? (for read it's char, and write it's
const char) Okay, maybe you are referring to prctl syscall then we
really don't know where the data flows...

>
> That should be available in BTF as a tag and then we can use it... For
> now I'll continue adding a flag to the syscall_arg_fmt, as I have that
> much time before flying to Europe, but that is an avenue we _have_ to
> investigate and use that info.
>
> It will not cover all cases, and I think we'll even have to improve BTF
> in that regard with some sort of __user_maybe thing for args that get
> thing from userspace according to some other argument (you was overly
> clever on the heuristic to find the size of buffers, more on that at
> some point, but I'm running out of time).
>
> - Arnaldo
>
> > diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> > index ba7b1338123dc5f1..588305b26a064edf 100644
> > --- a/tools/perf/builtin-trace.c
> > +++ b/tools/perf/builtin-trace.c
> > @@ -3572,6 +3572,10 @@ static int trace__bpf_sys_enter_beauty_map(struct trace *trace, int key, unsigne
> >         for (i = 0, field = sc->args; field; ++i, field = field->next) {
> >                 struct_offset = strstr(field->type, "struct ");
> >
> > +               // XXX We're only collecting pointer payloads _from_ user space
> > +               if (!sc->arg_fmt[i].from_user)
> > +                       continue;
> > +
> >                 if (field->flags & TEP_FIELD_IS_POINTER && struct_offset) { /* struct */
> >                         struct_offset += 7;
> >
> > I added some patches before this one that add that .from_user field and
> > marks the syscall args that flow from user to kernel space, we'll
> > probably need to tune that a bit, but then its "better" not to collect
> > old contents of buffers in syscalls that flows from kernel to userspace
> > than to lose the opportunity to collect data flowing from user to kernel
> > space.
> >
> > I.e. its better to not show something useful than to show something
> > misleading/utterly bogus.

I agree. Less is better than useless.

Thanks,
Howard
> >
> > - Arnaldo
> >
> > > Signed-off-by: Howard Chu <howardchu95@gmail.com>
> > > Cc: Adrian Hunter <adrian.hunter@intel.com>
> > > Cc: Ian Rogers <irogers@google.com>
> > > Cc: Jiri Olsa <jolsa@kernel.org>
> > > Cc: Kan Liang <kan.liang@linux.intel.com>
> > > Cc: Namhyung Kim <namhyung@kernel.org>
> > > Link: https://lore.kernel.org/r/20240815013626.935097-4-howardchu95@gmail.com
> > > Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > ---
> > >  tools/perf/builtin-trace.c | 106 +++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 106 insertions(+)
> > >
> > > diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> > > index d6ca541fdc78..c26eab196623 100644
> > > --- a/tools/perf/builtin-trace.c
> > > +++ b/tools/perf/builtin-trace.c
> > > @@ -113,6 +113,7 @@ struct syscall_arg_fmt {
> > >     bool       show_zero;
> > >  #ifdef HAVE_LIBBPF_SUPPORT
> > >     const struct btf_type *type;
> > > +   int        type_id; /* used in btf_dump */
> > >  #endif
> > >  };
> > >
> > > @@ -3446,6 +3447,23 @@ static int trace__set_ev_qualifier_tp_filter(struct trace *trace)
> > >  }
> > >
> > >  #ifdef HAVE_BPF_SKEL
> > > +static int syscall_arg_fmt__cache_btf_struct(struct syscall_arg_fmt *arg_fmt, struct btf *btf, char *type)
> > > +{
> > > +       int id;
> > > +
> > > +   if (arg_fmt->type != NULL)
> > > +           return -1;
> > > +
> > > +       id = btf__find_by_name(btf, type);
> > > +       if (id < 0)
> > > +           return -1;
> > > +
> > > +       arg_fmt->type    = btf__type_by_id(btf, id);
> > > +       arg_fmt->type_id = id;
> > > +
> > > +       return 0;
> > > +}
> > > +
> > >  static struct bpf_program *trace__find_bpf_program_by_title(struct trace *trace, const char *name)
> > >  {
> > >     struct bpf_program *pos, *prog = NULL;
> > > @@ -3521,6 +3539,83 @@ static int trace__bpf_prog_sys_exit_fd(struct trace *trace, int id)
> > >     return sc ? bpf_program__fd(sc->bpf_prog.sys_exit) : bpf_program__fd(trace->skel->progs.syscall_unaugmented);
> > >  }
> > >
> > > +static int trace__bpf_sys_enter_beauty_map(struct trace *trace, int key, unsigned int *beauty_array)
> > > +{
> > > +   struct tep_format_field *field;
> > > +   struct syscall *sc = trace__syscall_info(trace, NULL, key);
> > > +   const struct btf_type *bt;
> > > +   char *struct_offset, *tmp, name[32];
> > > +   bool can_augment = false;
> > > +   int i, cnt;
> > > +
> > > +   if (sc == NULL)
> > > +           return -1;
> > > +
> > > +   trace__load_vmlinux_btf(trace);
> > > +   if (trace->btf == NULL)
> > > +           return -1;
> > > +
> > > +   for (i = 0, field = sc->args; field; ++i, field = field->next) {
> > > +           struct_offset = strstr(field->type, "struct ");
> > > +
> > > +           if (field->flags & TEP_FIELD_IS_POINTER && struct_offset) { /* struct */
> > > +                   struct_offset += 7;
> > > +
> > > +                   /* for 'struct foo *', we only want 'foo' */
> > > +                   for (tmp = struct_offset, cnt = 0; *tmp != ' ' && *tmp != '\0'; ++tmp, ++cnt) {
> > > +                   }
> > > +
> > > +                   strncpy(name, struct_offset, cnt);
> > > +                   name[cnt] = '\0';
> > > +
> > > +                   /* cache struct's btf_type and type_id */
> > > +                   if (syscall_arg_fmt__cache_btf_struct(&sc->arg_fmt[i], trace->btf, name))
> > > +                           continue;
> > > +
> > > +                   bt = sc->arg_fmt[i].type;
> > > +                   beauty_array[i] = bt->size;
> > > +                   can_augment = true;
> > > +           } else if (field->flags & TEP_FIELD_IS_POINTER && /* string */
> > > +               strcmp(field->type, "const char *") == 0 &&
> > > +               (strstr(field->name, "name") ||
> > > +                strstr(field->name, "path") ||
> > > +                strstr(field->name, "file") ||
> > > +                strstr(field->name, "root") ||
> > > +                strstr(field->name, "key") ||
> > > +                strstr(field->name, "special") ||
> > > +                strstr(field->name, "type") ||
> > > +                strstr(field->name, "description"))) {
> > > +                   beauty_array[i] = 1;
> > > +                   can_augment = true;
> > > +           } else if (field->flags & TEP_FIELD_IS_POINTER && /* buffer */
> > > +                      strstr(field->type, "char *") &&
> > > +                      (strstr(field->name, "buf") ||
> > > +                       strstr(field->name, "val") ||
> > > +                       strstr(field->name, "msg"))) {
> > > +                   int j;
> > > +                   struct tep_format_field *field_tmp;
> > > +
> > > +                   /* find the size of the buffer that appears in pairs with buf */
> > > +                   for (j = 0, field_tmp = sc->args; field_tmp; ++j, field_tmp = field_tmp->next) {
> > > +                           if (!(field_tmp->flags & TEP_FIELD_IS_POINTER) && /* only integers */
> > > +                               (strstr(field_tmp->name, "count") ||
> > > +                                strstr(field_tmp->name, "siz") ||  /* size, bufsiz */
> > > +                                (strstr(field_tmp->name, "len") && strcmp(field_tmp->name, "filename")))) {
> > > +                                    /* filename's got 'len' in it, we don't want that */
> > > +                                   beauty_array[i] = -(j + 1);
> > > +                                   can_augment = true;
> > > +                                   break;
> > > +                           }
> > > +                   }
> > > +           }
> > > +   }
> > > +
> > > +   if (can_augment)
> > > +           return 0;
> > > +
> > > +   return -1;
> > > +}
> > > +
> > >  static struct bpf_program *trace__find_usable_bpf_prog_entry(struct trace *trace, struct syscall *sc)
> > >  {
> > >     struct tep_format_field *field, *candidate_field;
> > > @@ -3625,7 +3720,9 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
> > >  {
> > >     int map_enter_fd = bpf_map__fd(trace->skel->maps.syscalls_sys_enter);
> > >     int map_exit_fd  = bpf_map__fd(trace->skel->maps.syscalls_sys_exit);
> > > +   int beauty_map_fd = bpf_map__fd(trace->skel->maps.beauty_map_enter);
> > >     int err = 0;
> > > +   unsigned int beauty_array[6];
> > >
> > >     for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
> > >             int prog_fd, key = syscalltbl__id_at_idx(trace->sctbl, i);
> > > @@ -3644,6 +3741,15 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
> > >             err = bpf_map_update_elem(map_exit_fd, &key, &prog_fd, BPF_ANY);
> > >             if (err)
> > >                     break;
> > > +
> > > +           /* use beauty_map to tell BPF how many bytes to collect, set beauty_map's value here */
> > > +           memset(beauty_array, 0, sizeof(beauty_array));
> > > +           err = trace__bpf_sys_enter_beauty_map(trace, key, (unsigned int *)beauty_array);
> > > +           if (err)
> > > +                   continue;
> > > +           err = bpf_map_update_elem(beauty_map_fd, &key, beauty_array, BPF_ANY);
> > > +           if (err)
> > > +                   break;
> > >     }
> > >
> > >     /*
> > > --
> > > 2.45.2

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 8/8] perf trace: Add general tests for augmented syscalls
  2024-09-09 22:11   ` Arnaldo Carvalho de Melo
@ 2024-09-10  5:00     ` Howard Chu
  0 siblings, 0 replies; 27+ messages in thread
From: Howard Chu @ 2024-09-10  5:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel

Hello,

On Mon, Sep 9, 2024 at 3:11 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> On Sun, Aug 25, 2024 at 12:33:22AM +0800, Howard Chu wrote:
> > In this test, augmentation for:
> >
> > * string
> > * buffer
> > * struct
> >
> > Is tested.
>
> Please install ShellCheck to test it:
>
>   GEN     perf-archive
>   GEN     perf-iostat
>   TEST    /tmp/build/perf-tools-next/tests/shell/trace_btf_general.sh.shellcheck_log
>   CC      /tmp/build/perf-tools-next/util/header.o
>
> In tests/shell/trace_btf_general.sh line 50:
>         echo "Unexpected signal in ${FUNCNAME[1]}"
>                                    ^------------^ SC3028 (warning): In POSIX sh, FUNCNAME is undefined.
>                                    ^------------^ SC3054 (warning): In POSIX sh, array references are undefined.

Sorry my bad, will fix it.

Thanks,
Howard
>
> For more information:
>   https://www.shellcheck.net/wiki/SC3028 -- In POSIX sh, FUNCNAME is undefined.
>   https://www.shellcheck.net/wiki/SC3054 -- In POSIX sh, array references are...
> make[4]: *** [tests/Build:91: /tmp/build/perf-tools-next/tests/shell/trace_btf_general.sh.shellcheck_log] Error 1
> make[3]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:158: tests] Error 2
> make[2]: *** [Makefile.perf:777: /tmp/build/perf-tools-next/perf-test-in.o] Error 2
> make[2]: *** Waiting for unfinished jobs....
>   LD      /tmp/build/perf-tools-next/util/perf-util-in.o
>   LD      /tmp/build/perf-tools-next/perf-util-in.o
> make[1]: *** [Makefile.perf:292: sub-make] Error 2
> make: *** [Makefile:119: install-bin] Error 2
> make: Leaving directory '/home/acme/git/perf-tools-next/tools/perf'
> ⬢[acme@toolbox perf-tools-next]$
>
> > Please use this command:
> >
> > perf test "perf trace general tests"
> >
> > Signed-off-by: Howard Chu <howardchu95@gmail.com>
> > Suggested-by: Ian Rogers <irogers@google.com>
> > ---
> >  tools/perf/tests/shell/trace_btf_general.sh | 67 +++++++++++++++++++++
> >  1 file changed, 67 insertions(+)
> >  create mode 100755 tools/perf/tests/shell/trace_btf_general.sh
> >
> > diff --git a/tools/perf/tests/shell/trace_btf_general.sh b/tools/perf/tests/shell/trace_btf_general.sh
> > new file mode 100755
> > index 000000000000..0b4ea8462390
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/trace_btf_general.sh
> > @@ -0,0 +1,67 @@
> > +#!/bin/sh
> > +# perf trace general tests
> > +# SPDX-License-Identifier: GPL-2.0
> > +
> > +err=0
> > +set -e
> > +
> > +. "$(dirname $0)"/lib/probe.sh
> > +skip_if_no_perf_trace || exit 2
> > +
> > +file1=$(mktemp /tmp/file1_XXXXX)
> > +file2=$(echo $file1 | sed 's/file1/file2/g')
> > +
> > +buffer="this is a buffer for testing"
> > +
> > +trap cleanup EXIT TERM INT HUP
> > +
> > +trace_test_string() {
> > +  echo "Testing perf trace's string augmentation"
> > +  if ! perf trace -e renameat* --max-events=1 -- mv ${file1} ${file2} 2>&1 | grep -q -E "renameat[2]*.*oldname: \"${file1}\".*newname: \"${file2}\".*"
> > +  then
> > +    echo "String augmentation test failed"
> > +    err=1
> > +  fi
> > +}
> > +
> > +trace_test_buffer() {
> > +  echo "Testing perf trace's buffer augmentation"
> > +  if ! perf trace -e write --max-events=1 -- echo "${buffer}" 2>&1 | grep -q -E ".*write.*buf: ${buffer}.*"
> > +  then
> > +    echo "Buffer augmentation test failed"
> > +    err=1
> > +  fi
> > +}
> > +
> > +trace_test_struct_btf() {
> > +  echo "Testing perf trace's struct augmentation"
> > +  if ! perf trace -e clock_nanosleep --force-btf --max-events=1 -- sleep 1 2>&1 | grep -q -E ".*clock_nanosleep\(rqtp: \{1,\}, rmtp: \{1,\}\).* = 0"
> > +  then
> > +    echo "BTF struct augmentation test failed"
> > +    err=1
> > +  fi
> > +}
> > +
> > +cleanup() {
> > +     rm -rf ${file1} ${file2}
> > +}
> > +
> > +trap_cleanup() {
> > +     echo "Unexpected signal in ${FUNCNAME[1]}"
> > +     cleanup
> > +     exit 1
> > +}
> > +
> > +trace_test_string
> > +
> > +if [ $err = 0 ]; then
> > +  trace_test_buffer
> > +fi
> > +
> > +if [ $err = 0 ]; then
> > +  trace_test_struct_btf
> > +fi
> > +
> > +cleanup
> > +
> > +exit $err
> > --
> > 2.45.2

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 6/8] perf trace: Collect augmented data using BPF
  2024-09-04 21:11     ` Howard Chu
@ 2024-09-11 14:23       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-09-11 14:23 UTC (permalink / raw)
  To: Howard Chu
  Cc: adrian.hunter, irogers, jolsa, kan.liang, namhyung,
	linux-perf-users, linux-kernel, Arnaldo Carvalho de Melo

On Wed, Sep 04, 2024 at 02:11:35PM -0700, Howard Chu wrote:
> On Wed, Sep 4, 2024 at 12:53 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > On Sun, Aug 25, 2024 at 12:33:20AM +0800, Howard Chu wrote:
> > ⬢[acme@toolbox perf-tools-next]$ pahole -C syscall_enter_args /tmp/build/perf-tools-next/util/bpf_skel/.tmp/augmented_raw_syscalls.bpf.o
> > struct syscall_enter_args {
> >         unsigned long long         common_tp_fields;     /*     0     8 */
> >         long                       syscall_nr;           /*     8     8 */
> >         unsigned long              args[6];              /*    16    48 */
> >
> >         /* size: 64, cachelines: 1, members: 3 */
> > };
> >
> > So the entry has the theoretical max limit for the augmented payload
> > which would be 6 * sizeof(struct augmented_arg) + the common header for
> > tracepoints (unaugmented), a lot of space, but...
> 
> Yes, if I don't use this much space, and try to do a flexible payload
> length, I won't be able to pass the BPF verifier. I will try to fix
> this wasting problem.

But then we don't use it, I need to look at the implementation of BPF
maps, to see how bad is this, but I doubt it is that much, well see.

- Arnaldo

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2024-09-11 14:23 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-24 16:33 [PATCH v3 0/8] perf trace: Enhanced augmentation for pointer arguments Howard Chu
2024-08-24 16:33 ` [PATCH v3 1/8] perf trace: Fix perf trace -p <PID> Howard Chu
2024-08-24 16:33 ` [PATCH v3 2/8] perf trace: Add trace__bpf_sys_enter_beauty_map() to prepare for fetching data in BPF Howard Chu
2024-09-09 19:45   ` Arnaldo Carvalho de Melo
2024-09-09 20:14     ` Arnaldo Carvalho de Melo
2024-09-10  4:59       ` Howard Chu
2024-08-24 16:33 ` [PATCH v3 3/8] perf trace: Pass the richer 'struct syscall_arg' pointer to trace__btf_scnprintf() Howard Chu
2024-08-24 16:33 ` [PATCH v3 4/8] perf trace: Pretty print struct data Howard Chu
2024-08-28 21:04   ` Arnaldo Carvalho de Melo
2024-08-30  0:16     ` Howard Chu
2024-09-09 14:40       ` Arnaldo Carvalho de Melo
2024-09-09 14:46         ` Arnaldo Carvalho de Melo
2024-08-24 16:33 ` [PATCH v3 5/8] perf trace: Pretty print buffer data Howard Chu
2024-09-09 16:33   ` Arnaldo Carvalho de Melo
2024-09-09 16:45     ` Arnaldo Carvalho de Melo
2024-09-09 16:50       ` Arnaldo Carvalho de Melo
2024-09-09 17:17       ` Howard Chu
2024-09-09 19:19         ` Arnaldo Carvalho de Melo
2024-09-09 17:14     ` Howard Chu
2024-08-24 16:33 ` [PATCH v3 6/8] perf trace: Collect augmented data using BPF Howard Chu
2024-09-04 19:52   ` Arnaldo Carvalho de Melo
2024-09-04 21:11     ` Howard Chu
2024-09-11 14:23       ` Arnaldo Carvalho de Melo
2024-08-24 16:33 ` [PATCH v3 7/8] perf trace: Add --force-btf for debugging Howard Chu
2024-08-24 16:33 ` [PATCH v3 8/8] perf trace: Add general tests for augmented syscalls Howard Chu
2024-09-09 22:11   ` Arnaldo Carvalho de Melo
2024-09-10  5:00     ` Howard Chu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).