Linux Trace Kernel
 help / color / mirror / Atom feed
* Re: [PATCH v2] riscv: Only consider swbp/ss handlers for correct privileged mode
From: Puranjay Mohan @ 2023-08-28 15:35 UTC (permalink / raw)
  To: bjorn
  Cc: aou, bjorn, guoren, linux-kernel, linux-riscv, linux-trace-kernel,
	lkp, namcaov, palmer, paul.walmsley, Puranjay Mohan
In-Reply-To: <20230827114003.224958-1-bjorn@kernel.org>

Tested-by: Puranjay Mohan <puranjay12@gmail.com>

^ permalink raw reply

* Re: (subset) [PATCH 00/17] -Wmissing-prototype warning fixes
From: Michael Schmitz @ 2023-08-28  8:07 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Martin K. Petersen, Andrew Morton, linux-kernel, Arnd Bergmann,
	Arnd Bergmann, Matt Turner, Vineet Gupta, Russell King,
	Catalin Marinas, Will Deacon, Guo Ren, Brian Cain, Huacai Chen,
	WANG Xuerui, Michal Simek, Thomas Bogendoerfer, Dinh Nguyen,
	Jonas Bonn, Stefan Kristiansson, Stafford Horne,
	James E.J. Bottomley, Helge Deller, Michael Ellerman,
	Christophe Leroy, Palmer Dabbelt, Heiko Carstens,
	John Paul Adrian Glaubitz, x86, Borislav Petkov, Max Filippov,
	Jens Axboe, Sudip Mukherjee, Richard Weinberger, Bjorn Helgaas,
	Masahiro Yamada, Nathan Chancellor, Nick Desaulniers,
	Guenter Roeck, Stephen Rothwell, linux-next, linux-alpha,
	linux-snps-arc, linux-arm-kernel, linux-csky, linux-hexagon,
	linux-ia64, loongarch, linux-m68k, linux-mips, linux-openrisc,
	linux-parisc, linuxppc-dev, linux-riscv, linux-s390, linux-sh,
	sparclinux, linux-block, linux-scsi, linux-mtd,
	linux-trace-kernel, linux-pci, linux-kbuild
In-Reply-To: <CAMuHMdUkZmkBSksvaGcDCKz2tsgkwyWgDa+WwCJm2UxFMCj1jw@mail.gmail.com>

Hi Geert,

Am 28.08.2023 um 18:42 schrieb Geert Uytterhoeven:
> On Sat, Aug 26, 2023 at 12:44 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
>> (Incidentally - did you ever publish the m68k full history tree anywhere
>> in git?)
>
> You mean the gitified version of the Linux/m68k CVS tree Ralf created
> for me because my machine wasn't powerful enough?

The very same ...

> No, and I should look into doing that...

No pressure!

Cheers,

	Michael

>
> Gr{oetje,eeting}s,
>
>                         Geert
>

^ permalink raw reply

* [PATCH v3] scripts/link-vmlinux.sh: Add alias to duplicate symbols for kallsyms
From: Alessandro Carminati (Red Hat) @ 2023-08-28  8:04 UTC (permalink / raw)
  Cc: Masami Hiramatsu, Steven Rostedt, Daniel Bristot de Oliveira,
	Josh Poimboeuf, Masahiro Yamada, Luis Chamberlain,
	Nathan Chancellor, Nick Desaulniers, Nicolas Schier,
	Alexander Lobakin, Nick Alcock, Kris Van Hees, Eugene Loh,
	Francis Laniel, Viktor Malik, Alessandro Carminati, linux-kbuild,
	linux-kernel, linux-trace-kernel, live-patching

From: Alessandro Carminati <alessandro.carminati@gmail.com>

It is not uncommon for drivers or modules related to similar peripherals
to have symbols with the exact same name.
While this is not a problem for the kernel's binary itself, it becomes an
issue when attempting to trace or probe specific functions using
infrastructure like ftrace or kprobe.

The tracing subsystem relies on the `nm -n vmlinux` output, which provides
symbol information from the kernel's ELF binary. However, when multiple
symbols share the same name, the standard nm output does not differentiate
between them. This can lead to confusion and difficulty when trying to
probe the intended symbol.

 ~ # cat /proc/kallsyms | grep " name_show"
 ffffffff8c4f76d0 t name_show
 ffffffff8c9cccb0 t name_show
 ffffffff8cb0ac20 t name_show
 ffffffff8cc728c0 t name_show
 ffffffff8ce0efd0 t name_show
 ffffffff8ce126c0 t name_show
 ffffffff8ce1dd20 t name_show
 ffffffff8ce24e70 t name_show
 ffffffff8d1104c0 t name_show
 ffffffff8d1fe480 t name_show

**kas_alias** addresses this challenge by extending the symbol names with
unique suffixes during the kernel build process.
The newly created aliases for these duplicated symbols are unique names
that can be fed to the ftracefs interface. By doing so, it enables
previously unreachable symbols to be probed.

 ~ # cat /proc/kallsyms | grep " name_show"
 ffffffff974f76d0 t name_show
 ffffffff974f76d0 t name_show__alias__6340
 ffffffff979cccb0 t name_show
 ffffffff979cccb0 t name_show__alias__6341
 ffffffff97b0ac20 t name_show
 ffffffff97b0ac20 t name_show__alias__6342
 ffffffff97c728c0 t name_show
 ffffffff97c728c0 t name_show__alias__6343
 ffffffff97e0efd0 t name_show
 ffffffff97e0efd0 t name_show__alias__6344
 ffffffff97e126c0 t name_show
 ffffffff97e126c0 t name_show__alias__6345
 ffffffff97e1dd20 t name_show
 ffffffff97e1dd20 t name_show__alias__6346
 ffffffff97e24e70 t name_show
 ffffffff97e24e70 t name_show__alias__6347
 ffffffff981104c0 t name_show
 ffffffff981104c0 t name_show__alias__6348
 ffffffff981fe480 t name_show
 ffffffff981fe480 t name_show__alias__6349
 ~ # echo "p:kprobes/evnt1 name_show__alias__6349" \
 > >/sys/kernel/tracing/kprobe_events
 ~ # cat /sys/kernel/tracing/kprobe_events
 p:kprobes/evnt1 name_show__alias__6349

Changes from v1:
- Integrated changes requested by Masami to exclude symbols with prefixes
  "_cfi" and "_pfx".
- Introduced a small framework to handle patterns that need to be excluded
  from the alias production.
- Excluded other symbols using the framework.
- Introduced the ability to discriminate between text and data symbols.
- Added two new config symbols in this version: CONFIG_KALLSYMS_ALIAS_DATA,
  which allows data for data, and CONFIG_KALLSYMS_ALIAS_DATA_ALL, which
  excludes all filters and provides an alias for each duplicated symbol.

https://lore.kernel.org/all/20230711151925.1092080-1-alessandro.carminati@gmail.com/

Changes from v2:
- Alias tags are created by querying DWARF information from the vmlinux.
- The filename + line number is normalized and appended to the original name.
- The tag begins with '@' to indicate the symbol source.
- Not a change, but worth mentioning, since the alias is added to the existing
  list, the old duplicated name is preserved, and the livepatch way of dealing
  with duplicates is maintained.
- Acknowledging the existence of scenarios where inlined functions declared in
  header files may result in multiple copies due to compiler behavior, though
   it is not actionable as it does not pose an operational issue.
- Highlighting a single exception where the same name refers to different
  functions: the case of "compat_binfmt_elf.c," which directly includes
  "binfmt_elf.c" producing identical function copies in two separate
  modules.

sample from new v3

 ~ # cat /proc/kallsyms | grep gic_mask_irq
 ffffd0b03c04dae4 t gic_mask_irq
 ffffd0b03c04dae4 t gic_mask_irq@_drivers_irqchip_irq-gic_c_167
 ffffd0b03c050960 t gic_mask_irq
 ffffd0b03c050960 t gic_mask_irq@_drivers_irqchip_irq-gic-v3_c_404
 ~ #

https://lore.kernel.org/all/20230714150326.1152359-1-alessandro.carminati@gmail.com/

Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati@gmail.com>
---
 init/Kconfig                        |  36 ++++
 scripts/Makefile                    |   4 +
 scripts/kas_alias/Makefile          |   4 +
 scripts/kas_alias/a2l.c             | 268 ++++++++++++++++++++++++++++
 scripts/kas_alias/a2l.h             |  32 ++++
 scripts/kas_alias/duplicates_list.c |  70 ++++++++
 scripts/kas_alias/duplicates_list.h |  15 ++
 scripts/kas_alias/item_list.c       | 230 ++++++++++++++++++++++++
 scripts/kas_alias/item_list.h       |  26 +++
 scripts/kas_alias/kas_alias.c       | 217 ++++++++++++++++++++++
 scripts/link-vmlinux.sh             |  11 +-
 11 files changed, 910 insertions(+), 3 deletions(-)
 create mode 100644 scripts/kas_alias/Makefile
 create mode 100644 scripts/kas_alias/a2l.c
 create mode 100644 scripts/kas_alias/a2l.h
 create mode 100644 scripts/kas_alias/duplicates_list.c
 create mode 100644 scripts/kas_alias/duplicates_list.h
 create mode 100644 scripts/kas_alias/item_list.c
 create mode 100644 scripts/kas_alias/item_list.h
 create mode 100644 scripts/kas_alias/kas_alias.c

diff --git a/init/Kconfig b/init/Kconfig
index f7f65af4ee12..bc69fcd9cbc8 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1737,6 +1737,42 @@ config KALLSYMS_BASE_RELATIVE
 	  time constants, and no relocation pass is required at runtime to fix
 	  up the entries based on the runtime load address of the kernel.
 
+config KALLSYMS_ALIAS
+	bool "Produces alias for duplicated symbols" if EXPERT
+	depends on KALLSYMS && (DEBUG_INFO_DWARF4 || DEBUG_INFO_DWARF5)
+	help
+	  It is not uncommon for drivers or modules related to similar
+	  peripherals to have symbols with the exact same name.
+	  While this is not a problem for the kernel's binary itself, it
+	  becomes an issue when attempting to trace or probe specific
+	  functions using infrastructure like ftrace or kprobe.
+
+	  This option addresses this challenge by extending the symbol names
+	  with unique suffixes during the kernel build process.
+	  The newly created aliases for these duplicated symbols are unique
+	  names that can be fed to the ftrace sysfs interface. By doing so, it
+	  enables previously unreachable symbols to be probed.
+
+config CONFIG_KALLSYMS_ALIAS_DATA
+	bool "Produces alias also for data"
+	depends on KALLSYMS_ALIAS
+	help
+	  Sometimes it can be useful to refer to data. In live patch scenarios,
+	  you may find yourself needing to use symbols that are shared with
+	  other functions. Since symbols face the same issue as functions, this
+	  option allows you to create aliases for data as well.
+
+config CONFIG_KALLSYMS_ALIAS_DATA_ALL
+	bool "Removes all filter when producing data alias"
+	depends on CONFIG_KALLSYMS_ALIAS_DATA
+	help
+	  When selecting data aliases, not all symbols are included in the set
+	  This is because many symbols are unlikely to be used. If you choose
+	  to have an alias for all data symbols, be aware that it will
+	  significantly increase the size.
+
+	  If unsure, say N.
+
 # end of the "standard kernel features (expert users)" menu
 
 # syscall, maps, verifier
diff --git a/scripts/Makefile b/scripts/Makefile
index 32b6ba722728..65fafe17cfe5 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -49,3 +49,7 @@ subdir-$(CONFIG_SECURITY_SELINUX) += selinux
 
 # Let clean descend into subdirs
 subdir-	+= basic dtc gdb kconfig mod
+
+# KALLSyms alias
+subdir-$(CONFIG_KALLSYMS_ALIAS) += kas_alias
+
diff --git a/scripts/kas_alias/Makefile b/scripts/kas_alias/Makefile
new file mode 100644
index 000000000000..e1fde69232b4
--- /dev/null
+++ b/scripts/kas_alias/Makefile
@@ -0,0 +1,4 @@
+# SPDX-License-Identifier: GPL-2.0
+hostprogs-always-$(CONFIG_KALLSYMS_ALIAS)    += kas_alias
+
+kas_alias-objs        := duplicates_list.o item_list.o kas_alias.o a2l.o
diff --git a/scripts/kas_alias/a2l.c b/scripts/kas_alias/a2l.c
new file mode 100644
index 000000000000..a9692ac30180
--- /dev/null
+++ b/scripts/kas_alias/a2l.c
@@ -0,0 +1,268 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <string.h>
+#include <stdint.h>
+#include <stdbool.h>
+
+#include "a2l.h"
+
+int addr2line_pid = -1;
+int a2l_in[2];
+int a2l_out[2];
+char line[MAX_BUF];
+char vmlinux_path[MAX_BUF];
+char addr2line_cmd[MAX_CMD_LEN];
+FILE *a2l_stdin, *a2l_stdout;
+
+static char *normalize_path(const char *input_path, char *output_path)
+{
+	char *prev_token = NULL;
+	char *delimiter = "/";
+	char inbuf[MAX_BUF];
+	char *token;
+	char *pos;
+
+	memset(inbuf, 0, MAX_BUF);
+	*output_path = '\0';
+	strncpy(inbuf, input_path, MAX_BUF);
+	if (!input_path || !output_path || strlen(input_path) == 0)
+		return NULL;
+
+	token = strtok(inbuf, delimiter);
+	while (token) {
+		if (strcmp(token, "..") == 0 && prev_token) {
+			pos = strrchr(output_path, '/');
+			if (pos)
+				*pos = '\0';
+
+		} else if (strcmp(token, ".") != 0) {
+			strcat(output_path, "/");
+			strcat(output_path, token);
+		}
+
+		prev_token = token;
+		token = strtok(NULL, delimiter);
+	}
+
+	return output_path;
+}
+
+static void path_of(const char *full_path, char *path)
+{
+	const char *last_slash = strrchr(full_path, '/');
+	size_t path_length;
+	char cwd[MAX_BUF];
+
+	if (!last_slash) {
+		if (getcwd(cwd, sizeof(cwd)))
+			strcpy(path, cwd);
+		else
+			strcpy(path, ".");
+	} else {
+		path_length = last_slash - full_path;
+		strncpy(path, full_path, path_length);
+		path[path_length] = '\0';
+	}
+}
+
+static bool file_exists(const char *file_path)
+{
+	FILE *file;
+
+	file = fopen(file_path, "r");
+	if (file) {
+		fclose(file);
+		return true;
+	}
+	return false;
+}
+
+int addr2line_init(const char *cmd, const char *vmlinux)
+{
+	if ((!file_exists(cmd)) || (!file_exists(vmlinux))) {
+		printf("file not found\n");
+		return 0;
+		}
+
+	path_of(vmlinux, vmlinux_path);
+	if (pipe(a2l_in) == -1) {
+		printf("Failed to create pipe\n");
+		return 0;
+	}
+
+	if (pipe(a2l_out) == -1) {
+		printf("Failed to create pipe\n");
+		return 0;
+	}
+
+	addr2line_pid = fork();
+	if (addr2line_pid == -1) {
+		printf("Failed to fork process\n");
+		close(a2l_in[P_READ]);
+		close(a2l_in[P_WRITE]);
+		close(a2l_out[P_READ]);
+		close(a2l_out[P_WRITE]);
+		return 0;
+	}
+
+	if (addr2line_pid == 0) {
+		dup2(a2l_in[P_READ], 0);
+		dup2(a2l_out[P_WRITE], 1);
+		close(a2l_in[P_WRITE]);
+		close(a2l_out[P_READ]);
+
+		execlp(cmd, cmd, ADDR2LINE_ARGS, vmlinux, NULL);
+
+		printf("Failed to execute addr2line command\n");
+		exit(1);
+	} else {
+		close(a2l_in[P_READ]);
+		close(a2l_out[P_WRITE]);
+	}
+
+	a2l_stdin = fdopen(a2l_in[P_WRITE], "w");
+	if (!a2l_stdin) {
+		printf("Failed to open pipe a2l_in\n");
+		return 0;
+	}
+
+	a2l_stdout = fdopen(a2l_out[P_READ], "r");
+	if (!a2l_stdout) {
+		printf("Failed to open pipe a2l_out\n");
+		fclose(a2l_stdin);
+		return 0;
+	}
+
+	return 1;
+}
+
+const char *remove_subdir(const char *home, const char *f_path)
+{
+	int i = 0;
+
+	while (*(home + i) == *(f_path + i))
+		i++;
+
+	return (strlen(home) != i) ? NULL : f_path + i;
+}
+
+char *addr2line_get_lines(uint64_t address)
+{
+	char buf[MAX_BUF];
+
+	fprintf(a2l_stdin, "%08lx\n", address);
+	fflush(a2l_stdin);
+
+	if (!fgets(line, sizeof(line), a2l_stdout)) {
+		printf("Failed to read lines from addr2line\n");
+		return NULL;
+	}
+
+	if (!fgets(line, sizeof(line), a2l_stdout)) {
+		printf("Failed to read lines from addr2line\n");
+		return NULL;
+	}
+
+	line[strcspn(line, "\n")] = '\0';
+	strncpy(buf, line, MAX_BUF);
+	return normalize_path(buf, line);
+}
+
+int addr2line_cleanup(void)
+{
+	int status;
+
+	if (addr2line_pid != -1) {
+		kill(addr2line_pid, SIGKILL);
+		waitpid(addr2line_pid, &status, 0);
+		fclose(a2l_stdin);
+		fclose(a2l_stdout);
+		addr2line_pid = -1;
+	}
+
+	return 1;
+}
+
+static char *find_executable(const char *command)
+{
+	char *path_env = getenv("PATH");
+	char *executable_path;
+	char *path_copy;
+	char *path;
+	int n;
+
+	if (!path_env)
+		return NULL;
+
+	path_copy = strdup(path_env);
+	if (!path_copy)
+		return NULL;
+
+	path = strtok(path_copy, ":");
+	while (path) {
+		n = snprintf(0, 0, "%s/%s", path, command);
+		executable_path = (char *)malloc(n + 1);
+		snprintf(executable_path, n + 1, "%s/%s", path, command);
+		if (access(executable_path, X_OK) == 0) {
+			free(path_copy);
+			return executable_path;
+		}
+
+	path = strtok(NULL, ":");
+	free(executable_path);
+	executable_path = NULL;
+	}
+
+	free(path_copy);
+	if (executable_path)
+		free(executable_path);
+	return NULL;
+}
+
+const char *get_addr2line(int mode)
+{
+	char *buf = "";
+
+	switch (mode) {
+	case A2L_CROSS:
+		buf = getenv("CROSS_COMPILE");
+		memcpy(addr2line_cmd, buf, strlen(buf));
+	case A2L_DEFAULT:
+		memcpy(addr2line_cmd + strlen(buf), ADDR2LINE, strlen(ADDR2LINE));
+		buf = find_executable(addr2line_cmd);
+		if (buf) {
+			memcpy(addr2line_cmd, buf, strlen(buf));
+			free(buf);
+		}
+		return addr2line_cmd;
+	case A2L_LLVM:
+	default:
+		return NULL;
+	}
+}
+
+char *get_vmlinux(char *input)
+{
+	const char *match_string1 = ".syms";
+	const char *match_string2 = ".tmp_vmlinux.kallsyms";
+	char *result = NULL;
+	char *match_pos;
+
+	match_pos = strstr(input, match_string1);
+	if (!match_pos)
+		return NULL;
+
+	match_pos = strstr(input, match_string2);
+	if (!match_pos)
+		return NULL;
+
+	result = strdup(input);
+	match_pos = strstr(result, match_string1);
+	*match_pos = '\0';
+	return result;
+}
diff --git a/scripts/kas_alias/a2l.h b/scripts/kas_alias/a2l.h
new file mode 100644
index 000000000000..ca6419229dde
--- /dev/null
+++ b/scripts/kas_alias/a2l.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef A2L_H
+#define A2L_H
+#include <stdint.h>
+
+#define ADDR2LINE "addr2line"
+#define ADDR2LINE_ARGS "-fe"
+//#define VMLINUX "vmlinux"
+#define MAX_BUF 4096
+#define MAX_CMD_LEN 256
+#define P_READ 0
+#define P_WRITE 1
+#define A2L_DEFAULT 1
+#define A2L_CROSS 2
+#define A2L_LLVM 3
+#define A2L_MAKE_VALUE 2
+
+extern int addr2line_pid;
+extern int a2l_in[2];
+extern int a2l_out[2];
+extern char line[MAX_BUF];
+extern char vmlinux_path[MAX_BUF];
+extern char addr2line_cmd[MAX_CMD_LEN];
+
+int addr2line_init(const char *cmd, const char *vmlinux);
+char *addr2line_get_lines(uint64_t address);
+int addr2line_cleanup(void);
+const char *remove_subdir(const char *home, const char *f_path);
+const char *get_addr2line(int mode);
+char *get_vmlinux(char *input);
+
+#endif
diff --git a/scripts/kas_alias/duplicates_list.c b/scripts/kas_alias/duplicates_list.c
new file mode 100644
index 000000000000..e7a3d2917937
--- /dev/null
+++ b/scripts/kas_alias/duplicates_list.c
@@ -0,0 +1,70 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <stdbool.h>
+
+#include "item_list.h"
+#include "duplicates_list.h"
+
+struct duplicate_item *find_duplicates(struct item *list)
+{
+	struct duplicate_item *current_duplicate = NULL;
+	struct duplicate_item *duplicates = NULL;
+	struct duplicate_item *new_duplicate;
+	struct item *current_item = list;
+	bool prev_was_duplicate = false;
+	struct item *prev_item = NULL;
+
+	while (current_item) {
+		if ((prev_item && (strcmp(current_item->symb_name, prev_item->symb_name) == 0)) ||
+		    prev_was_duplicate) {
+			if (!duplicates) {
+				duplicates = malloc(sizeof(struct duplicate_item));
+				if (!duplicates)
+					return NULL;
+
+				duplicates->original_item = prev_item;
+				duplicates->next = NULL;
+				current_duplicate = duplicates;
+			} else {
+				new_duplicate = malloc(sizeof(struct duplicate_item));
+				if (!new_duplicate) {
+					free_duplicates(&duplicates);
+					return NULL;
+				}
+
+				new_duplicate->original_item = prev_item;
+				new_duplicate->next = NULL;
+				current_duplicate->next = new_duplicate;
+				current_duplicate = new_duplicate;
+
+				if ((strcmp(current_item->symb_name, prev_item->symb_name) != 0) &&
+				    (prev_was_duplicate))
+					prev_was_duplicate = false;
+				else
+					prev_was_duplicate = true;
+			}
+		}
+
+		prev_item = current_item;
+		current_item = current_item->next;
+	}
+
+	return duplicates;
+}
+
+void free_duplicates(struct duplicate_item **duplicates)
+{
+	struct duplicate_item *duplicates_iterator = *duplicates;
+	struct duplicate_item *app;
+
+	while (duplicates_iterator) {
+		app = duplicates_iterator;
+		duplicates_iterator = duplicates_iterator->next;
+		free(app);
+	}
+
+	*duplicates = NULL;
+}
diff --git a/scripts/kas_alias/duplicates_list.h b/scripts/kas_alias/duplicates_list.h
new file mode 100644
index 000000000000..76aa73e584bc
--- /dev/null
+++ b/scripts/kas_alias/duplicates_list.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef DUPLICATES_LIST_H
+#define DUPLICATES_LIST_H
+
+#include "item_list.h"
+
+struct duplicate_item {
+	struct item *original_item;
+	struct duplicate_item *next;
+};
+
+struct duplicate_item *find_duplicates(struct item *list);
+void free_duplicates(struct duplicate_item **duplicates);
+
+#endif
diff --git a/scripts/kas_alias/item_list.c b/scripts/kas_alias/item_list.c
new file mode 100644
index 000000000000..48f2e525592a
--- /dev/null
+++ b/scripts/kas_alias/item_list.c
@@ -0,0 +1,230 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <string.h>
+#include <stdbool.h>
+#include <assert.h>
+#include "item_list.h"
+
+#define CHECK_ORDER_BY_ADDRESS(sort_by, current, temp, op) \
+	((sort_by) == BY_ADDRESS && (current)->addr op (temp)->addr)
+#define CHECK_ORDER_BY_NAME(sort_by, current, temp, op) \
+	((sort_by) == BY_NAME && strcmp((current)->symb_name, (temp)->symb_name) op 0)
+
+struct item *list_index[96] = {0};
+
+void build_index(struct item *list)
+{
+	char current_first_letter = ' ';
+	struct item *current = list;
+
+	while (current) {
+		if (current->symb_name[0] != current_first_letter) {
+			current_first_letter = current->symb_name[0];
+			list_index[current_first_letter - 32] = current;
+		}
+		current = current->next;
+	}
+}
+
+struct item *add_item(struct item **list, const char *name, char stype, uint64_t addr)
+{
+	struct item *new_item;
+	struct item *current;
+
+	new_item = malloc(sizeof(struct item));
+	if (!new_item)
+		return NULL;
+
+	strncpy(new_item->symb_name, name, MAX_NAME_SIZE);
+	new_item->symb_name[MAX_NAME_SIZE - 1] = '\0';
+	new_item->addr = addr;
+	new_item->stype = stype;
+	new_item->next = NULL;
+
+	if (!(*list)) {
+		*list = new_item;
+	} else {
+		current = *list;
+		while (current->next)
+			current = current->next;
+
+		current->next = new_item;
+	}
+	return new_item;
+}
+
+void sort_list(struct item **list, int sort_by)
+{
+	struct item *current = *list;
+	struct item *sorted = NULL;
+	struct item *next_item;
+	struct item *temp;
+
+	if (!(*list) || !((*list)->next))
+		return;
+
+	while (current) {
+		next_item = current->next;
+		if (!sorted ||
+		    (CHECK_ORDER_BY_ADDRESS(sort_by, current, sorted, <) ||
+		    CHECK_ORDER_BY_NAME(sort_by, current, sorted, >=))) {
+			current->next = sorted;
+			sorted = current;
+		} else {
+			temp = sorted;
+			while (temp->next &&
+			       (CHECK_ORDER_BY_ADDRESS(sort_by, current, temp->next, >=) ||
+			       CHECK_ORDER_BY_NAME(sort_by, current, temp->next, >=)))
+				temp = temp->next;
+
+			current->next = temp->next;
+			temp->next = current;
+		}
+		current = next_item;
+	}
+
+	*list = sorted;
+}
+
+struct item *merge(struct item *left, struct item *right, int sort_by)
+{
+	struct item *current = NULL;
+	struct item *result = NULL;
+
+	if (!left)
+		return right;
+	if (!right)
+		return left;
+
+	if (sort_by == BY_NAME) {
+		if (strcmp(left->symb_name, right->symb_name) <= 0) {
+			result = left;
+			left = left->next;
+		} else {
+			result = right;
+			right = right->next;
+		}
+	} else {
+		if (sort_by == BY_ADDRESS) {
+			if (left->addr <= right->addr) {
+				result = left;
+				left = left->next;
+			} else {
+				result = right;
+				right = right->next;
+			}
+		}
+	}
+
+	current = result;
+
+	while (left && right) {
+		if (sort_by == BY_NAME) {
+			if (strcmp(left->symb_name, right->symb_name) <= 0) {
+				current->next = left;
+				left = left->next;
+			} else {
+				current->next = right;
+				right = right->next;
+			}
+		} else {
+			if (sort_by == BY_ADDRESS) {
+				if (left->addr <= right->addr) {
+					current->next = left;
+					left = left->next;
+				} else {
+					current->next = right;
+					right = right->next;
+				}
+			}
+		}
+
+		current = current->next;
+	}
+
+	if (left) {
+		current->next = left;
+	} else {
+		if (right)
+			current->next = right;
+	}
+
+	return result;
+}
+
+struct item *merge_sort(struct item *head, int sort_by)
+{
+	struct item *right;
+	struct item *slow;
+	struct item *fast;
+	struct item *left;
+
+	if (!head || !head->next)
+		return head;
+
+	slow = head;
+	fast = head->next;
+
+	while (fast && fast->next) {
+		slow = slow->next;
+		fast = fast->next->next;
+	}
+
+	left = head;
+	right = slow->next;
+	slow->next = NULL;
+
+	left = merge_sort(left, sort_by);
+	right = merge_sort(right, sort_by);
+
+	return merge(left, right, sort_by);
+}
+
+void sort_list_m(struct item **head, int sort_by)
+{
+	if (!(*head) || !((*head)->next))
+		return;
+
+	*head = merge_sort(*head, sort_by);
+}
+
+int insert_after(struct item *list, const uint64_t search_addr,
+		 const char *name, uint64_t addr, char stype)
+{
+	struct item *new_item;
+	struct item *current;
+	int ret = 0;
+
+	current = (list_index[name[0] - 32]) ? list_index[name[0] - 32] : list;
+	while (current) {
+		if (current->addr == search_addr) {
+			new_item = malloc(sizeof(struct item));
+			if (!new_item)
+				return ret;
+			strncpy(new_item->symb_name, name, MAX_NAME_SIZE);
+			new_item->symb_name[MAX_NAME_SIZE - 1] = '\0';
+			new_item->addr = addr;
+			new_item->stype = stype;
+			new_item->next = current->next;
+			current->next = new_item;
+			ret = 1;
+			break;
+		}
+		current = current->next;
+	}
+	return ret;
+}
+
+void free_items(struct item **head)
+{
+	struct item *app, *item_iterator = *head;
+
+	while (item_iterator) {
+		app = item_iterator;
+		item_iterator = item_iterator->next;
+		free(app);
+	}
+	*head = NULL;
+}
diff --git a/scripts/kas_alias/item_list.h b/scripts/kas_alias/item_list.h
new file mode 100644
index 000000000000..b4891cb088ee
--- /dev/null
+++ b/scripts/kas_alias/item_list.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef ITEM_LIST_H
+#define ITEM_LIST_H
+#include <stdint.h>
+
+#define MAX_NAME_SIZE 256
+#define BY_ADDRESS 1
+#define BY_NAME 2
+
+struct item {
+	char		symb_name[MAX_NAME_SIZE];
+	uint64_t	addr;
+	char		stype;
+	struct item	*next;
+};
+
+void build_index(struct item *list);
+struct item *add_item(struct item **list, const char *name, char stype, uint64_t addr);
+void sort_list(struct item **list, int sort_by);
+struct item *merge(struct item *left, struct item *right, int sort_by);
+struct item *merge_sort(struct item *head, int sort_by);
+void sort_list_m(struct item **head, int sort_by);
+int insert_after(struct item *list, const uint64_t search_addr,
+		 const char *name, uint64_t addr, char stype);
+void free_items(struct item **head);
+#endif
diff --git a/scripts/kas_alias/kas_alias.c b/scripts/kas_alias/kas_alias.c
new file mode 100644
index 000000000000..532aeb39f851
--- /dev/null
+++ b/scripts/kas_alias/kas_alias.c
@@ -0,0 +1,217 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdbool.h>
+#include <stdarg.h>
+#include <regex.h>
+
+#include "item_list.h"
+#include "duplicates_list.h"
+#include "a2l.h"
+
+#define SYMB_IS_TEXT(s) ((((s)->stype) == 't') ||  (((s)->stype) == 'T'))
+#define SYMB_IS_DATA(s) ((((s)->stype) == 'b') ||  (((s)->stype) == 'B') || \
+			 (((s)->stype) == 'd') ||  (((s)->stype) == 'D') || \
+			 (((s)->stype) == 'r') ||  (((s)->stype) == 'R'))
+#ifdef CONFIG_KALLSYMS_ALIAS_DATA
+#define SYMB_NEEDS_ALIAS(s) (SYMB_IS_TEXT(s) || SYMB_IS_DATA(s))
+#else
+#define SYMB_NEEDS_ALIAS(s) SYMB_IS_TEXT(s)
+#endif
+#define FNOMATCH 0
+#define FMATCH 1
+#define EREGEX 2
+
+const char *ignore_regex[] = {
+	"^__cfi_.*$",				// __cfi_ preamble
+#ifndef CONFIG_KALLSYMS_ALIAS_DATA_ALL
+	"^_*TRACE_SYSTEM.*$",
+	"^__already_done\\.[0-9]+$",		// Call a function once data
+	"^___tp_str\\.[0-9]+$",
+	"^___done\\.[0-9]+$",
+	"^__print_once\\.[0-9]+$",
+	"^_rs\\.[0-9]+$",
+	"^__compound_literal\\.[0-9]+$",
+	"^___once_key\\.[0-9]+$",
+	"^__func__\\.[0-9]+$",
+	"^__msg\\.[0-9]+$",
+	"^CSWTCH\\.[0-9]+$",
+	"^__flags\\.[0-9]+$",
+	"^__wkey.*$",
+	"^__mkey.*$",
+	"^__key.*$",
+#endif
+	"^__pfx_.*$"				// NOP-padding
+};
+
+int suffix_serial;
+
+static inline void verbose_msg(bool verbose, const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	if (verbose)
+		printf(fmt, args);
+
+	va_end(args);
+}
+
+static void create_suffix(const char *name, char *output_suffix)
+{
+	sprintf(output_suffix, "%s__alias__%d", name, suffix_serial++);
+}
+
+static void create_file_suffix(const char *name, uint64_t address, char *output_suffix, char *cwd)
+{
+	const char *f_path;
+	char *buf;
+	int i = 0;
+
+	buf = addr2line_get_lines(address);
+	f_path = remove_subdir(cwd, buf);
+	if (f_path) {
+		sprintf(output_suffix, "%s@%s", name, f_path);
+		while (*(output_suffix + i) != '\0') {
+			switch (*(output_suffix + i)) {
+			case '/':
+			case ':':
+			case '.':
+				*(output_suffix + i) = '_';
+				break;
+			default:
+			}
+		i++;
+		}
+	} else {
+		create_suffix(name, output_suffix);
+	}
+}
+
+static int filter_symbols(char *symbol, const char **ignore_list, int regex_no)
+{
+	regex_t regex;
+	int res, i;
+
+	for (i = 0; i < regex_no; i++) {
+		res = regcomp(&regex, ignore_list[i], REG_EXTENDED);
+		if (res)
+			return -EREGEX;
+
+		res = regexec(&regex, symbol, 0, NULL, 0);
+		regfree(&regex);
+		switch (res) {
+		case 0:
+			return FMATCH;
+		case REG_NOMATCH:
+			break;
+		default:
+			return -EREGEX;
+		}
+	}
+
+	return FNOMATCH;
+}
+
+int main(int argc, char *argv[])
+{
+	char t, sym_name[MAX_NAME_SIZE], new_name[MAX_NAME_SIZE + 15];
+	struct duplicate_item  *duplicate_iterator;
+	struct duplicate_item *duplicate;
+	struct item *head = {NULL};
+	bool need_2_process = true;
+	struct item *last = {NULL};
+	struct item  *current;
+	int verbose_mode = 0;
+	uint64_t address;
+	FILE *fp;
+	int res;
+
+	if (argc < 2 || argc > 3) {
+		printf("Usage: %s <nmfile> [-verbose]\n", argv[0]);
+		return 1;
+	}
+
+	if (argc == 3 && strcmp(argv[2], "-verbose") == 0)
+		verbose_mode = 1;
+
+	verbose_msg(verbose_mode, "Scanning nm data(%s)\n", argv[1]);
+
+	fp = fopen(argv[1], "r");
+	if (!fp) {
+		printf("Can't open input file.\n");
+		return 1;
+	}
+
+	if (!addr2line_init(get_addr2line(A2L_DEFAULT), get_vmlinux(argv[1])))
+		return 1;
+
+	while (fscanf(fp, "%lx %c %99s\n", &address, &t, sym_name) == 3) {
+		if (strstr(sym_name, "@_")) {
+			if (verbose_mode && need_2_process)
+				printf("Already processed\n");
+			need_2_process = false;
+			}
+		last = add_item(&last, sym_name, t, address);
+		if (!last) {
+			printf("Error in allocate memory\n");
+			free_items(&head);
+			return 1;
+		}
+
+		if (!head)
+			head = last;
+	}
+
+	fclose(fp);
+
+	if (need_2_process) {
+		verbose_msg(verbose_mode, "Sorting nm data\n");
+		sort_list_m(&head, BY_NAME);
+		verbose_msg(verbose_mode, "Scanning nm data for duplicates\n");
+		duplicate = find_duplicates(head);
+		if (!duplicate) {
+			printf("Error in duplicates list\n");
+			return 1;
+		}
+
+		verbose_msg(verbose_mode, "Applying suffixes\n");
+		build_index(head);
+		duplicate_iterator = duplicate;
+		while (duplicate_iterator) {
+			res = filter_symbols(duplicate_iterator->original_item->symb_name,
+					     ignore_regex, sizeof(ignore_regex) /
+					     sizeof(ignore_regex[0]));
+			if (res != FMATCH &&
+			    SYMB_NEEDS_ALIAS(duplicate_iterator->original_item)) {
+				if (res < 0)
+					return 1;
+
+				create_file_suffix(duplicate_iterator->original_item->symb_name,
+						   duplicate_iterator->original_item->addr,
+						   new_name, vmlinux_path);
+				if (!insert_after(head, duplicate_iterator->original_item->addr,
+						  new_name, duplicate_iterator->original_item->addr,
+						  duplicate_iterator->original_item->stype))
+					return 1;
+			}
+
+			duplicate_iterator = duplicate_iterator->next;
+		}
+
+		sort_list_m(&head, BY_ADDRESS);
+	}
+	current = head;
+	while (current) {
+		printf("%08lx %c %s\n", current->addr, current->stype, current->symb_name);
+		current = current->next;
+	}
+
+	free_items(&head);
+	free_duplicates(&duplicate);
+	addr2line_cleanup();
+	return 0;
+}
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index a432b171be82..cacf60b597ce 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -89,8 +89,9 @@ vmlinux_link()
 
 	ldflags="${ldflags} ${wl}--script=${objtree}/${KBUILD_LDS}"
 
-	# The kallsyms linking does not need debug symbols included.
-	if [ "$output" != "${output#.tmp_vmlinux.kallsyms}" ] ; then
+	# The kallsyms linking does not need debug symbols included, unless the KALLSYMS_ALIAS.
+	if [ ! is_enabled CONFIG_KALLSYMS_ALIAS ] && \
+	    [ "$output" != "${output#.tmp_vmlinux.kallsyms}" ] ; then
 		ldflags="${ldflags} ${wl}--strip-debug"
 	fi
 
@@ -161,7 +162,11 @@ kallsyms()
 	fi
 
 	info KSYMS ${2}
-	scripts/kallsyms ${kallsymopt} ${1} > ${2}
+	if is_enabled CONFIG_KALLSYMS_ALIAS; then
+		ALIAS=".alias"
+		scripts/kas_alias/kas_alias ${1} >${1}${ALIAS}
+		fi
+	scripts/kallsyms ${kallsymopt} ${1}${ALIAS} > ${2}
 }
 
 # Perform one step in kallsyms generation, including temporary linking of
-- 
2.34.1


^ permalink raw reply related

* Re: (subset) [PATCH 00/17] -Wmissing-prototype warning fixes
From: Geert Uytterhoeven @ 2023-08-28  6:42 UTC (permalink / raw)
  To: Michael Schmitz
  Cc: Martin K. Petersen, Andrew Morton, linux-kernel, Arnd Bergmann,
	Arnd Bergmann, Matt Turner, Vineet Gupta, Russell King,
	Catalin Marinas, Will Deacon, Guo Ren, Brian Cain, Huacai Chen,
	WANG Xuerui, Michal Simek, Thomas Bogendoerfer, Dinh Nguyen,
	Jonas Bonn, Stefan Kristiansson, Stafford Horne,
	James E.J. Bottomley, Helge Deller, Michael Ellerman,
	Christophe Leroy, Palmer Dabbelt, Heiko Carstens,
	John Paul Adrian Glaubitz, x86, Borislav Petkov, Max Filippov,
	Jens Axboe, Sudip Mukherjee, Richard Weinberger, Bjorn Helgaas,
	Masahiro Yamada, Nathan Chancellor, Nick Desaulniers,
	Guenter Roeck, Stephen Rothwell, linux-next, linux-alpha,
	linux-snps-arc, linux-arm-kernel, linux-csky, linux-hexagon,
	linux-ia64, loongarch, linux-m68k, linux-mips, linux-openrisc,
	linux-parisc, linuxppc-dev, linux-riscv, linux-s390, linux-sh,
	sparclinux, linux-block, linux-scsi, linux-mtd,
	linux-trace-kernel, linux-pci, linux-kbuild
In-Reply-To: <130b3b57-edb0-184d-5b5f-69b013715773@gmail.com>

On Sat, Aug 26, 2023 at 12:44 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
> (Incidentally - did you ever publish the m68k full history tree anywhere
> in git?)

You mean the gitified version of the Linux/m68k CVS tree Ralf created
for me because my machine wasn't powerful enough?
No, and I should look into doing that...

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH v2] riscv: Only consider swbp/ss handlers for correct privileged mode
From: Nam Cao @ 2023-08-27 20:35 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv, Guo Ren,
	Björn Töpel, linux-kernel, linux-trace-kernel,
	kernel test robot
In-Reply-To: <20230827114003.224958-1-bjorn@kernel.org>

On Sun, Aug 27, 2023 at 01:40:03PM +0200, Björn Töpel wrote:
> From: Björn Töpel <bjorn@rivosinc.com>
> 
> RISC-V software breakpoint trap handlers are used for {k,u}probes.
> 
> When trapping from kernelmode, only the kernelmode handlers should be
> considered. Vice versa, only usermode handlers for usermode
> traps. This is not the case on RISC-V, which can trigger a bug if a
> userspace process uses uprobes, and a WARN() is triggered from
> kernelmode (which is implemented via {c.,}ebreak).
> 
> The kernel will trap on the kernelmode {c.,}ebreak, look for uprobes
> handlers, realize incorrectly that uprobes need to be handled, and
> exit the trap handler early. The trap returns to re-executing the
> {c.,}ebreak, and enter an infinite trap-loop.
> 
> The issue was found running the BPF selftest [1].
> 
> Fix this issue by only considering the swbp/ss handlers for
> kernel/usermode respectively. Also, move CONFIG ifdeffery from traps.c
> to the asm/{k,u}probes.h headers.
> 
> Note that linux/uprobes.h only include asm/uprobes.h if CONFIG_UPROBES
> is defined, which is why asm/uprobes.h needs to be unconditionally
> included in traps.c
> 
> Link: https://lore.kernel.org/linux-riscv/87v8d19aun.fsf@all.your.base.are.belong.to.us/ # [1]
> Reported-by: kernel test robot <lkp@intel.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202308271841.HlnnHFL7-lkp@intel.com/
> Reviewed-by: Guo Ren <guoren@kernel.org>
> Fixes: 74784081aac8 ("riscv: Add uprobes supported")
> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>

Reviewed-by: Nam Cao <namcaov@gmail.com>

Best regards,
Nam

^ permalink raw reply

* Re: [PATCH v2] riscv: Only consider swbp/ss handlers for correct privileged mode
From: Björn Töpel @ 2023-08-27 17:57 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv, Guo Ren,
	Björn Töpel, linux-kernel, linux-trace-kernel, Nam Cao,
	kernel test robot
In-Reply-To: <20230827-dispute-platform-83b494acf090@spud>

Conor Dooley <conor@kernel.org> writes:

>> Reported-by: kernel test robot <lkp@intel.com>
>> Closes: https://lore.kernel.org/oe-kbuild-all/202308271841.HlnnHFL7-lkp@intel.com/
>
> Delete these, LKP did not report the probes issue. The LKP bot says:
>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>> the same patch/commit), kindly add following tags

Ugh, sloppy. Thanks for clearing that up.

I'll wait and see if there's more comments. If not, Palmer, can you
remove these two tags?

Björn

^ permalink raw reply

* Re: [PATCH v2] riscv: Only consider swbp/ss handlers for correct privileged mode
From: Conor Dooley @ 2023-08-27 11:44 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv, Guo Ren,
	Björn Töpel, linux-kernel, linux-trace-kernel, Nam Cao,
	kernel test robot
In-Reply-To: <20230827114003.224958-1-bjorn@kernel.org>

[-- Attachment #1: Type: text/plain, Size: 1631 bytes --]

On Sun, Aug 27, 2023 at 01:40:03PM +0200, Björn Töpel wrote:
> From: Björn Töpel <bjorn@rivosinc.com>
> 
> RISC-V software breakpoint trap handlers are used for {k,u}probes.
> 
> When trapping from kernelmode, only the kernelmode handlers should be
> considered. Vice versa, only usermode handlers for usermode
> traps. This is not the case on RISC-V, which can trigger a bug if a
> userspace process uses uprobes, and a WARN() is triggered from
> kernelmode (which is implemented via {c.,}ebreak).
> 
> The kernel will trap on the kernelmode {c.,}ebreak, look for uprobes
> handlers, realize incorrectly that uprobes need to be handled, and
> exit the trap handler early. The trap returns to re-executing the
> {c.,}ebreak, and enter an infinite trap-loop.
> 
> The issue was found running the BPF selftest [1].
> 
> Fix this issue by only considering the swbp/ss handlers for
> kernel/usermode respectively. Also, move CONFIG ifdeffery from traps.c
> to the asm/{k,u}probes.h headers.
> 
> Note that linux/uprobes.h only include asm/uprobes.h if CONFIG_UPROBES
> is defined, which is why asm/uprobes.h needs to be unconditionally
> included in traps.c
> 
> Link: https://lore.kernel.org/linux-riscv/87v8d19aun.fsf@all.your.base.are.belong.to.us/ # [1]

> Reported-by: kernel test robot <lkp@intel.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202308271841.HlnnHFL7-lkp@intel.com/

Delete these, LKP did not report the probes issue. The LKP bot says:
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags



[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply

* [PATCH v2] riscv: Only consider swbp/ss handlers for correct privileged mode
From: Björn Töpel @ 2023-08-27 11:40 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv, Guo Ren
  Cc: Björn Töpel, linux-kernel, linux-trace-kernel, Nam Cao,
	kernel test robot

From: Björn Töpel <bjorn@rivosinc.com>

RISC-V software breakpoint trap handlers are used for {k,u}probes.

When trapping from kernelmode, only the kernelmode handlers should be
considered. Vice versa, only usermode handlers for usermode
traps. This is not the case on RISC-V, which can trigger a bug if a
userspace process uses uprobes, and a WARN() is triggered from
kernelmode (which is implemented via {c.,}ebreak).

The kernel will trap on the kernelmode {c.,}ebreak, look for uprobes
handlers, realize incorrectly that uprobes need to be handled, and
exit the trap handler early. The trap returns to re-executing the
{c.,}ebreak, and enter an infinite trap-loop.

The issue was found running the BPF selftest [1].

Fix this issue by only considering the swbp/ss handlers for
kernel/usermode respectively. Also, move CONFIG ifdeffery from traps.c
to the asm/{k,u}probes.h headers.

Note that linux/uprobes.h only include asm/uprobes.h if CONFIG_UPROBES
is defined, which is why asm/uprobes.h needs to be unconditionally
included in traps.c

Link: https://lore.kernel.org/linux-riscv/87v8d19aun.fsf@all.your.base.are.belong.to.us/ # [1]
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202308271841.HlnnHFL7-lkp@intel.com/
Reviewed-by: Guo Ren <guoren@kernel.org>
Fixes: 74784081aac8 ("riscv: Add uprobes supported")
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
---
v1->v2: Fix Clang build warning (kernel test robot)
---
 arch/riscv/include/asm/kprobes.h | 11 ++++++++++-
 arch/riscv/include/asm/uprobes.h | 13 ++++++++++++-
 arch/riscv/kernel/traps.c        | 28 ++++++++++++++++++----------
 3 files changed, 40 insertions(+), 12 deletions(-)

diff --git a/arch/riscv/include/asm/kprobes.h b/arch/riscv/include/asm/kprobes.h
index e7882ccb0fd4..78ea44f76718 100644
--- a/arch/riscv/include/asm/kprobes.h
+++ b/arch/riscv/include/asm/kprobes.h
@@ -40,6 +40,15 @@ void arch_remove_kprobe(struct kprobe *p);
 int kprobe_fault_handler(struct pt_regs *regs, unsigned int trapnr);
 bool kprobe_breakpoint_handler(struct pt_regs *regs);
 bool kprobe_single_step_handler(struct pt_regs *regs);
-
+#else
+static inline bool kprobe_breakpoint_handler(struct pt_regs *regs)
+{
+	return false;
+}
+
+static inline bool kprobe_single_step_handler(struct pt_regs *regs)
+{
+	return false;
+}
 #endif /* CONFIG_KPROBES */
 #endif /* _ASM_RISCV_KPROBES_H */
diff --git a/arch/riscv/include/asm/uprobes.h b/arch/riscv/include/asm/uprobes.h
index f2183e00fdd2..3fc7deda9190 100644
--- a/arch/riscv/include/asm/uprobes.h
+++ b/arch/riscv/include/asm/uprobes.h
@@ -34,7 +34,18 @@ struct arch_uprobe {
 	bool simulate;
 };
 
+#ifdef CONFIG_UPROBES
 bool uprobe_breakpoint_handler(struct pt_regs *regs);
 bool uprobe_single_step_handler(struct pt_regs *regs);
-
+#else
+static inline bool uprobe_breakpoint_handler(struct pt_regs *regs)
+{
+	return false;
+}
+
+static inline bool uprobe_single_step_handler(struct pt_regs *regs)
+{
+	return false;
+}
+#endif /* CONFIG_UPROBES */
 #endif /* _ASM_RISCV_UPROBES_H */
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index f798c853bede..cd6f10c73a16 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -13,6 +13,8 @@
 #include <linux/kdebug.h>
 #include <linux/uaccess.h>
 #include <linux/kprobes.h>
+#include <linux/uprobes.h>
+#include <asm/uprobes.h>
 #include <linux/mm.h>
 #include <linux/module.h>
 #include <linux/irq.h>
@@ -246,22 +248,28 @@ static inline unsigned long get_break_insn_length(unsigned long pc)
 	return GET_INSN_LENGTH(insn);
 }
 
+static bool probe_single_step_handler(struct pt_regs *regs)
+{
+	bool user = user_mode(regs);
+
+	return user ? uprobe_single_step_handler(regs) : kprobe_single_step_handler(regs);
+}
+
+static bool probe_breakpoint_handler(struct pt_regs *regs)
+{
+	bool user = user_mode(regs);
+
+	return user ? uprobe_breakpoint_handler(regs) : kprobe_breakpoint_handler(regs);
+}
+
 void handle_break(struct pt_regs *regs)
 {
-#ifdef CONFIG_KPROBES
-	if (kprobe_single_step_handler(regs))
+	if (probe_single_step_handler(regs))
 		return;
 
-	if (kprobe_breakpoint_handler(regs))
-		return;
-#endif
-#ifdef CONFIG_UPROBES
-	if (uprobe_single_step_handler(regs))
+	if (probe_breakpoint_handler(regs))
 		return;
 
-	if (uprobe_breakpoint_handler(regs))
-		return;
-#endif
 	current->thread.bad_cause = regs->cause;
 
 	if (user_mode(regs))

base-commit: 7d2f353b2682dcfe5f9bc71e5b61d5b61770d98e
-- 
2.39.2


^ permalink raw reply related

* Re: [PATCH] riscv: Only consider swbp/ss handlers for correct privileged mode
From: Guo Ren @ 2023-08-27 10:55 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv,
	Björn Töpel, linux-kernel, linux-trace-kernel, Nam Cao
In-Reply-To: <20230827083949.204927-1-bjorn@kernel.org>

On Sun, Aug 27, 2023 at 4:39 AM Björn Töpel <bjorn@kernel.org> wrote:
>
> From: Björn Töpel <bjorn@rivosinc.com>
>
> RISC-V software breakpoint trap handlers are used for {k,u}probes.
>
> When trapping from kernelmode, only the kernelmode handlers should be
> considered. Vice versa, only usermode handlers for usermode
> traps. This is not the case on RISC-V, which can trigger a bug if a
> userspace process uses uprobes, and a WARN() is triggered from
> kernelmode (which is implemented via {c.,}ebreak).
>
> The kernel will trap on the kernelmode {c.,}ebreak, look for uprobes
> handlers, realize incorrectly that uprobes need to be handled, and
> exit the trap handler early. The trap returns to re-executing the
> {c.,}ebreak, and enter an infinite trap-loop.
Thx for the fixup.

Reviewed-by: Guo Ren <guoren@kernel.org>

>
> The issue was found running the BPF selftest [1].
>
> Fix this issue by only considering the swbp/ss handlers for
> kernel/usermode respectively. Also, move CONFIG ifdeffery from traps.c
> to the asm/{k,u}probes.h headers.
>
> Note that linux/uprobes.h only include asm/uprobes.h if CONFIG_UPROBES
> is defined, which is why asm/uprobes.h needs to be unconditionally
> included in traps.c
>
> Link: https://lore.kernel.org/linux-riscv/87v8d19aun.fsf@all.your.base.are.belong.to.us/ # [1]
> Fixes: 74784081aac8 ("riscv: Add uprobes supported")
> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
> ---
>  arch/riscv/include/asm/kprobes.h | 11 ++++++++++-
>  arch/riscv/include/asm/uprobes.h | 13 ++++++++++++-
>  arch/riscv/kernel/traps.c        | 28 ++++++++++++++++++----------
>  3 files changed, 40 insertions(+), 12 deletions(-)
>
> diff --git a/arch/riscv/include/asm/kprobes.h b/arch/riscv/include/asm/kprobes.h
> index e7882ccb0fd4..89fbd90f16a2 100644
> --- a/arch/riscv/include/asm/kprobes.h
> +++ b/arch/riscv/include/asm/kprobes.h
> @@ -40,6 +40,15 @@ void arch_remove_kprobe(struct kprobe *p);
>  int kprobe_fault_handler(struct pt_regs *regs, unsigned int trapnr);
>  bool kprobe_breakpoint_handler(struct pt_regs *regs);
>  bool kprobe_single_step_handler(struct pt_regs *regs);
> -
> +#else
> +static inline bool kprobe_breakpoint_handler(struct pt_regs *)
> +{
> +       return false;
> +}
> +
> +static inline bool kprobe_single_step_handler(struct pt_regs *)
> +{
> +       return false;
> +}
>  #endif /* CONFIG_KPROBES */
>  #endif /* _ASM_RISCV_KPROBES_H */
> diff --git a/arch/riscv/include/asm/uprobes.h b/arch/riscv/include/asm/uprobes.h
> index f2183e00fdd2..3fc7deda9190 100644
> --- a/arch/riscv/include/asm/uprobes.h
> +++ b/arch/riscv/include/asm/uprobes.h
> @@ -34,7 +34,18 @@ struct arch_uprobe {
>         bool simulate;
>  };
>
> +#ifdef CONFIG_UPROBES
>  bool uprobe_breakpoint_handler(struct pt_regs *regs);
>  bool uprobe_single_step_handler(struct pt_regs *regs);
> -
> +#else
> +static inline bool uprobe_breakpoint_handler(struct pt_regs *regs)
> +{
> +       return false;
> +}
> +
> +static inline bool uprobe_single_step_handler(struct pt_regs *regs)
> +{
> +       return false;
> +}
> +#endif /* CONFIG_UPROBES */
>  #endif /* _ASM_RISCV_UPROBES_H */
> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> index f798c853bede..cd6f10c73a16 100644
> --- a/arch/riscv/kernel/traps.c
> +++ b/arch/riscv/kernel/traps.c
> @@ -13,6 +13,8 @@
>  #include <linux/kdebug.h>
>  #include <linux/uaccess.h>
>  #include <linux/kprobes.h>
> +#include <linux/uprobes.h>
> +#include <asm/uprobes.h>
>  #include <linux/mm.h>
>  #include <linux/module.h>
>  #include <linux/irq.h>
> @@ -246,22 +248,28 @@ static inline unsigned long get_break_insn_length(unsigned long pc)
>         return GET_INSN_LENGTH(insn);
>  }
>
> +static bool probe_single_step_handler(struct pt_regs *regs)
> +{
> +       bool user = user_mode(regs);
> +
> +       return user ? uprobe_single_step_handler(regs) : kprobe_single_step_handler(regs);
> +}
> +
> +static bool probe_breakpoint_handler(struct pt_regs *regs)
> +{
> +       bool user = user_mode(regs);
> +
> +       return user ? uprobe_breakpoint_handler(regs) : kprobe_breakpoint_handler(regs);
> +}
> +
>  void handle_break(struct pt_regs *regs)
>  {
> -#ifdef CONFIG_KPROBES
> -       if (kprobe_single_step_handler(regs))
> +       if (probe_single_step_handler(regs))
>                 return;
>
> -       if (kprobe_breakpoint_handler(regs))
> -               return;
> -#endif
> -#ifdef CONFIG_UPROBES
> -       if (uprobe_single_step_handler(regs))
> +       if (probe_breakpoint_handler(regs))
>                 return;
>
> -       if (uprobe_breakpoint_handler(regs))
> -               return;
> -#endif
>         current->thread.bad_cause = regs->cause;
>
>         if (user_mode(regs))
>
> base-commit: 7d2f353b2682dcfe5f9bc71e5b61d5b61770d98e
> --
> 2.39.2
>


-- 
Best Regards
 Guo Ren

^ permalink raw reply

* Re: [PATCH] riscv: Only consider swbp/ss handlers for correct privileged mode
From: kernel test robot @ 2023-08-27 10:46 UTC (permalink / raw)
  To: Björn Töpel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	linux-riscv, Guo Ren
  Cc: llvm, oe-kbuild-all, Björn Töpel, linux-kernel,
	linux-trace-kernel, Nam Cao
In-Reply-To: <20230827083949.204927-1-bjorn@kernel.org>

Hi Björn,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 7d2f353b2682dcfe5f9bc71e5b61d5b61770d98e]

url:    https://github.com/intel-lab-lkp/linux/commits/Bj-rn-T-pel/riscv-Only-consider-swbp-ss-handlers-for-correct-privileged-mode/20230827-164313
base:   7d2f353b2682dcfe5f9bc71e5b61d5b61770d98e
patch link:    https://lore.kernel.org/r/20230827083949.204927-1-bjorn%40kernel.org
patch subject: [PATCH] riscv: Only consider swbp/ss handlers for correct privileged mode
config: riscv-randconfig-001-20230827 (https://download.01.org/0day-ci/archive/20230827/202308271841.HlnnHFL7-lkp@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
reproduce: (https://download.01.org/0day-ci/archive/20230827/202308271841.HlnnHFL7-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202308271841.HlnnHFL7-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from kernel/sched/core.c:49:
   In file included from include/linux/kprobes.h:32:
>> arch/riscv/include/asm/kprobes.h:44:62: warning: omitting the parameter name in a function definition is a C2x extension [-Wc2x-extensions]
      44 | static inline bool kprobe_breakpoint_handler(struct pt_regs *)
         |                                                              ^
   arch/riscv/include/asm/kprobes.h:49:63: warning: omitting the parameter name in a function definition is a C2x extension [-Wc2x-extensions]
      49 | static inline bool kprobe_single_step_handler(struct pt_regs *)
         |                                                               ^
   kernel/sched/core.c:3695:20: warning: unused function 'rq_has_pinned_tasks' [-Wunused-function]
    3695 | static inline bool rq_has_pinned_tasks(struct rq *rq)
         |                    ^
   kernel/sched/core.c:5821:20: warning: unused function 'sched_tick_start' [-Wunused-function]
    5821 | static inline void sched_tick_start(int cpu) { }
         |                    ^
   kernel/sched/core.c:5822:20: warning: unused function 'sched_tick_stop' [-Wunused-function]
    5822 | static inline void sched_tick_stop(int cpu) { }
         |                    ^
   kernel/sched/core.c:6522:20: warning: unused function 'sched_core_cpu_starting' [-Wunused-function]
    6522 | static inline void sched_core_cpu_starting(unsigned int cpu) {}
         |                    ^
   kernel/sched/core.c:6523:20: warning: unused function 'sched_core_cpu_deactivate' [-Wunused-function]
    6523 | static inline void sched_core_cpu_deactivate(unsigned int cpu) {}
         |                    ^
   kernel/sched/core.c:6524:20: warning: unused function 'sched_core_cpu_dying' [-Wunused-function]
    6524 | static inline void sched_core_cpu_dying(unsigned int cpu) {}
         |                    ^
   8 warnings generated.


vim +44 arch/riscv/include/asm/kprobes.h

    38	
    39	void arch_remove_kprobe(struct kprobe *p);
    40	int kprobe_fault_handler(struct pt_regs *regs, unsigned int trapnr);
    41	bool kprobe_breakpoint_handler(struct pt_regs *regs);
    42	bool kprobe_single_step_handler(struct pt_regs *regs);
    43	#else
  > 44	static inline bool kprobe_breakpoint_handler(struct pt_regs *)
    45	{
    46		return false;
    47	}
    48	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply

* [PATCH] riscv: Only consider swbp/ss handlers for correct privileged mode
From: Björn Töpel @ 2023-08-27  8:39 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv, Guo Ren
  Cc: Björn Töpel, linux-kernel, linux-trace-kernel, Nam Cao

From: Björn Töpel <bjorn@rivosinc.com>

RISC-V software breakpoint trap handlers are used for {k,u}probes.

When trapping from kernelmode, only the kernelmode handlers should be
considered. Vice versa, only usermode handlers for usermode
traps. This is not the case on RISC-V, which can trigger a bug if a
userspace process uses uprobes, and a WARN() is triggered from
kernelmode (which is implemented via {c.,}ebreak).

The kernel will trap on the kernelmode {c.,}ebreak, look for uprobes
handlers, realize incorrectly that uprobes need to be handled, and
exit the trap handler early. The trap returns to re-executing the
{c.,}ebreak, and enter an infinite trap-loop.

The issue was found running the BPF selftest [1].

Fix this issue by only considering the swbp/ss handlers for
kernel/usermode respectively. Also, move CONFIG ifdeffery from traps.c
to the asm/{k,u}probes.h headers.

Note that linux/uprobes.h only include asm/uprobes.h if CONFIG_UPROBES
is defined, which is why asm/uprobes.h needs to be unconditionally
included in traps.c

Link: https://lore.kernel.org/linux-riscv/87v8d19aun.fsf@all.your.base.are.belong.to.us/ # [1]
Fixes: 74784081aac8 ("riscv: Add uprobes supported")
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
---
 arch/riscv/include/asm/kprobes.h | 11 ++++++++++-
 arch/riscv/include/asm/uprobes.h | 13 ++++++++++++-
 arch/riscv/kernel/traps.c        | 28 ++++++++++++++++++----------
 3 files changed, 40 insertions(+), 12 deletions(-)

diff --git a/arch/riscv/include/asm/kprobes.h b/arch/riscv/include/asm/kprobes.h
index e7882ccb0fd4..89fbd90f16a2 100644
--- a/arch/riscv/include/asm/kprobes.h
+++ b/arch/riscv/include/asm/kprobes.h
@@ -40,6 +40,15 @@ void arch_remove_kprobe(struct kprobe *p);
 int kprobe_fault_handler(struct pt_regs *regs, unsigned int trapnr);
 bool kprobe_breakpoint_handler(struct pt_regs *regs);
 bool kprobe_single_step_handler(struct pt_regs *regs);
-
+#else
+static inline bool kprobe_breakpoint_handler(struct pt_regs *)
+{
+	return false;
+}
+
+static inline bool kprobe_single_step_handler(struct pt_regs *)
+{
+	return false;
+}
 #endif /* CONFIG_KPROBES */
 #endif /* _ASM_RISCV_KPROBES_H */
diff --git a/arch/riscv/include/asm/uprobes.h b/arch/riscv/include/asm/uprobes.h
index f2183e00fdd2..3fc7deda9190 100644
--- a/arch/riscv/include/asm/uprobes.h
+++ b/arch/riscv/include/asm/uprobes.h
@@ -34,7 +34,18 @@ struct arch_uprobe {
 	bool simulate;
 };
 
+#ifdef CONFIG_UPROBES
 bool uprobe_breakpoint_handler(struct pt_regs *regs);
 bool uprobe_single_step_handler(struct pt_regs *regs);
-
+#else
+static inline bool uprobe_breakpoint_handler(struct pt_regs *regs)
+{
+	return false;
+}
+
+static inline bool uprobe_single_step_handler(struct pt_regs *regs)
+{
+	return false;
+}
+#endif /* CONFIG_UPROBES */
 #endif /* _ASM_RISCV_UPROBES_H */
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index f798c853bede..cd6f10c73a16 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -13,6 +13,8 @@
 #include <linux/kdebug.h>
 #include <linux/uaccess.h>
 #include <linux/kprobes.h>
+#include <linux/uprobes.h>
+#include <asm/uprobes.h>
 #include <linux/mm.h>
 #include <linux/module.h>
 #include <linux/irq.h>
@@ -246,22 +248,28 @@ static inline unsigned long get_break_insn_length(unsigned long pc)
 	return GET_INSN_LENGTH(insn);
 }
 
+static bool probe_single_step_handler(struct pt_regs *regs)
+{
+	bool user = user_mode(regs);
+
+	return user ? uprobe_single_step_handler(regs) : kprobe_single_step_handler(regs);
+}
+
+static bool probe_breakpoint_handler(struct pt_regs *regs)
+{
+	bool user = user_mode(regs);
+
+	return user ? uprobe_breakpoint_handler(regs) : kprobe_breakpoint_handler(regs);
+}
+
 void handle_break(struct pt_regs *regs)
 {
-#ifdef CONFIG_KPROBES
-	if (kprobe_single_step_handler(regs))
+	if (probe_single_step_handler(regs))
 		return;
 
-	if (kprobe_breakpoint_handler(regs))
-		return;
-#endif
-#ifdef CONFIG_UPROBES
-	if (uprobe_single_step_handler(regs))
+	if (probe_breakpoint_handler(regs))
 		return;
 
-	if (uprobe_breakpoint_handler(regs))
-		return;
-#endif
 	current->thread.bad_cause = regs->cause;
 
 	if (user_mode(regs))

base-commit: 7d2f353b2682dcfe5f9bc71e5b61d5b61770d98e
-- 
2.39.2


^ permalink raw reply related

* [RFC PATCH] mm/damon/core: add a tracepoint for damos apply target regions
From: SeongJae Park @ 2023-08-27  0:40 UTC (permalink / raw)
  To: damon
  Cc: SeongJae Park, Andrew Morton, Steven Rostedt, linux-mm,
	linux-trace-kernel, linux-kernel

DAMON provides damon_aggregated tracepoint, which exposes details of
each region and its access monitoring results.  It is useful for
getting whole monitoring results, e.g., for recording purposes.

For investigations of DAMOS, DAMON Sysfs interface provides DAMOS
statistics and tried_regions directory.  But, those provides only
statistics and snapshots.  If the scheme is frequently applied and if
the user needs to know every detail of DAMOS behavior, the
snapshot-based interface could be insufficient and expensive.

As a last resort, userspace users need to record the all monitoring
results via damon_aggregated tracepoint and simulate how DAMOS would
worked.  It is unnecessarily complicated.  DAMON kernel API users,
meanwhile, can do that easily via before_damos_apply() callback field of
'struct damon_callback', though.

Add a tracepoint that will be called just after before_damos_apply()
callback for more convenient investigations of DAMOS.  The tracepoint
exposes all details about each regions, similar to damon_aggregated
tracepoint.

Please note that DAMOS is currently not only for memory management but
also for query-like efficient monitoring results retrievals (when 'stat'
action is used).  Until now, only statistics or snapshots were
supported.  Addition of this tracepoint allows efficient full recording
of DAMOS-based filtered monitoring results.

Signed-off-by: SeongJae Park <sj@kernel.org>
---
 include/trace/events/damon.h | 37 ++++++++++++++++++++++++++++++++++++
 mm/damon/core.c              | 27 +++++++++++++++++++++++++-
 2 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/damon.h b/include/trace/events/damon.h
index 0b8d13bde17a..c942c5033b5f 100644
--- a/include/trace/events/damon.h
+++ b/include/trace/events/damon.h
@@ -9,6 +9,43 @@
 #include <linux/types.h>
 #include <linux/tracepoint.h>
 
+TRACE_EVENT(damos_before_apply,
+
+	TP_PROTO(unsigned int context_idx, unsigned int scheme_idx,
+		unsigned int target_idx, struct damon_region *r,
+		unsigned int nr_regions),
+
+	TP_ARGS(context_idx, target_idx, scheme_idx, r, nr_regions),
+
+	TP_STRUCT__entry(
+		__field(unsigned int, context_idx)
+		__field(unsigned int, scheme_idx)
+		__field(unsigned long, target_idx)
+		__field(unsigned int, nr_regions)
+		__field(unsigned long, start)
+		__field(unsigned long, end)
+		__field(unsigned int, nr_accesses)
+		__field(unsigned int, age)
+	),
+
+	TP_fast_assign(
+		__entry->context_idx = context_idx;
+		__entry->scheme_idx = scheme_idx;
+		__entry->target_idx = target_idx;
+		__entry->nr_regions = nr_regions;
+		__entry->start = r->ar.start;
+		__entry->end = r->ar.end;
+		__entry->nr_accesses = r->nr_accesses;
+		__entry->age = r->age;
+	),
+
+	TP_printk("ctx_idx=%u scheme_idx=%u target_idx=%lu nr_regions=%u %lu-%lu: %u %u",
+			__entry->context_idx, __entry->scheme_idx,
+			__entry->target_idx, __entry->nr_regions,
+			__entry->start, __entry->end,
+			__entry->nr_accesses, __entry->age)
+);
+
 TRACE_EVENT(damon_aggregated,
 
 	TP_PROTO(unsigned int target_id, struct damon_region *r,
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 83af336bb0e6..22fe81abd35d 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -963,6 +963,28 @@ static void damos_apply_scheme(struct damon_ctx *c, struct damon_target *t,
 	struct timespec64 begin, end;
 	unsigned long sz_applied = 0;
 	int err = 0;
+	/*
+	 * We plan to support multiple context per kdamond, as DAMON sysfs
+	 * implies with 'nr_contexts' file.  Nevertheless, only single context
+	 * per kdamond is supported for now.  So, we can simply use '0' context
+	 * index here.
+	 */
+	unsigned int cidx = 0;
+	struct damos *siter;		/* schemes iterator */
+	unsigned int sidx = 0;
+	struct damon_target *titer;	/* targets iterator */
+	unsigned int tidx = 0;
+
+	damon_for_each_scheme(siter, c) {
+		if (siter == s)
+			break;
+		sidx++;
+	}
+	damon_for_each_target(titer, c) {
+		if (titer == t)
+			break;
+		tidx++;
+	}
 
 	if (c->ops.apply_scheme) {
 		if (quota->esz && quota->charged_sz + sz > quota->esz) {
@@ -986,8 +1008,11 @@ static void damos_apply_scheme(struct damon_ctx *c, struct damon_target *t,
 		ktime_get_coarse_ts64(&begin);
 		if (c->callback.before_damos_apply)
 			err = c->callback.before_damos_apply(c, t, r, s);
-		if (!err)
+		if (!err) {
+			trace_damos_before_apply(cidx, sidx, tidx, r,
+					damon_nr_regions(t));
 			sz_applied = c->ops.apply_scheme(c, t, r, s);
+		}
 		ktime_get_coarse_ts64(&end);
 		quota->total_charged_ns += timespec64_to_ns(&end) -
 			timespec64_to_ns(&begin);
-- 
2.25.1


^ permalink raw reply related

* Re: [PATCH v4 6/9] tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
From: Masami Hiramatsu @ 2023-08-26  3:38 UTC (permalink / raw)
  To: Florent Revest
  Cc: Alexei Starovoitov, Steven Rostedt, linux-trace-kernel, LKML,
	Martin KaFai Lau, bpf, Sven Schnelle, Alexei Starovoitov,
	Jiri Olsa, Arnaldo Carvalho de Melo, Daniel Borkmann,
	Alan Maguire, Mark Rutland, Peter Zijlstra, Thomas Gleixner,
	Peter Zijlstra
In-Reply-To: <CABRcYmLcTBey7QY9Ln3aVvJPV7weeTR0FA6DOU3_QObuAM8_Zg@mail.gmail.com>

(Cc: Peter)

On Fri, 25 Aug 2023 18:12:07 +0200
Florent Revest <revest@chromium.org> wrote:

> On Wed, Aug 23, 2023 at 5:16 PM Masami Hiramatsu (Google)
> <mhiramat@kernel.org> wrote:
> >
> > diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
> > index c60d0d9f1a95..90ad28260a9f 100644
> > --- a/kernel/trace/trace_fprobe.c
> > +++ b/kernel/trace/trace_fprobe.c
> > +#else /* CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS && !CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
> > +
> > +/* Since fprobe handlers can be nested, pt_regs buffer need to be a stack */
> > +#define PERF_FPROBE_REGS_MAX   4
> > +
> > +struct pt_regs_stack {
> > +       struct pt_regs regs[PERF_FPROBE_REGS_MAX];
> > +       int idx;
> > +};
> > +
> > +static DEFINE_PER_CPU(struct pt_regs_stack, perf_fprobe_regs);
> > +
> > +static __always_inline
> > +struct pt_regs *perf_fprobe_partial_regs(struct ftrace_regs *fregs)
> > +{
> > +       struct pt_regs_stack *stack = this_cpu_ptr(&perf_fprobe_regs);
> > +       struct pt_regs *regs;
> > +
> > +       if (stack->idx < PERF_FPROBE_REGS_MAX) {
> > +               regs = stack->regs[stack->idx++];
> 
> This is missing an &:
> regs = &stack->regs[stack->idx++];

Oops, good point. I'm curious it didin't cause compile error...
(I thought I built it on arm64)

> 
> > +               return ftrace_partial_regs(fregs, regs);
> 
> I think this is incorrect on arm64 and will likely cause very subtle
> failure modes down the line on other architectures too. The problem on
> arm64 is that Perf calls "user_mode(regs)" somewhere down the line,
> that macro tries to read the "pstate" register, which is not populated
> in ftrace_regs, so it's not copied into a "partial" pt_regs either and
> Perf can take wrong decisions based on that.

I think we can assure the ftrace_regs is always !user_mode() so in that case
ftrace_partial_regs() should fill the 'pstate' register as kernel mode.

> 
> I already mentioned this problem in the past:
> - in the third answer block of:
> https://lore.kernel.org/all/CABRcYmJjtVq-330ktqTAUiNO1=yG_aHd0xz=c550O5C7QP++UA@mail.gmail.com/
> - in the fourth answer block of:
> https://lore.kernel.org/all/CABRcYm+esb8J2O1v6=C+h+HSa5NxraPUgo63w7-iZj0CXbpusg@mail.gmail.com/
> 

Oops, sorry I missed that. And I basically agreed that we need a special
care for perf. Let me reply it.

> It is quite possible that other architectures at some point introduce
> a light ftrace "args" trampoline that misses one of the registers
> expected by Perf because they don't realize that this trampoline calls
> fprobe which calls Perf which has specific registers expectations.

Agreed.

> 
> We got the green light from Alexei to use ftrace_partial_regs for "BPF
> mutli_kprobe" because these BPF programs can gracefully deal with
> sparse pt_regs but I think a similar conversation needs to happen with
> the Perf folks.

Indeed. Who is the best person to involve, Peterz? (but I think
we need arm64 PMU part maintainer to talk)

> 
> ----
> 
> On a side-note, a subtle difference between ftrace_partial_regs with
> and without HAVE_PT_REGS_TO_FTRACE_REGS_CAST is that one does a copy
> and the other does not. If a subsystem receives a partial regs under
> HAVE_PT_REGS_TO_FTRACE_REGS_CAST, it can modify register fields and
> the modified values will be restored by the ftrace trampoline. Without
> HAVE_PT_REGS_TO_FTRACE_REGS_CAST, only the copy will be modified and
> ftrace won't restore them. I think the least we can do is to document
> thoroughly the guarantees of the ftrace_partial_regs API: users
> shouldn't rely on modifying the resulting regs because depending on
> the architecture this could do different things. People shouldn't rely
> on any register that isn't covered by one of the ftrace_regs_get_*
> helpers because it can be unpopulated on some architectures. I believe
> this is the case for BPF multi_kprobe but not for Perf.

I agree with the documentation requirement, but since the fprobe official
interface becomes ftrace_regs, user naturally expects it is not pt_regs.
The problem is that the perf's case. Since the perf is natively only
support pt_regs (and there is no reason to support ftrace_regs, yes).
Hmm, I will recheck how the perf events on trace-event is implementd.

Thank you,

> 
> > +       }
> > +       return NULL;
> > +}
> > +
> > +static __always_inline void perf_fprobe_return_regs(struct pt_regs *regs)
> > +{
> > +       struct pt_regs_stack *stack = this_cpu_ptr(&perf_fprobe_regs);
> > +
> > +       if (WARN_ON_ONCE(regs != stack->regs[stack->idx]))
> 
> This is missing an & too:
> if (WARN_ON_ONCE(regs != &stack->regs[stack->idx]))
> 
> 
> 
> 
> > +               return;
> > +
> > +       --stack->idx;
> > +}
> > +
> > +#endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply

* Re: [PATCH v4 5/9] ftrace: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
From: Masami Hiramatsu @ 2023-08-26  1:56 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Steven Rostedt, Florent Revest,
	linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner
In-Reply-To: <CAEf4Bzb9CBnQp1_bEW-DOhw9rpDj6jt79DMmsKL2L4a_4ts9gQ@mail.gmail.com>

On Fri, 25 Aug 2023 14:49:48 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Wed, Aug 23, 2023 at 8:16 AM Masami Hiramatsu (Google)
> <mhiramat@kernel.org> wrote:
> >
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> >
> > Add ftrace_partial_regs() which converts the ftrace_regs to pt_regs.
> > If the architecture defines its own ftrace_regs, this copies partial
> > registers to pt_regs and returns it. If not, ftrace_regs is the same as
> > pt_regs and ftrace_partial_regs() will return ftrace_regs::regs.
> >
> > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > Acked-by: Florent Revest <revest@chromium.org>
> > ---
> >  Changes in v3:
> >   - Fix to use pt_regs::regs instead of x.
> >   - Return ftrace_regs::regs forcibly if HAVE_PT_REGS_COMPAT_FTRACE_REGS=y.
> >   - Fix typo.
> >   - Fix to copy correct registers to the pt_regs on arm64.
> >  Changes in v4:
> >   - Change the patch order in the series so that fprobe event can use this.
> > ---
> >  arch/arm64/include/asm/ftrace.h |   11 +++++++++++
> >  include/linux/ftrace.h          |   17 +++++++++++++++++
> >  2 files changed, 28 insertions(+)
> >
> > diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> > index ab158196480c..5ad24f315d52 100644
> > --- a/arch/arm64/include/asm/ftrace.h
> > +++ b/arch/arm64/include/asm/ftrace.h
> > @@ -137,6 +137,17 @@ ftrace_override_function_with_return(struct ftrace_regs *fregs)
> >         fregs->pc = fregs->lr;
> >  }
> >
> > +static __always_inline struct pt_regs *
> > +ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs)
> > +{
> > +       memcpy(regs->regs, fregs->regs, sizeof(u64) * 9);
> > +       regs->sp = fregs->sp;
> > +       regs->pc = fregs->pc;
> > +       regs->regs[29] = fregs->fp;
> > +       regs->regs[30] = fregs->lr;
> 
> I see that orig_x0 from pt_regs is used on arm64 to get syscall's
> first parameter. And it seems like this ftrace_regs to pt_regs
> conversion doesn't touch orig_x0 at all. Is it maintained/preserved
> somewhere else, or will we lose the ability to get the first syscall's
> argument?

Thanks for checking it!

Does BPF uses kprobe probe to trace syscalls? Since we have raw_syscall
trace events, no need to use kprobe to do that. (and I don't recommend to
use kprobe to do such fixed event)

> 
> Looking at libbpf's bpf_tracing.h, other than orig_x0, I think all the
> other registers are still preserved, so this seems to be the only
> potential problem.

Great!

Thank you,

> 
> 
> > +       return regs;
> > +}
> > +
> >  int ftrace_regs_query_register_offset(const char *name);
> >
> >  int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
> > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> > index c0a42d0860b8..a6ed2aa71efc 100644
> > --- a/include/linux/ftrace.h
> > +++ b/include/linux/ftrace.h
> > @@ -165,6 +165,23 @@ static __always_inline struct pt_regs *ftrace_get_regs(struct ftrace_regs *fregs
> >         return arch_ftrace_get_regs(fregs);
> >  }
> >
> > +#if !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS) || \
> > +       defined(CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST)
> > +
> > +static __always_inline struct pt_regs *
> > +ftrace_partial_regs(struct ftrace_regs *fregs, struct pt_regs *regs)
> > +{
> > +       /*
> > +        * If CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST=y, ftrace_regs memory
> > +        * layout is the same as pt_regs. So always returns that address.
> > +        * Since arch_ftrace_get_regs() will check some members and may return
> > +        * NULL, we can not use it.
> > +        */
> > +       return &fregs->regs;
> > +}
> > +
> > +#endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
> > +
> >  /*
> >   * When true, the ftrace_regs_{get,set}_*() functions may be used on fregs.
> >   * Note: this can be true even when ftrace_get_regs() cannot provide a pt_regs.
> >
> >


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply

* Re: (subset) [PATCH 00/17] -Wmissing-prototype warning fixes
From: Michael Schmitz @ 2023-08-25 22:44 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Martin K. Petersen, Andrew Morton, linux-kernel, Arnd Bergmann,
	Arnd Bergmann, Matt Turner, Vineet Gupta, Russell King,
	Catalin Marinas, Will Deacon, Guo Ren, Brian Cain, Huacai Chen,
	WANG Xuerui, Michal Simek, Thomas Bogendoerfer, Dinh Nguyen,
	Jonas Bonn, Stefan Kristiansson, Stafford Horne,
	James E.J. Bottomley, Helge Deller, Michael Ellerman,
	Christophe Leroy, Palmer Dabbelt, Heiko Carstens,
	John Paul Adrian Glaubitz, x86, Borislav Petkov, Max Filippov,
	Jens Axboe, Sudip Mukherjee, Richard Weinberger, Bjorn Helgaas,
	Masahiro Yamada, Nathan Chancellor, Nick Desaulniers,
	Guenter Roeck, Stephen Rothwell, linux-next, linux-alpha,
	linux-snps-arc, linux-arm-kernel, linux-csky, linux-hexagon,
	linux-ia64, loongarch, linux-m68k, linux-mips, linux-openrisc,
	linux-parisc, linuxppc-dev, linux-riscv, linux-s390, linux-sh,
	sparclinux, linux-block, linux-scsi, linux-mtd,
	linux-trace-kernel, linux-pci, linux-kbuild
In-Reply-To: <CAMuHMdWC2S330_Vb_NTHTDC=BakBsw4ouP-eFJv0erV1-jmvTQ@mail.gmail.com>

Hi Geert,

Am 25.08.23 um 19:39 schrieb Geert Uytterhoeven:
> Hi Michael,
>
> On Fri, Aug 25, 2023 at 3:31 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
>> On 25/08/23 13:12, Martin K. Petersen wrote:
>>> [11/17] scsi: gvp11: remove unused gvp11_setup() function
>>>          https://git.kernel.org/mkp/scsi/c/bfaa4a0ce1bb
>> I somehow missed that one ...
>>
>> The gvp11_setup() function was probably a relic from the times before
>> module parameters.
>>
>> Since gvp11_xfer_mask appears to be required for some Amiga systems to
>> set the DMA mask, I'd best send a patch to add such a module parameter ...
>>
>> Do you know any details around the use of DMA masks for Amiga WD33C93
>> drivers, Geert?
> Doh, it's been a while, and I never had an affected system.
> Probably it's needed on A2000 with an accelerator card and GVP II SCSI,
> to prevent DMA to RAM banks that do not support fast DMA cycles.

Thanks, that's good enough for me.

Linux 2.0 had this comment:

|/* * DMA transfer mask for GVP Series II SCSI controller. * Some
versions can only DMA into the 24 bit address space * (0->16M). Others
can DMA into the full 32 bit address * space. The default is to only
allow DMA into the 24 bit * address space. The "gvp11=0xFFFFFFFE" setup
parameter can * be supplied to force an alternate (32 bit) mask. */ |

|We now handle that (since 2.6.35) through masks defined in
gvp11_zorro_tbl[] (though I note these don't account for unaligned
addresses such as implied by the example in the comment. Are unaligned
DMA buffers still possible today?). Would that cover the 'A2000 with
accelerator' case?
|

||

I'm happy to send a patch if an override to the device default DMA mask
is still necessary.

(Incidentally - did you ever publish the m68k full history tree anywhere
in git?)

Cheers,

    Michael


>
> Gr{oetje,eeting}s,
>
>                         Geert
>


^ permalink raw reply

* Re: [PATCH v4 5/9] ftrace: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
From: Andrii Nakryiko @ 2023-08-25 21:49 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Alexei Starovoitov, Steven Rostedt, Florent Revest,
	linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner
In-Reply-To: <169280378611.282662.4078983611827223131.stgit@devnote2>

On Wed, Aug 23, 2023 at 8:16 AM Masami Hiramatsu (Google)
<mhiramat@kernel.org> wrote:
>
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> Add ftrace_partial_regs() which converts the ftrace_regs to pt_regs.
> If the architecture defines its own ftrace_regs, this copies partial
> registers to pt_regs and returns it. If not, ftrace_regs is the same as
> pt_regs and ftrace_partial_regs() will return ftrace_regs::regs.
>
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> Acked-by: Florent Revest <revest@chromium.org>
> ---
>  Changes in v3:
>   - Fix to use pt_regs::regs instead of x.
>   - Return ftrace_regs::regs forcibly if HAVE_PT_REGS_COMPAT_FTRACE_REGS=y.
>   - Fix typo.
>   - Fix to copy correct registers to the pt_regs on arm64.
>  Changes in v4:
>   - Change the patch order in the series so that fprobe event can use this.
> ---
>  arch/arm64/include/asm/ftrace.h |   11 +++++++++++
>  include/linux/ftrace.h          |   17 +++++++++++++++++
>  2 files changed, 28 insertions(+)
>
> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> index ab158196480c..5ad24f315d52 100644
> --- a/arch/arm64/include/asm/ftrace.h
> +++ b/arch/arm64/include/asm/ftrace.h
> @@ -137,6 +137,17 @@ ftrace_override_function_with_return(struct ftrace_regs *fregs)
>         fregs->pc = fregs->lr;
>  }
>
> +static __always_inline struct pt_regs *
> +ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs)
> +{
> +       memcpy(regs->regs, fregs->regs, sizeof(u64) * 9);
> +       regs->sp = fregs->sp;
> +       regs->pc = fregs->pc;
> +       regs->regs[29] = fregs->fp;
> +       regs->regs[30] = fregs->lr;

I see that orig_x0 from pt_regs is used on arm64 to get syscall's
first parameter. And it seems like this ftrace_regs to pt_regs
conversion doesn't touch orig_x0 at all. Is it maintained/preserved
somewhere else, or will we lose the ability to get the first syscall's
argument?

Looking at libbpf's bpf_tracing.h, other than orig_x0, I think all the
other registers are still preserved, so this seems to be the only
potential problem.


> +       return regs;
> +}
> +
>  int ftrace_regs_query_register_offset(const char *name);
>
>  int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index c0a42d0860b8..a6ed2aa71efc 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -165,6 +165,23 @@ static __always_inline struct pt_regs *ftrace_get_regs(struct ftrace_regs *fregs
>         return arch_ftrace_get_regs(fregs);
>  }
>
> +#if !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS) || \
> +       defined(CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST)
> +
> +static __always_inline struct pt_regs *
> +ftrace_partial_regs(struct ftrace_regs *fregs, struct pt_regs *regs)
> +{
> +       /*
> +        * If CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST=y, ftrace_regs memory
> +        * layout is the same as pt_regs. So always returns that address.
> +        * Since arch_ftrace_get_regs() will check some members and may return
> +        * NULL, we can not use it.
> +        */
> +       return &fregs->regs;
> +}
> +
> +#endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
> +
>  /*
>   * When true, the ftrace_regs_{get,set}_*() functions may be used on fregs.
>   * Note: this can be true even when ftrace_get_regs() cannot provide a pt_regs.
>
>

^ permalink raw reply

* Re: [PATCH v4 9/9] Documentation: tracing: Add a note about argument and retval access
From: Florent Revest @ 2023-08-25 16:12 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Alexei Starovoitov, Steven Rostedt, linux-trace-kernel, LKML,
	Martin KaFai Lau, bpf, Sven Schnelle, Alexei Starovoitov,
	Jiri Olsa, Arnaldo Carvalho de Melo, Daniel Borkmann,
	Alan Maguire, Mark Rutland, Peter Zijlstra, Thomas Gleixner
In-Reply-To: <169280382895.282662.14910495061790007288.stgit@devnote2>

On Wed, Aug 23, 2023 at 5:17 PM Masami Hiramatsu (Google)
<mhiramat@kernel.org> wrote:
> diff --git a/Documentation/trace/fprobetrace.rst b/Documentation/trace/fprobetrace.rst
> index 8e9bebcf0a2e..e35e6b18df40 100644
> --- a/Documentation/trace/fprobetrace.rst
> +++ b/Documentation/trace/fprobetrace.rst
> @@ -59,8 +59,12 @@ Synopsis of fprobe-events
>                    and bitfield are supported.
>
>    (\*1) This is available only when BTF is enabled.
> -  (\*2) only for the probe on function entry (offs == 0).
> -  (\*3) only for return probe.
> +  (\*2) only for the probe on function entry (offs == 0). Note, this argument access
> +        is best effort, because depending on the argument type, it may be passed on
> +        the stack. But this only support the arguments via registers.

supports*

Otherwise:
Acked-by: Florent Revest <revest@chromium.org>

^ permalink raw reply

* Re: [PATCH v4 6/9] tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
From: Florent Revest @ 2023-08-25 16:12 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Alexei Starovoitov, Steven Rostedt, linux-trace-kernel, LKML,
	Martin KaFai Lau, bpf, Sven Schnelle, Alexei Starovoitov,
	Jiri Olsa, Arnaldo Carvalho de Melo, Daniel Borkmann,
	Alan Maguire, Mark Rutland, Peter Zijlstra, Thomas Gleixner
In-Reply-To: <169280379741.282662.12221517584561036597.stgit@devnote2>

On Wed, Aug 23, 2023 at 5:16 PM Masami Hiramatsu (Google)
<mhiramat@kernel.org> wrote:
>
> diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
> index c60d0d9f1a95..90ad28260a9f 100644
> --- a/kernel/trace/trace_fprobe.c
> +++ b/kernel/trace/trace_fprobe.c
> +#else /* CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS && !CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
> +
> +/* Since fprobe handlers can be nested, pt_regs buffer need to be a stack */
> +#define PERF_FPROBE_REGS_MAX   4
> +
> +struct pt_regs_stack {
> +       struct pt_regs regs[PERF_FPROBE_REGS_MAX];
> +       int idx;
> +};
> +
> +static DEFINE_PER_CPU(struct pt_regs_stack, perf_fprobe_regs);
> +
> +static __always_inline
> +struct pt_regs *perf_fprobe_partial_regs(struct ftrace_regs *fregs)
> +{
> +       struct pt_regs_stack *stack = this_cpu_ptr(&perf_fprobe_regs);
> +       struct pt_regs *regs;
> +
> +       if (stack->idx < PERF_FPROBE_REGS_MAX) {
> +               regs = stack->regs[stack->idx++];

This is missing an &:
regs = &stack->regs[stack->idx++];

> +               return ftrace_partial_regs(fregs, regs);

I think this is incorrect on arm64 and will likely cause very subtle
failure modes down the line on other architectures too. The problem on
arm64 is that Perf calls "user_mode(regs)" somewhere down the line,
that macro tries to read the "pstate" register, which is not populated
in ftrace_regs, so it's not copied into a "partial" pt_regs either and
Perf can take wrong decisions based on that.

I already mentioned this problem in the past:
- in the third answer block of:
https://lore.kernel.org/all/CABRcYmJjtVq-330ktqTAUiNO1=yG_aHd0xz=c550O5C7QP++UA@mail.gmail.com/
- in the fourth answer block of:
https://lore.kernel.org/all/CABRcYm+esb8J2O1v6=C+h+HSa5NxraPUgo63w7-iZj0CXbpusg@mail.gmail.com/

It is quite possible that other architectures at some point introduce
a light ftrace "args" trampoline that misses one of the registers
expected by Perf because they don't realize that this trampoline calls
fprobe which calls Perf which has specific registers expectations.

We got the green light from Alexei to use ftrace_partial_regs for "BPF
mutli_kprobe" because these BPF programs can gracefully deal with
sparse pt_regs but I think a similar conversation needs to happen with
the Perf folks.

----

On a side-note, a subtle difference between ftrace_partial_regs with
and without HAVE_PT_REGS_TO_FTRACE_REGS_CAST is that one does a copy
and the other does not. If a subsystem receives a partial regs under
HAVE_PT_REGS_TO_FTRACE_REGS_CAST, it can modify register fields and
the modified values will be restored by the ftrace trampoline. Without
HAVE_PT_REGS_TO_FTRACE_REGS_CAST, only the copy will be modified and
ftrace won't restore them. I think the least we can do is to document
thoroughly the guarantees of the ftrace_partial_regs API: users
shouldn't rely on modifying the resulting regs because depending on
the architecture this could do different things. People shouldn't rely
on any register that isn't covered by one of the ftrace_regs_get_*
helpers because it can be unpopulated on some architectures. I believe
this is the case for BPF multi_kprobe but not for Perf.

> +       }
> +       return NULL;
> +}
> +
> +static __always_inline void perf_fprobe_return_regs(struct pt_regs *regs)
> +{
> +       struct pt_regs_stack *stack = this_cpu_ptr(&perf_fprobe_regs);
> +
> +       if (WARN_ON_ONCE(regs != stack->regs[stack->idx]))

This is missing an & too:
if (WARN_ON_ONCE(regs != &stack->regs[stack->idx]))




> +               return;
> +
> +       --stack->idx;
> +}
> +
> +#endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */

^ permalink raw reply

* Re: [PATCH v4 2/9] fprobe: Use fprobe_regs in fprobe entry handler
From: Florent Revest @ 2023-08-25 16:11 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Alexei Starovoitov, Steven Rostedt, linux-trace-kernel, LKML,
	Martin KaFai Lau, bpf, Sven Schnelle, Alexei Starovoitov,
	Jiri Olsa, Arnaldo Carvalho de Melo, Daniel Borkmann,
	Alan Maguire, Mark Rutland, Peter Zijlstra, Thomas Gleixner
In-Reply-To: <169280375109.282662.4109179404470188137.stgit@devnote2>

On Wed, Aug 23, 2023 at 5:15 PM Masami Hiramatsu (Google)
<mhiramat@kernel.org> wrote:
>
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> This allows fprobes to be available with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
> instead of CONFIG_DYNAMIC_FTRACE_WITH_REGS, then we can enable fprobe
> on arm64.
>
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Acked-by: Florent Revest <revest@chromium.org>

^ permalink raw reply

* Re: [PATCH v4 1/9] Documentation: probes: Add a new ret_ip callback parameter
From: Florent Revest @ 2023-08-25 16:11 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Alexei Starovoitov, Steven Rostedt, linux-trace-kernel, LKML,
	Martin KaFai Lau, bpf, Sven Schnelle, Alexei Starovoitov,
	Jiri Olsa, Arnaldo Carvalho de Melo, Daniel Borkmann,
	Alan Maguire, Mark Rutland, Peter Zijlstra, Thomas Gleixner
In-Reply-To: <169280373992.282662.14835192462715188987.stgit@devnote2>

On Wed, Aug 23, 2023 at 5:15 PM Masami Hiramatsu (Google)
<mhiramat@kernel.org> wrote:
>
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> Add a new ret_ip callback parameter description.
>
> Fixes: cb16330d1274 ("fprobe: Pass return address to the handlers")
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Acked-by: Florent Revest <revest@chromium.org>

^ permalink raw reply

* Re: [PATCH v4 0/9] bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs
From: Florent Revest @ 2023-08-25 16:11 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Alexei Starovoitov, Steven Rostedt, linux-trace-kernel, LKML,
	Martin KaFai Lau, bpf, Sven Schnelle, Alexei Starovoitov,
	Jiri Olsa, Arnaldo Carvalho de Melo, Daniel Borkmann,
	Alan Maguire, Mark Rutland, Peter Zijlstra, Thomas Gleixner
In-Reply-To: <169280372795.282662.9784422934484459769.stgit@devnote2>

On Wed, Aug 23, 2023 at 5:15 PM Masami Hiramatsu (Google)
<mhiramat@kernel.org> wrote:
>
> Hi,
>
> Here is the 4th version of the series to use ftrace_regs instead of pt_regs
> in fprobe.
> The previous version is here;
>
> https://lore.kernel.org/all/169181859570.505132.10136520092011157898.stgit@devnote2/
>
> This version fixes the issues pointed in the previous series; fix document
> description, keep CONFIG_FPROBE dependency for multi-kprobe, add
> static_assert check for ftrace_regs size, reorder the ftrace_partial_regs()
> patch for perf fprobe event support, introduce per-cpu pt_regs stack for
> perf fprobe event and add Florent's Ack (Thanks!). Also this adds a new
> documentation patch to clarify that the $argN and $retval is best effort.
>
>  - Document fix for the current fprobe callback prototype
>  - Simply replace pt_regs in fprobe_entry_handler with ftrace_regs.
>  - Expose ftrace_regs even if CONFIG_FUNCTION_TRACER=n.
>  - Introduce ftrace_partial_regs(). (This changes ARM64 which needs a custom
>    implementation)
>  - Replace pt_regs in rethook and fprobe_exit_handler with ftrace_regs. This
>    introduce a new HAVE_PT_REGS_TO_FTRACE_REGS_CAST which means ftrace_regs is
>    just a wrapper of pt_regs (except for arm64, other architectures do this)
>  - Update fprobe-events to use ftrace_regs natively.
>  - Update bpf multi-kprobe handler use ftrace_partial_regs().
>  - Update document for new fprobe callbacks.
>  - Add notes for the $argN and $retval.
>
> This series can be applied against the probes/core branch on linux-trace tree.
>
> This series can also be found below branch.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/log/?h=topic/fprobe-ftrace-regs
>
> Thank you,

FWIW, I verified that I could implement BPF multi_kprobe on arm64 on
top of this series so the BPF multi_kprobe usecase is tested but I
haven't tested the "trace_fprobe" or perf use cases.

^ permalink raw reply

* Re: [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook
From: Florent Revest @ 2023-08-25 16:12 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Alexei Starovoitov, Steven Rostedt, linux-trace-kernel, LKML,
	Martin KaFai Lau, bpf, Sven Schnelle, Alexei Starovoitov,
	Jiri Olsa, Arnaldo Carvalho de Melo, Daniel Borkmann,
	Alan Maguire, Mark Rutland, Peter Zijlstra, Thomas Gleixner
In-Reply-To: <169280377434.282662.7610009313268953247.stgit@devnote2>

On Wed, Aug 23, 2023 at 5:16 PM Masami Hiramatsu (Google)
<mhiramat@kernel.org> wrote:
>
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> Change the fprobe exit handler and rethook to use ftrace_regs data structure
> instead of pt_regs. This also introduce HAVE_PT_REGS_TO_FTRACE_REGS_CAST
> which means the ftrace_regs's memory layout is equal to the pt_regs so
> that those are able to cast. Only if it is enabled, kretprobe will use
> rethook since kretprobe requires pt_regs for backward compatibility.
>
> This means the archs which currently implement rethook for kretprobes needs to
> set that flag and it must ensure struct ftrace_regs is same as pt_regs.
> If not, it must be either disabling kretprobe or implementing kretprobe
> trampoline separately from rethook trampoline.
>
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Acked-by: Florent Revest <revest@chromium.org>

^ permalink raw reply

* Re: [PATCH v1 1/1] bitops: Share BYTES_TO_BITS() for everyone
From: Alexander Lobakin @ 2023-08-25 14:49 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Yury Norov, Uros Bizjak, Masami Hiramatsu (Google), linux-kernel,
	linux-trace-kernel, Steven Rostedt
In-Reply-To: <20230824123728.2761663-1-andriy.shevchenko@linux.intel.com>

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Thu, 24 Aug 2023 15:37:28 +0300

> It may be new callers for the same macro, share it.
> 
> Note, it's unknown why it's represented in the current form instead of
> simple multiplication and commit 1ff511e35ed8 ("tracing/kprobes: Add
> bitfield type") doesn't explain that neither. Let leave it as is and
> we may improve it in the future.

Maybe symmetrical change in tools/ like I did[0] an aeon ago?

> 
> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> ---
>  include/linux/bitops.h     | 2 ++
>  kernel/trace/trace_probe.c | 3 +--
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> index 2ba557e067fe..66dc091e0c28 100644
> --- a/include/linux/bitops.h
> +++ b/include/linux/bitops.h
> @@ -21,6 +21,8 @@
>  #define BITS_TO_U32(nr)		__KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u32))
>  #define BITS_TO_BYTES(nr)	__KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(char))
>  
> +#define BYTES_TO_BITS(nb)	((BITS_PER_LONG * (nb)) / sizeof(long))
> +
>  extern unsigned int __sw_hweight8(unsigned int w);
>  extern unsigned int __sw_hweight16(unsigned int w);
>  extern unsigned int __sw_hweight32(unsigned int w);
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index c68a72707852..da6297d24d61 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -11,6 +11,7 @@
>   */
>  #define pr_fmt(fmt)	"trace_probe: " fmt
>  
> +#include <linux/bitops.h>
>  #include <linux/bpf.h>
>  
>  #include "trace_probe.h"
> @@ -830,8 +831,6 @@ parse_probe_arg(char *arg, const struct fetch_type *type,
>  	return ret;
>  }
>  
> -#define BYTES_TO_BITS(nb)	((BITS_PER_LONG * (nb)) / sizeof(long))
> -
>  /* Bitfield type needs to be parsed into a fetch function */
>  static int __parse_bitfield_probe_arg(const char *bf,
>  				      const struct fetch_type *t,

[0]
https://github.com/alobakin/linux/commit/fd308001fe6d38837fe820427209a6a99e4850a8

Thanks,
Olek

^ permalink raw reply

* Re: [PATCH v3 1/1] tracing/kprobes: Return EADDRNOTAVAIL when func matches several symbols
From: Francis Laniel @ 2023-08-25 14:14 UTC (permalink / raw)
  To: Masami Hiramatsu; +Cc: linux-kernel, linux-trace-kernel, Steven Rostedt
In-Reply-To: <20230825221321.faaa33e03afc48abc345c24f@kernel.org>

Le vendredi 25 août 2023, 15:13:21 CEST Masami Hiramatsu a écrit :
> On Fri, 25 Aug 2023 14:34:49 +0200
> 
> Francis Laniel <flaniel@linux.microsoft.com> wrote:
> > Hi.
> > 
> > Le vendredi 25 août 2023, 14:16:49 CEST Masami Hiramatsu a écrit :
> > > On Thu, 24 Aug 2023 18:08:59 +0200
> > > 
> > > Francis Laniel <flaniel@linux.microsoft.com> wrote:
> > > > Previously to this commit, if func matches several symbols, a kprobe,
> > > > being
> > > > either sysfs or PMU, would only be installed for the first matching
> > > > address. This could lead to some misunderstanding when some BPF code
> > > > was
> > > > never called because it was attached to a function which was indeed
> > > > not
> > > > called, because the effectively called one has no kprobes attached.
> > > > 
> > > > So, this commit returns EADDRNOTAVAIL when func matches several
> > > > symbols.
> > > > This way, user needs to use address to remove the ambiguity.
> > > > 
> > > > Suggested-by: Masami Hiramatsu <mhiramat@kernel.org>
> > > > Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com>
> > > > Link:
> > > > https://lore.kernel.org/lkml/20230819101105.b0c104ae4494a7d1f2eea742@k
> > > > ern
> > > > el.org/ ---
> > > 
> > > Ah, this should be fine, but selftest (tools/testing/selftests/ftrace)
> > > fails.
> > > 
> > >  # tail 60-kprobe_module.tc-log.vsOHnF
> > > 
> > > ...
> > > + :
> > > + : 'Add an event on a module function without specifying event name'
> > > + :
> > > + echo 'p trace_printk:trace_printk_irq_work'
> > > sh: write error: No such file or directory
> > > 
> > > Ah, the function on non-exist module should be checked too.
> > > 
> > > # tail 63-kprobe_syntax_errors.tc-log.mMLwIQ
> > > + + printfwc '%s' -c
> > > 
> > >  'p '
> > > 
> > > + pos=2
> > > + printf+  '%s'tr 'p ^non_exist_func'
> > > 
> > >  -d ^
> > > 
> > > + command='p non_exist_func'
> > > + echo 'Test command: p non_exist_func'
> > > Test command: p non_exist_func
> > > + echo
> > > + grep 'trace_kprobe: error:' -A 3 error_log
> > > 
> > > Also, this doesn't leave a syntax error message.
> > > 
> > > So, the below changes are needed.
> > 
> > Excellent catch! Thank you, I will apply this patch and send v4 right
> > after. Regarding test, do you think I can add a test for the
> > EADDRNOTAVAIL case?
> Hmm, in that case, you need to change something in tracefs/README so that
> we can identify the kernel has different behavior. Or we have to change
> this is a "Fix" for backporting.

Oops, sorry I sent the v4 with a test but as a separated commit, so we can 
just ignore it for the moment.

> > Maybe it should go inside LTP? As this would need having a kernel compiled
> > with a name pointing to several symbols?
> 
> For this tracing feature, I rather like to use
> tools/testing/selftests/ftrace to test it. And it is used on all stable
> kernel, that is why we need to add some change on tracefs/README or
> something.
> 
> But I would like to wait for Alessandro's work. After his work, in this time
> we need to probe all the same-name symbols as your original patch does.
> This is because 1:n mapping can happen as Alessandro pointed in
> 
> https://lore.kernel.org/all/CAPp5cGQsRdB0+KHR1wX2bDDdc5sTzSNPA417PNJb0ypmV=y
> S6w@mail.gmail.com/
> 
> But if his feature is configurable (and maybe so), we need to keep this
> version... We have many options.
> 
> - this normal kallsyms: the same-name symbols should not be used.
> - enhanced kallsyms (if 1:n symbol has the same suffix): the same name
> symbols should be probed at once.
> - enhanced kallsyms (if 1:n symbol has different suffix): the same-name
> symbol must not exist.

I understand!
In future case, we could still have a test and change its behavior (i.e. 
potentially skipping it) when KALLSYMS_ALIAS is set.

> > Also, should some man pages somewhere be updated to reflect the case
> > kprobe can return EADDRNOTAVAIL?
> 
> No, it is a tracefs interface and we don't have man pages yet.

I was more thinking to the PMU counterpart as it is created through 
perf_event_open()?

> Thank you,
> 
> > > diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> > > index 8ab46a2a446d..1e57bc896952 100644
> > > --- a/kernel/trace/trace_kprobe.c
> > > +++ b/kernel/trace/trace_kprobe.c
> > > @@ -855,7 +855,7 @@ static int __trace_kprobe_create(int argc, const
> > > char
> > > *argv[]) }
> > > 
> > >  	}
> > > 
> > > -	if (symbol) {
> > > +	if (symbol && !strchr(symbol, ':')) {
> > > 
> > >  		unsigned int count;
> > >  		
> > >  		count = number_of_same_symbols(symbol);
> > > 
> > > @@ -864,6 +864,7 @@ static int __trace_kprobe_create(int argc, const
> > > char
> > > *argv[]) * Users should use ADDR to remove the ambiguity of
> > > 
> > >  			 * using KSYM only.
> > >  			 */
> > > 
> > > +			trace_probe_log_err(0, NON_UNIQ_SYMBOL);
> > > 
> > >  			ret = -EADDRNOTAVAIL;
> > >  			
> > >  			goto error;
> > > 
> > > @@ -872,6 +873,7 @@ static int __trace_kprobe_create(int argc, const
> > > char
> > > *argv[]) * We can return ENOENT earlier than when register the
> > > 
> > >  			 * kprobe.
> > >  			 */
> > > 
> > > +			trace_probe_log_err(0, BAD_PROBE_ADDR);
> > > 
> > >  			ret = -ENOENT;
> > >  			
> > >  			goto error;
> > > 
> > > diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
> > > index 7f929482e8d4..a4f478448eef 100644
> > > --- a/kernel/trace/trace_probe.h
> > > +++ b/kernel/trace/trace_probe.h
> > > @@ -450,6 +450,7 @@ extern int traceprobe_define_arg_fields(struct
> > > trace_event_call *event_call, C(BAD_MAXACT,		"Invalid maxactive
> > > number"),		\
> > > 
> > >  	C(MAXACT_TOO_BIG,	"Maxactive is too big"),		\
> > >  	C(BAD_PROBE_ADDR,	"Invalid probed address or symbol"),	\
> > > 
> > > +	C(NON_UNIQ_SYMBOL,	"The symbol is not unique"),		\
> > > 
> > >  	C(BAD_RETPROBE,		"Retprobe address must be an function
> > 
> > entry"), \
> > 
> > >  	C(NO_TRACEPOINT,	"Tracepoint is not found"),		\
> > >  	C(BAD_ADDR_SUFFIX,	"Invalid probed address suffix"), \
> > > 
> > > Thank you,
> > > 
> > > >  kernel/trace/trace_kprobe.c | 61
> > > >  +++++++++++++++++++++++++++++++++++++
> > > >  1 file changed, 61 insertions(+)
> > > > 
> > > > diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> > > > index 23dba01831f7..2f393739e8cf 100644
> > > > --- a/kernel/trace/trace_kprobe.c
> > > > +++ b/kernel/trace/trace_kprobe.c
> > > > @@ -705,6 +705,25 @@ static struct notifier_block
> > > > trace_kprobe_module_nb =
> > > > {>
> > > > 
> > > >  	.priority = 1	/* Invoked after kprobe module callback */
> > > >  
> > > >  };
> > > > 
> > > > +static int count_symbols(void *data, unsigned long unused)
> > > > +{
> > > > +	unsigned int *count = data;
> > > > +
> > > > +	(*count)++;
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +static unsigned int number_of_same_symbols(char *func_name)
> > > > +{
> > > > +	unsigned int count;
> > > > +
> > > > +	count = 0;
> > > > +	kallsyms_on_each_match_symbol(count_symbols, func_name, &count);
> > > > +
> > > > +	return count;
> > > > +}
> > > > +
> > > > 
> > > >  static int __trace_kprobe_create(int argc, const char *argv[])
> > > >  {
> > > >  
> > > >  	/*
> > > > 
> > > > @@ -836,6 +855,29 @@ static int __trace_kprobe_create(int argc, const
> > > > char
> > > > *argv[])>
> > > > 
> > > >  		}
> > > >  	
> > > >  	}
> > > > 
> > > > +	if (symbol) {
> > > > +		unsigned int count;
> > > > +
> > > > +		count = number_of_same_symbols(symbol);
> > > > +		if (count > 1) {
> > > > +			/*
> > > > +			 * Users should use ADDR to remove the ambiguity of
> > > > +			 * using KSYM only.
> > > > +			 */
> > > > 
> > > > 
> > > > 
> > > > +			ret = -EADDRNOTAVAIL;
> > > > +
> > > > +			goto error;
> > > > +		} else if (count == 0) {
> > > > +			/*
> > > > +			 * We can return ENOENT earlier than when register the
> > > > +			 * kprobe.
> > > > +			 */
> > > > +			ret = -ENOENT;
> > > > +
> > > > +			goto error;
> > > > +		}
> > > > +	}
> > > > +
> > > > 
> > > >  	trace_probe_log_set_index(0);
> > > >  	if (event) {
> > > >  	
> > > >  		ret = traceprobe_parse_event_name(&event, &group, gbuf,
> > > > 
> > > > @@ -1699,6 +1741,7 @@ static int unregister_kprobe_event(struct
> > > > trace_kprobe *tk)>
> > > > 
> > > >  }
> > > >  
> > > >  #ifdef CONFIG_PERF_EVENTS
> > > > 
> > > > +
> > > > 
> > > >  /* create a trace_kprobe, but don't add it to global lists */
> > > >  struct trace_event_call *
> > > >  create_local_trace_kprobe(char *func, void *addr, unsigned long offs,
> > > > 
> > > > @@ -1709,6 +1752,24 @@ create_local_trace_kprobe(char *func, void
> > > > *addr,
> > > > unsigned long offs,>
> > > > 
> > > >  	int ret;
> > > >  	char *event;
> > > > 
> > > > +	if (func) {
> > > > +		unsigned int count;
> > > > +
> > > > +		count = number_of_same_symbols(func);
> > > > +		if (count > 1)
> > > > +			/*
> > > > +			 * Users should use addr to remove the ambiguity of
> > > > +			 * using func only.
> > > > +			 */
> > > > +			return ERR_PTR(-EADDRNOTAVAIL);
> > > > +		else if (count == 0)
> > > > +			/*
> > > > +			 * We can return ENOENT earlier than when register the
> > > > +			 * kprobe.
> > > > +			 */
> > > > +			return ERR_PTR(-ENOENT);
> > > > +	}
> > > > +
> > > > 
> > > >  	/*
> > > >  	
> > > >  	 * local trace_kprobes are not added to dyn_event, so they are never
> > > >  	 * searched in find_trace_kprobe(). Therefore, there is no concern
> > > >  	 of
> > 
> > Best regards.



^ permalink raw reply

* [PATCH v4 2/2] selftests/ftrace: Add new test case which checks non unique symbol
From: Francis Laniel @ 2023-08-25 14:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, linux-trace-kernel, Francis Laniel,
	Steven Rostedt, Shuah Khan, linux-kselftest
In-Reply-To: <20230825140519.34297-1-flaniel@linux.microsoft.com>

If name_show() is non unique, this test will try to install a kprobe on this
function which should fail returning EADDRNOTAVAIL.
On kernel where name_show() is not unique, this test is skipped.

Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com>
---
 .../ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc  | 13 +++++++++++++
 1 file changed, 13 insertions(+)
 create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc

diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc
new file mode 100644
index 000000000000..bc9514428dba
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc
@@ -0,0 +1,13 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test failure of registering kprobe on non unique symbol
+# requires: kprobe_events
+
+SYMBOL='name_show'
+
+# We skip this test on kernel where SYMBOL is unique or does not exist.
+if [ "$(grep -c -E "[[:alnum:]]+ t ${SYMBOL}" /proc/kallsyms)" -le '1' ]; then
+	exit_unsupported
+fi
+
+! echo "p:test_non_unique ${SYMBOL}" > kprobe_events
-- 
2.34.1


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox