Linux Trace Kernel
 help / color / mirror / Atom feed
* Re: [PATCH] visl: use canonical ftrace path
From: Steven Rostedt @ 2023-08-31 13:39 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: linux-kernel, Ross Zwisler, Mark Rutland, Masami Hiramatsu,
	Mauro Carvalho Chehab, linux-media, linux-trace-kernel,
	Daniel Almeida, Hans Verkuil
In-Reply-To: <20230829204600.3210276-2-zwisler@kernel.org>

On Tue, 29 Aug 2023 14:46:01 -0600
Ross Zwisler <zwisler@kernel.org> wrote:

> From: Ross Zwisler <zwisler@google.com>
> 
> The canonical location for the tracefs filesystem is at /sys/kernel/tracing.
> 
> But, from Documentation/trace/ftrace.rst:
> 
>   Before 4.1, all ftrace tracing control files were within the debugfs
>   file system, which is typically located at /sys/kernel/debug/tracing.
>   For backward compatibility, when mounting the debugfs file system,
>   the tracefs file system will be automatically mounted at:
> 
>   /sys/kernel/debug/tracing
> 
> Update the visl decoder driver documentation to use this tracefs path.
> 
> Signed-off-by: Ross Zwisler <zwisler@google.com>

Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Thanks Ross!

-- Steve


^ permalink raw reply

* Re: [PATCH] tracing: zero the pipe cpumask on alloc to avoid spurious -EBUSY
From: Zheng Yejian @ 2023-08-31 13:51 UTC (permalink / raw)
  To: Brian Foster, linux-trace-kernel; +Cc: linux-kernel, rostedt
In-Reply-To: <20230831125500.986862-1-bfoster@redhat.com>

On 2023/8/31 20:55, Brian Foster wrote:
> The pipe cpumask used to serialize opens between the main and percpu
> trace pipes is not zeroed or initialized. This can result in
> spurious -EBUSY returns if underlying memory is not fully zeroed.
> This has been observed by immediate failure to read the main
> trace_pipe file on an otherwise newly booted and idle system:
> 
>   # cat /sys/kernel/debug/tracing/trace_pipe
>   cat: /sys/kernel/debug/tracing/trace_pipe: Device or resource busy
> 
> Zero the allocation of pipe_cpumask to avoid the problem.
> 
> Fixes: c2489bb7e6be ("tracing: Introduce pipe_cpumask to avoid race on trace_pipes")
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
> 
> Hi,
> 
> I ran into this problem just recently on one of my test VMs immediately
> after updating to a v6.5 base. A revert of the aforementioned commit
> addressed the problem. I'm not terribly familiar with the tracing code,
> but on further inspection I noticed the cpumask doesn't appear to be
> initialized anywhere. I suppose this could alternatively do a
> cpumask_clear() or whatever after allocation, but either way this
> addresses the problem for me.

Yes, pipe_cpumask must be initialized.

--

Thanks,
Zheng Yejian

> 
> Please CC on replies as I'm not subscribed to the list. Thanks.
> 
> Brian
> 
>   kernel/trace/trace.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 8e64aaad5361..2656ca3b9b39 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -9486,7 +9486,7 @@ static struct trace_array *trace_array_create(const char *name)
>   	if (!alloc_cpumask_var(&tr->tracing_cpumask, GFP_KERNEL))
>   		goto out_free_tr;
>   
> -	if (!alloc_cpumask_var(&tr->pipe_cpumask, GFP_KERNEL))
> +	if (!zalloc_cpumask_var(&tr->pipe_cpumask, GFP_KERNEL))
>   		goto out_free_tr;
>   
>   	tr->trace_flags = global_trace.trace_flags & ~ZEROED_TRACE_FLAGS;
> @@ -10431,7 +10431,7 @@ __init static int tracer_alloc_buffers(void)
>   	if (trace_create_savedcmd() < 0)
>   		goto out_free_temp_buffer;
>   
> -	if (!alloc_cpumask_var(&global_trace.pipe_cpumask, GFP_KERNEL))
> +	if (!zalloc_cpumask_var(&global_trace.pipe_cpumask, GFP_KERNEL))
>   		goto out_free_savedcmd;
>   
>   	/* TODO: make the number of buffers hot pluggable with CPUS */


^ permalink raw reply

* [PATCH] trace/events/task.h: Replace strlcpy with strscpy
From: Azeem Shaikh @ 2023-08-31 19:42 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-hardening, Azeem Shaikh, linux-kernel, linux-trace-kernel,
	Steven Rostedt, Masami Hiramatsu

strlcpy() reads the entire source buffer first.
This read may exceed the destination size limit.
This is both inefficient and can lead to linear read
overflows if a source string is not NUL-terminated [1].
In an effort to remove strlcpy() completely [2], replace
strlcpy() here with strscpy().

No return values were used, so direct replacement is safe.

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
[2] https://github.com/KSPP/linux/issues/89

Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com>
---
 include/trace/events/task.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/trace/events/task.h b/include/trace/events/task.h
index 64d160930b0d..47b527464d1a 100644
--- a/include/trace/events/task.h
+++ b/include/trace/events/task.h
@@ -47,7 +47,7 @@ TRACE_EVENT(task_rename,
 	TP_fast_assign(
 		__entry->pid = task->pid;
 		memcpy(entry->oldcomm, task->comm, TASK_COMM_LEN);
-		strlcpy(entry->newcomm, comm, TASK_COMM_LEN);
+		strscpy(entry->newcomm, comm, TASK_COMM_LEN);
 		__entry->oom_score_adj = task->signal->oom_score_adj;
 	),

--
2.42.0.283.g2d96d420d3-goog



^ permalink raw reply related

* Re: [PATCH] tracing: zero the pipe cpumask on alloc to avoid spurious -EBUSY
From: Steven Rostedt @ 2023-08-31 20:33 UTC (permalink / raw)
  To: Zheng Yejian; +Cc: Brian Foster, linux-trace-kernel, linux-kernel
In-Reply-To: <26f8fb43-6ea8-edc2-381d-3600fce261af@huawei.com>

On Thu, 31 Aug 2023 21:51:18 +0800
Zheng Yejian <zhengyejian1@huawei.com> wrote:

> > Hi,
> > 
> > I ran into this problem just recently on one of my test VMs immediately
> > after updating to a v6.5 base. A revert of the aforementioned commit
> > addressed the problem. I'm not terribly familiar with the tracing code,
> > but on further inspection I noticed the cpumask doesn't appear to be
> > initialized anywhere. I suppose this could alternatively do a
> > cpumask_clear() or whatever after allocation, but either way this
> > addresses the problem for me.  
> 
> Yes, pipe_cpumask must be initialized.

Can I add a Reviewed-by tag from you?

> 
> > 
> > Please CC on replies as I'm not subscribed to the list. Thanks.

That's the default with Linux kernel lists.

-- Steve

^ permalink raw reply

* Re: [PATCH] trace/events/task.h: Replace strlcpy with strscpy
From: Kees Cook @ 2023-08-31 21:08 UTC (permalink / raw)
  To: Azeem Shaikh
  Cc: linux-hardening, linux-kernel, linux-trace-kernel, Steven Rostedt,
	Masami Hiramatsu
In-Reply-To: <20230831194212.1529941-1-azeemshaikh38@gmail.com>

On Thu, Aug 31, 2023 at 07:42:12PM +0000, Azeem Shaikh wrote:
> strlcpy() reads the entire source buffer first.
> This read may exceed the destination size limit.
> This is both inefficient and can lead to linear read
> overflows if a source string is not NUL-terminated [1].
> In an effort to remove strlcpy() completely [2], replace
> strlcpy() here with strscpy().
> 
> No return values were used, so direct replacement is safe.
> 
> [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
> [2] https://github.com/KSPP/linux/issues/89
> 
> Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH] trace/events/task.h: Replace strlcpy with strscpy
From: Justin Stitt @ 2023-08-31 23:20 UTC (permalink / raw)
  To: Azeem Shaikh
  Cc: Kees Cook, linux-hardening, linux-kernel, linux-trace-kernel,
	Steven Rostedt, Masami Hiramatsu
In-Reply-To: <20230831194212.1529941-1-azeemshaikh38@gmail.com>

On Thu, Aug 31, 2023 at 12:42 PM Azeem Shaikh <azeemshaikh38@gmail.com> wrote:
>
> strlcpy() reads the entire source buffer first.
> This read may exceed the destination size limit.
> This is both inefficient and can lead to linear read
> overflows if a source string is not NUL-terminated [1].
> In an effort to remove strlcpy() completely [2], replace
> strlcpy() here with strscpy().
>
> No return values were used, so direct replacement is safe.
>
> [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
> [2] https://github.com/KSPP/linux/issues/89
>
> Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com>

Reviewed-by: Justin Stitt <justinstitt@google.com>

> ---
>  include/trace/events/task.h |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/trace/events/task.h b/include/trace/events/task.h
> index 64d160930b0d..47b527464d1a 100644
> --- a/include/trace/events/task.h
> +++ b/include/trace/events/task.h
> @@ -47,7 +47,7 @@ TRACE_EVENT(task_rename,
>         TP_fast_assign(
>                 __entry->pid = task->pid;
>                 memcpy(entry->oldcomm, task->comm, TASK_COMM_LEN);
> -               strlcpy(entry->newcomm, comm, TASK_COMM_LEN);
> +               strscpy(entry->newcomm, comm, TASK_COMM_LEN);
>                 __entry->oom_score_adj = task->signal->oom_score_adj;
>         ),
>
> --
> 2.42.0.283.g2d96d420d3-goog
>
>

^ permalink raw reply

* Re: [PATCH] tracing: zero the pipe cpumask on alloc to avoid spurious -EBUSY
From: Zheng Yejian @ 2023-09-01  1:34 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Brian Foster, linux-trace-kernel, linux-kernel
In-Reply-To: <20230831163331.6c90c963@rorschach.local.home>

On 2023/9/1 04:33, Steven Rostedt wrote:
> On Thu, 31 Aug 2023 21:51:18 +0800
> Zheng Yejian <zhengyejian1@huawei.com> wrote:
> 
>>> Hi,
>>>
>>> I ran into this problem just recently on one of my test VMs immediately
>>> after updating to a v6.5 base. A revert of the aforementioned commit
>>> addressed the problem. I'm not terribly familiar with the tracing code,
>>> but on further inspection I noticed the cpumask doesn't appear to be
>>> initialized anywhere. I suppose this could alternatively do a
>>> cpumask_clear() or whatever after allocation, but either way this
>>> addresses the problem for me.
>>
>> Yes, pipe_cpumask must be initialized.
> 
> Can I add a Reviewed-by tag from you?

Of course :)
Reviewed-by: Zheng Yejian <zhengyejian1@huawei.com>

--

Thanks,
Zheng Yejian

> 
>>
>>>
>>> Please CC on replies as I'm not subscribed to the list. Thanks.
> 
> That's the default with Linux kernel lists.
> 
> -- Steve
> 


^ permalink raw reply

* Re: [PATCH v3] scripts/link-vmlinux.sh: Add alias to duplicate symbols for kallsyms
From: Masahiro Yamada @ 2023-09-01  5:31 UTC (permalink / raw)
  To: Alessandro Carminati (Red Hat)
  Cc: Masami Hiramatsu, Steven Rostedt, Daniel Bristot de Oliveira,
	Josh Poimboeuf, Luis Chamberlain, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Alexander Lobakin, Nick Alcock,
	Kris Van Hees, Eugene Loh, Francis Laniel, Viktor Malik,
	linux-kbuild, linux-kernel, linux-trace-kernel, live-patching
In-Reply-To: <20230828080423.3539686-1-alessandro.carminati@gmail.com>

On Mon, Aug 28, 2023 at 8:45 PM Alessandro Carminati (Red Hat)
<alessandro.carminati@gmail.com> wrote:
>
> From: Alessandro Carminati <alessandro.carminati@gmail.com>
>
> It is not uncommon for drivers or modules related to similar peripherals
> to have symbols with the exact same name.
> While this is not a problem for the kernel's binary itself, it becomes an
> issue when attempting to trace or probe specific functions using
> infrastructure like ftrace or kprobe.
>
> The tracing subsystem relies on the `nm -n vmlinux` output, which provides
> symbol information from the kernel's ELF binary. However, when multiple
> symbols share the same name, the standard nm output does not differentiate
> between them. This can lead to confusion and difficulty when trying to
> probe the intended symbol.
>
>  ~ # cat /proc/kallsyms | grep " name_show"
>  ffffffff8c4f76d0 t name_show
>  ffffffff8c9cccb0 t name_show
>  ffffffff8cb0ac20 t name_show
>  ffffffff8cc728c0 t name_show
>  ffffffff8ce0efd0 t name_show
>  ffffffff8ce126c0 t name_show
>  ffffffff8ce1dd20 t name_show
>  ffffffff8ce24e70 t name_show
>  ffffffff8d1104c0 t name_show
>  ffffffff8d1fe480 t name_show
>
> **kas_alias** addresses this challenge by extending the symbol names with
> unique suffixes during the kernel build process.
> The newly created aliases for these duplicated symbols are unique names
> that can be fed to the ftracefs interface. By doing so, it enables
> previously unreachable symbols to be probed.
>
>  ~ # cat /proc/kallsyms | grep " name_show"
>  ffffffff974f76d0 t name_show
>  ffffffff974f76d0 t name_show__alias__6340
>  ffffffff979cccb0 t name_show
>  ffffffff979cccb0 t name_show__alias__6341
>  ffffffff97b0ac20 t name_show
>  ffffffff97b0ac20 t name_show__alias__6342
>  ffffffff97c728c0 t name_show
>  ffffffff97c728c0 t name_show__alias__6343
>  ffffffff97e0efd0 t name_show
>  ffffffff97e0efd0 t name_show__alias__6344
>  ffffffff97e126c0 t name_show
>  ffffffff97e126c0 t name_show__alias__6345
>  ffffffff97e1dd20 t name_show
>  ffffffff97e1dd20 t name_show__alias__6346
>  ffffffff97e24e70 t name_show
>  ffffffff97e24e70 t name_show__alias__6347
>  ffffffff981104c0 t name_show
>  ffffffff981104c0 t name_show__alias__6348
>  ffffffff981fe480 t name_show
>  ffffffff981fe480 t name_show__alias__6349
>
>  ~ # echo "p:kprobes/evnt1 name_show__alias__6349" \
>  > >/sys/kernel/tracing/kprobe_events
>  ~ # cat /sys/kernel/tracing/kprobe_events
>  p:kprobes/evnt1 name_show__alias__6349
>
> Changes from v1:
> - Integrated changes requested by Masami to exclude symbols with prefixes
>   "_cfi" and "_pfx".
> - Introduced a small framework to handle patterns that need to be excluded
>   from the alias production.
> - Excluded other symbols using the framework.
> - Introduced the ability to discriminate between text and data symbols.
> - Added two new config symbols in this version: CONFIG_KALLSYMS_ALIAS_DATA,
>   which allows data for data, and CONFIG_KALLSYMS_ALIAS_DATA_ALL, which
>   excludes all filters and provides an alias for each duplicated symbol.
>
> https://lore.kernel.org/all/20230711151925.1092080-1-alessandro.carminati@gmail.com/
>
> Changes from v2:
> - Alias tags are created by querying DWARF information from the vmlinux.
> - The filename + line number is normalized and appended to the original name.
> - The tag begins with '@' to indicate the symbol source.
> - Not a change, but worth mentioning, since the alias is added to the existing
>   list, the old duplicated name is preserved, and the livepatch way of dealing
>   with duplicates is maintained.
> - Acknowledging the existence of scenarios where inlined functions declared in
>   header files may result in multiple copies due to compiler behavior, though
>    it is not actionable as it does not pose an operational issue.
> - Highlighting a single exception where the same name refers to different
>   functions: the case of "compat_binfmt_elf.c," which directly includes
>   "binfmt_elf.c" producing identical function copies in two separate
>   modules.
>
> sample from new v3
>
>  ~ # cat /proc/kallsyms | grep gic_mask_irq
>  ffffd0b03c04dae4 t gic_mask_irq
>  ffffd0b03c04dae4 t gic_mask_irq@_drivers_irqchip_irq-gic_c_167
>  ffffd0b03c050960 t gic_mask_irq
>  ffffd0b03c050960 t gic_mask_irq@_drivers_irqchip_irq-gic-v3_c_404
>  ~ #
>
> https://lore.kernel.org/all/20230714150326.1152359-1-alessandro.carminati@gmail.com/
>
> Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati@gmail.com>
> ---
>  init/Kconfig                        |  36 ++++
>  scripts/Makefile                    |   4 +
>  scripts/kas_alias/Makefile          |   4 +
>  scripts/kas_alias/a2l.c             | 268 ++++++++++++++++++++++++++++
>  scripts/kas_alias/a2l.h             |  32 ++++
>  scripts/kas_alias/duplicates_list.c |  70 ++++++++
>  scripts/kas_alias/duplicates_list.h |  15 ++
>  scripts/kas_alias/item_list.c       | 230 ++++++++++++++++++++++++
>  scripts/kas_alias/item_list.h       |  26 +++
>  scripts/kas_alias/kas_alias.c       | 217 ++++++++++++++++++++++
>  scripts/link-vmlinux.sh             |  11 +-
>  11 files changed, 910 insertions(+), 3 deletions(-)
>  create mode 100644 scripts/kas_alias/Makefile
>  create mode 100644 scripts/kas_alias/a2l.c
>  create mode 100644 scripts/kas_alias/a2l.h
>  create mode 100644 scripts/kas_alias/duplicates_list.c
>  create mode 100644 scripts/kas_alias/duplicates_list.h
>  create mode 100644 scripts/kas_alias/item_list.c
>  create mode 100644 scripts/kas_alias/item_list.h
>  create mode 100644 scripts/kas_alias/kas_alias.c
>
> diff --git a/init/Kconfig b/init/Kconfig
> index f7f65af4ee12..bc69fcd9cbc8 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1737,6 +1737,42 @@ config KALLSYMS_BASE_RELATIVE
>           time constants, and no relocation pass is required at runtime to fix
>           up the entries based on the runtime load address of the kernel.
>
> +config KALLSYMS_ALIAS
> +       bool "Produces alias for duplicated symbols" if EXPERT
> +       depends on KALLSYMS && (DEBUG_INFO_DWARF4 || DEBUG_INFO_DWARF5)


You can simplify this to


           depends on KALLSYMS && DEBUG_INFO

I re-checked the commit log of
a66049e2cf0ef166dba5bafdbb3062287fc965ad

It says "GCC since ~4.8 has defaulted to DWARF v4
implicitly, and GCC 11 has bumped this to v5"


When DEBUG_INFO is enabled, the dwarf version is 4 or 5.


Presumably, your code does not work with DEBUG_INFO_SPLIT


   depends on KALLSYMS && DEBUG_INFO && !DEBUG_INFO_SPLIT




> diff --git a/scripts/kas_alias/a2l.c b/scripts/kas_alias/a2l.c
> new file mode 100644
> index 000000000000..a9692ac30180
> --- /dev/null
> +++ b/scripts/kas_alias/a2l.c
> @@ -0,0 +1,268 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <unistd.h>
> +#include <sys/types.h>
> +#include <sys/wait.h>
> +#include <string.h>


<string.h> is included twice.

I always sort include directives alphabetically
to avoid such a mistake.






> +#include <stdint.h>
> +#include <stdbool.h>
> +
> +#include "a2l.h"
> +
> +int addr2line_pid = -1;
> +int a2l_in[2];
> +int a2l_out[2];
> +char line[MAX_BUF];
> +char vmlinux_path[MAX_BUF];
> +char addr2line_cmd[MAX_CMD_LEN];
> +FILE *a2l_stdin, *a2l_stdout;
> +
> +static char *normalize_path(const char *input_path, char *output_path)
> +{
> +       char *prev_token = NULL;
> +       char *delimiter = "/";
> +       char inbuf[MAX_BUF];
> +       char *token;
> +       char *pos;
> +
> +       memset(inbuf, 0, MAX_BUF);
> +       *output_path = '\0';
> +       strncpy(inbuf, input_path, MAX_BUF);
> +       if (!input_path || !output_path || strlen(input_path) == 0)
> +               return NULL;
> +
> +       token = strtok(inbuf, delimiter);
> +       while (token) {
> +               if (strcmp(token, "..") == 0 && prev_token) {
> +                       pos = strrchr(output_path, '/');
> +                       if (pos)
> +                               *pos = '\0';
> +
> +               } else if (strcmp(token, ".") != 0) {
> +                       strcat(output_path, "/");
> +                       strcat(output_path, token);
> +               }
> +
> +               prev_token = token;
> +               token = strtok(NULL, delimiter);
> +       }
> +
> +       return output_path;
> +}
> +
> +static void path_of(const char *full_path, char *path)
> +{
> +       const char *last_slash = strrchr(full_path, '/');
> +       size_t path_length;
> +       char cwd[MAX_BUF];
> +
> +       if (!last_slash) {
> +               if (getcwd(cwd, sizeof(cwd)))
> +                       strcpy(path, cwd);
> +               else
> +                       strcpy(path, ".");
> +       } else {
> +               path_length = last_slash - full_path;
> +               strncpy(path, full_path, path_length);
> +               path[path_length] = '\0';
> +       }
> +}
> +
> +static bool file_exists(const char *file_path)
> +{
> +       FILE *file;
> +
> +       file = fopen(file_path, "r");
> +       if (file) {
> +               fclose(file);
> +               return true;
> +       }
> +       return false;
> +}
> +
> +int addr2line_init(const char *cmd, const char *vmlinux)
> +{
> +       if ((!file_exists(cmd)) || (!file_exists(vmlinux))) {
> +               printf("file not found\n");
> +               return 0;
> +               }
> +
> +       path_of(vmlinux, vmlinux_path);
> +       if (pipe(a2l_in) == -1) {
> +               printf("Failed to create pipe\n");
> +               return 0;
> +       }
> +
> +       if (pipe(a2l_out) == -1) {
> +               printf("Failed to create pipe\n");
> +               return 0;
> +       }
> +
> +       addr2line_pid = fork();
> +       if (addr2line_pid == -1) {
> +               printf("Failed to fork process\n");
> +               close(a2l_in[P_READ]);
> +               close(a2l_in[P_WRITE]);
> +               close(a2l_out[P_READ]);
> +               close(a2l_out[P_WRITE]);
> +               return 0;
> +       }
> +
> +       if (addr2line_pid == 0) {
> +               dup2(a2l_in[P_READ], 0);
> +               dup2(a2l_out[P_WRITE], 1);
> +               close(a2l_in[P_WRITE]);
> +               close(a2l_out[P_READ]);
> +
> +               execlp(cmd, cmd, ADDR2LINE_ARGS, vmlinux, NULL);
> +
> +               printf("Failed to execute addr2line command\n");
> +               exit(1);
> +       } else {
> +               close(a2l_in[P_READ]);
> +               close(a2l_out[P_WRITE]);
> +       }
> +
> +       a2l_stdin = fdopen(a2l_in[P_WRITE], "w");
> +       if (!a2l_stdin) {
> +               printf("Failed to open pipe a2l_in\n");
> +               return 0;
> +       }
> +
> +       a2l_stdout = fdopen(a2l_out[P_READ], "r");
> +       if (!a2l_stdout) {
> +               printf("Failed to open pipe a2l_out\n");
> +               fclose(a2l_stdin);
> +               return 0;
> +       }
> +
> +       return 1;
> +}
> +
> +const char *remove_subdir(const char *home, const char *f_path)
> +{
> +       int i = 0;
> +
> +       while (*(home + i) == *(f_path + i))
> +               i++;
> +
> +       return (strlen(home) != i) ? NULL : f_path + i;
> +}
> +
> +char *addr2line_get_lines(uint64_t address)
> +{
> +       char buf[MAX_BUF];
> +
> +       fprintf(a2l_stdin, "%08lx\n", address);
> +       fflush(a2l_stdin);
> +
> +       if (!fgets(line, sizeof(line), a2l_stdout)) {
> +               printf("Failed to read lines from addr2line\n");
> +               return NULL;
> +       }
> +
> +       if (!fgets(line, sizeof(line), a2l_stdout)) {
> +               printf("Failed to read lines from addr2line\n");
> +               return NULL;
> +       }
> +
> +       line[strcspn(line, "\n")] = '\0';
> +       strncpy(buf, line, MAX_BUF);
> +       return normalize_path(buf, line);
> +}
> +
> +int addr2line_cleanup(void)
> +{
> +       int status;
> +
> +       if (addr2line_pid != -1) {
> +               kill(addr2line_pid, SIGKILL);
> +               waitpid(addr2line_pid, &status, 0);
> +               fclose(a2l_stdin);
> +               fclose(a2l_stdout);
> +               addr2line_pid = -1;
> +       }
> +
> +       return 1;
> +}
> +
> +static char *find_executable(const char *command)
> +{
> +       char *path_env = getenv("PATH");
> +       char *executable_path;
> +       char *path_copy;
> +       char *path;
> +       int n;
> +
> +       if (!path_env)
> +               return NULL;
> +
> +       path_copy = strdup(path_env);
> +       if (!path_copy)
> +               return NULL;
> +
> +       path = strtok(path_copy, ":");
> +       while (path) {
> +               n = snprintf(0, 0, "%s/%s", path, command);
> +               executable_path = (char *)malloc(n + 1);
> +               snprintf(executable_path, n + 1, "%s/%s", path, command);
> +               if (access(executable_path, X_OK) == 0) {
> +                       free(path_copy);
> +                       return executable_path;
> +               }
> +
> +       path = strtok(NULL, ":");
> +       free(executable_path);
> +       executable_path = NULL;
> +       }
> +
> +       free(path_copy);
> +       if (executable_path)
> +               free(executable_path);
> +       return NULL;
> +}
> +
> +const char *get_addr2line(int mode)
> +{
> +       char *buf = "";
> +
> +       switch (mode) {
> +       case A2L_CROSS:
> +               buf = getenv("CROSS_COMPILE");
> +               memcpy(addr2line_cmd, buf, strlen(buf));

Dangerous code.
It is easy to cause a buffer overrun by setting a long string to CROSS_COMPILE.


I am fine with adding ADDR2LINE in the top Makefile.





> +       case A2L_DEFAULT:
> +               memcpy(addr2line_cmd + strlen(buf), ADDR2LINE, strlen(ADDR2LINE));
> +               buf = find_executable(addr2line_cmd);
> +               if (buf) {
> +                       memcpy(addr2line_cmd, buf, strlen(buf));
> +                       free(buf);
> +               }
> +               return addr2line_cmd;
> +       case A2L_LLVM:
> +       default:
> +               return NULL;
> +       }
> +}
> +
> +char *get_vmlinux(char *input)
> +{
> +       const char *match_string1 = ".syms";
> +       const char *match_string2 = ".tmp_vmlinux.kallsyms";
> +       char *result = NULL;
> +       char *match_pos;
> +
> +       match_pos = strstr(input, match_string1);
> +       if (!match_pos)
> +               return NULL;
> +
> +       match_pos = strstr(input, match_string2);
> +       if (!match_pos)
> +               return NULL;
> +
> +       result = strdup(input);
> +       match_pos = strstr(result, match_string1);
> +       *match_pos = '\0';
> +       return result;
> +}

Tedious code to compute the vmlinux name.
You can pass vmlinux from command line.




> diff --git a/scripts/kas_alias/item_list.c b/scripts/kas_alias/item_list.c
> new file mode 100644
> index 000000000000..48f2e525592a
> --- /dev/null
> +++ b/scripts/kas_alias/item_list.c
> @@ -0,0 +1,230 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <stdint.h>
> +#include <string.h>
> +#include <stdbool.h>
> +#include <assert.h>
> +#include "item_list.h"
> +
> +#define CHECK_ORDER_BY_ADDRESS(sort_by, current, temp, op) \
> +       ((sort_by) == BY_ADDRESS && (current)->addr op (temp)->addr)
> +#define CHECK_ORDER_BY_NAME(sort_by, current, temp, op) \
> +       ((sort_by) == BY_NAME && strcmp((current)->symb_name, (temp)->symb_name) op 0)
> +
> +struct item *list_index[96] = {0};
> +
> +void build_index(struct item *list)
> +{
> +       char current_first_letter = ' ';
> +       struct item *current = list;
> +
> +       while (current) {
> +               if (current->symb_name[0] != current_first_letter) {
> +                       current_first_letter = current->symb_name[0];
> +                       list_index[current_first_letter - 32] = current;
> +               }
> +               current = current->next;
> +       }
> +}
> +
> +struct item *add_item(struct item **list, const char *name, char stype, uint64_t addr)
> +{
> +       struct item *new_item;
> +       struct item *current;
> +
> +       new_item = malloc(sizeof(struct item));
> +       if (!new_item)
> +               return NULL;
> +
> +       strncpy(new_item->symb_name, name, MAX_NAME_SIZE);
> +       new_item->symb_name[MAX_NAME_SIZE - 1] = '\0';
> +       new_item->addr = addr;
> +       new_item->stype = stype;
> +       new_item->next = NULL;
> +
> +       if (!(*list)) {
> +               *list = new_item;
> +       } else {
> +               current = *list;
> +               while (current->next)
> +                       current = current->next;


What is the purpose of the ->next traverse in while() loop?

If you remember the last item,
you know where the new item should be connected, don't you?





> +struct item *merge(struct item *left, struct item *right, int sort_by)
> +{
> +       struct item *current = NULL;
> +       struct item *result = NULL;
> +
> +       if (!left)
> +               return right;
> +       if (!right)
> +               return left;
> +
> +       if (sort_by == BY_NAME) {
> +               if (strcmp(left->symb_name, right->symb_name) <= 0) {
> +                       result = left;
> +                       left = left->next;
> +               } else {
> +                       result = right;
> +                       right = right->next;
> +               }
> +       } else {
> +               if (sort_by == BY_ADDRESS) {
> +                       if (left->addr <= right->addr) {
> +                               result = left;
> +                               left = left->next;
> +                       } else {
> +                               result = right;
> +                               right = right->next;
> +                       }
> +               }
> +       }
> +
> +       current = result;
> +
> +       while (left && right) {
> +               if (sort_by == BY_NAME) {
> +                       if (strcmp(left->symb_name, right->symb_name) <= 0) {
> +                               current->next = left;
> +                               left = left->next;
> +                       } else {
> +                               current->next = right;
> +                               right = right->next;
> +                       }
> +               } else {
> +                       if (sort_by == BY_ADDRESS) {
> +                               if (left->addr <= right->addr) {
> +                                       current->next = left;
> +                                       left = left->next;
> +                               } else {
> +                                       current->next = right;
> +                                       right = right->next;
> +                               }
> +                       }
> +               }
> +
> +               current = current->next;
> +       }
> +
> +       if (left) {
> +               current->next = left;
> +       } else {
> +               if (right)
> +                       current->next = right;
> +       }
> +
> +       return result;
> +}
> +
> +struct item *merge_sort(struct item *head, int sort_by)



Any reason why you did not use qsort() in the C library?


One disadvantage of quick sort is that it is not "stable sort".

But, your algorithm (sort by name and sort by address)
does not keep the order anyway.




> diff --git a/scripts/kas_alias/item_list.h b/scripts/kas_alias/item_list.h
> new file mode 100644
> index 000000000000..b4891cb088ee
> --- /dev/null
> +++ b/scripts/kas_alias/item_list.h
> @@ -0,0 +1,26 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef ITEM_LIST_H
> +#define ITEM_LIST_H
> +#include <stdint.h>
> +
> +#define MAX_NAME_SIZE 256
> +#define BY_ADDRESS 1
> +#define BY_NAME 2
> +
> +struct item {
> +       char            symb_name[MAX_NAME_SIZE];
> +       uint64_t        addr;
> +       char            stype;
> +       struct item     *next;
> +};
> +
> +void build_index(struct item *list);
> +struct item *add_item(struct item **list, const char *name, char stype, uint64_t addr);
> +void sort_list(struct item **list, int sort_by);
> +struct item *merge(struct item *left, struct item *right, int sort_by);
> +struct item *merge_sort(struct item *head, int sort_by);
> +void sort_list_m(struct item **head, int sort_by);
> +int insert_after(struct item *list, const uint64_t search_addr,
> +                const char *name, uint64_t addr, char stype);
> +void free_items(struct item **head);
> +#endif
> diff --git a/scripts/kas_alias/kas_alias.c b/scripts/kas_alias/kas_alias.c
> new file mode 100644
> index 000000000000..532aeb39f851
> --- /dev/null
> +++ b/scripts/kas_alias/kas_alias.c
> @@ -0,0 +1,217 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <stdint.h>
> +#include <unistd.h>
> +#include <string.h>
> +#include <stdbool.h>
> +#include <stdarg.h>
> +#include <regex.h>
> +
> +#include "item_list.h"
> +#include "duplicates_list.h"
> +#include "a2l.h"
> +
> +#define SYMB_IS_TEXT(s) ((((s)->stype) == 't') ||  (((s)->stype) == 'T'))
> +#define SYMB_IS_DATA(s) ((((s)->stype) == 'b') ||  (((s)->stype) == 'B') || \
> +                        (((s)->stype) == 'd') ||  (((s)->stype) == 'D') || \
> +                        (((s)->stype) == 'r') ||  (((s)->stype) == 'R'))
> +#ifdef CONFIG_KALLSYMS_ALIAS_DATA



Please do not use #ifdef CONFIG_ in host programs.

Also, please note include/linux/kconfig.h is not included
for host programs, so this #ifdef is always false.

You never tested this CONFIG_KALLSYMS_ALIAS_DATA







> +#define SYMB_NEEDS_ALIAS(s) (SYMB_IS_TEXT(s) || SYMB_IS_DATA(s))
> +#else
> +#define SYMB_NEEDS_ALIAS(s) SYMB_IS_TEXT(s)
> +#endif
> +#define FNOMATCH 0
> +#define FMATCH 1
> +#define EREGEX 2
> +
> +const char *ignore_regex[] = {
> +       "^__cfi_.*$",                           // __cfi_ preamble
> +#ifndef CONFIG_KALLSYMS_ALIAS_DATA_ALL
> +       "^_*TRACE_SYSTEM.*$",
> +       "^__already_done\\.[0-9]+$",            // Call a function once data
> +       "^___tp_str\\.[0-9]+$",
> +       "^___done\\.[0-9]+$",
> +       "^__print_once\\.[0-9]+$",
> +       "^_rs\\.[0-9]+$",
> +       "^__compound_literal\\.[0-9]+$",
> +       "^___once_key\\.[0-9]+$",
> +       "^__func__\\.[0-9]+$",
> +       "^__msg\\.[0-9]+$",
> +       "^CSWTCH\\.[0-9]+$",
> +       "^__flags\\.[0-9]+$",
> +       "^__wkey.*$",
> +       "^__mkey.*$",
> +       "^__key.*$",
> +#endif
> +       "^__pfx_.*$"                            // NOP-padding
> +};
> +
> +int suffix_serial;
> +
> +static inline void verbose_msg(bool verbose, const char *fmt, ...)

Meaningless 'inline'.  Only 'static' is enough.


> +{
> +       va_list args;
> +
> +       va_start(args, fmt);
> +       if (verbose)
> +               printf(fmt, args);



You cannot pass va_list to printf().

The correct code is:

          vprintf(fmt, args);








> +
> +       va_end(args);
> +}


verbose_msg() prints the message to stdout.

If verbose_msge is enabled, the output file breaks
because you redirect both real data and debug messages
in this way:
   scripts/kas_alias/kas_alias <input>   > <output>


If you implement debug logging, I recommend this:

   scripts/kas_alias/kas_alias <input> <output>





> +
> +static void create_suffix(const char *name, char *output_suffix)
> +{
> +       sprintf(output_suffix, "%s__alias__%d", name, suffix_serial++);
> +}
> +
> +static void create_file_suffix(const char *name, uint64_t address, char *output_suffix, char *cwd)
> +{
> +       const char *f_path;
> +       char *buf;
> +       int i = 0;
> +
> +       buf = addr2line_get_lines(address);
> +       f_path = remove_subdir(cwd, buf);
> +       if (f_path) {
> +               sprintf(output_suffix, "%s@%s", name, f_path);
> +               while (*(output_suffix + i) != '\0') {
> +                       switch (*(output_suffix + i)) {
> +                       case '/':
> +                       case ':':
> +                       case '.':
> +                               *(output_suffix + i) = '_';
> +                               break;
> +                       default:
> +                       }
> +               i++;
> +               }
> +       } else {
> +               create_suffix(name, output_suffix);
> +       }
> +}
> +
> +static int filter_symbols(char *symbol, const char **ignore_list, int regex_no)
> +{
> +       regex_t regex;
> +       int res, i;
> +
> +       for (i = 0; i < regex_no; i++) {
> +               res = regcomp(&regex, ignore_list[i], REG_EXTENDED);
> +               if (res)
> +                       return -EREGEX;
> +
> +               res = regexec(&regex, symbol, 0, NULL, 0);
> +               regfree(&regex);
> +               switch (res) {
> +               case 0:
> +                       return FMATCH;
> +               case REG_NOMATCH:
> +                       break;
> +               default:
> +                       return -EREGEX;
> +               }
> +       }
> +
> +       return FNOMATCH;
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +       char t, sym_name[MAX_NAME_SIZE], new_name[MAX_NAME_SIZE + 15];
> +       struct duplicate_item  *duplicate_iterator;
> +       struct duplicate_item *duplicate;
> +       struct item *head = {NULL};



This might be something new I should learn.

When I initialize a pointer, I always write like this:
        struct item *head = NULL;


I have never seen this style before:
        struct item *head = {NULL};

But, the compiler does not warn about it.
Could you educate me how it works?





> +       bool need_2_process = true;
> +       struct item *last = {NULL};
> +       struct item  *current;
> +       int verbose_mode = 0;
> +       uint64_t address;
> +       FILE *fp;
> +       int res;
> +
> +       if (argc < 2 || argc > 3) {
> +               printf("Usage: %s <nmfile> [-verbose]\n", argv[0]);
> +               return 1;
> +       }
> +
> +       if (argc == 3 && strcmp(argv[2], "-verbose") == 0)


"-v" or "--verbose" is better
(although there is no way to enable it.)


Anyway, I only see super-boring messages
even with -verbose given.

Scanning nm data()
Sorting nm data
Scanning nm data for duplicates
Applying suffixes



> +               verbose_mode = 1;
> +
> +       verbose_msg(verbose_mode, "Scanning nm data(%s)\n", argv[1]);
> +
> +       fp = fopen(argv[1], "r");
> +       if (!fp) {
> +               printf("Can't open input file.\n");
> +               return 1;
> +       }
> +
> +       if (!addr2line_init(get_addr2line(A2L_DEFAULT), get_vmlinux(argv[1])))
> +               return 1;
> +
> +       while (fscanf(fp, "%lx %c %99s\n", &address, &t, sym_name) == 3) {
> +               if (strstr(sym_name, "@_")) {
> +                       if (verbose_mode && need_2_process)
> +                               printf("Already processed\n");
> +                       need_2_process = false;
> +                       }
> +               last = add_item(&last, sym_name, t, address);
> +               if (!last) {
> +                       printf("Error in allocate memory\n");
> +                       free_items(&head);
> +                       return 1;
> +               }
> +
> +               if (!head)
> +                       head = last;
> +       }
> +
> +       fclose(fp);
> +
> +       if (need_2_process) {
> +               verbose_msg(verbose_mode, "Sorting nm data\n");
> +               sort_list_m(&head, BY_NAME);
> +               verbose_msg(verbose_mode, "Scanning nm data for duplicates\n");
> +               duplicate = find_duplicates(head);
> +               if (!duplicate) {
> +                       printf("Error in duplicates list\n");
> +                       return 1;
> +               }
> +
> +               verbose_msg(verbose_mode, "Applying suffixes\n");
> +               build_index(head);
> +               duplicate_iterator = duplicate;
> +               while (duplicate_iterator) {
> +                       res = filter_symbols(duplicate_iterator->original_item->symb_name,
> +                                            ignore_regex, sizeof(ignore_regex) /
> +                                            sizeof(ignore_regex[0]));
> +                       if (res != FMATCH &&
> +                           SYMB_NEEDS_ALIAS(duplicate_iterator->original_item)) {
> +                               if (res < 0)
> +                                       return 1;
> +
> +                               create_file_suffix(duplicate_iterator->original_item->symb_name,
> +                                                  duplicate_iterator->original_item->addr,
> +                                                  new_name, vmlinux_path);
> +                               if (!insert_after(head, duplicate_iterator->original_item->addr,
> +                                                 new_name, duplicate_iterator->original_item->addr,
> +                                                 duplicate_iterator->original_item->stype))
> +                                       return 1;
> +                       }
> +
> +                       duplicate_iterator = duplicate_iterator->next;
> +               }
> +
> +               sort_list_m(&head, BY_ADDRESS);



Do you sort just for finding duplicates?


Since you sort the list by name, and then sort by address.

It can potentially change the order.


[input]
00000000 D foo
00000000 D bar
00000010 D baz

[merge sort by name]
00000000 D bar
00000010 D baz
00000000 D foo

[merge sort by address = final result]
00000000 D bar
00000000 D foo
00000010 D baz


It may not be a big deal, but such a destructive algorithm is frowned.



You do not need to sort the list to find duplicates.


Another choice is a hashtable of { name : count }.
Traversing the list, you can count the number of occurrences.
If count >= 2, it is duplicated.

The average look-up of a hashtable is O(1).
So, the hashtable (or set) algorithm is O(N).


Sorting is O(N log N).





> diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> index a432b171be82..cacf60b597ce 100755
> --- a/scripts/link-vmlinux.sh
> +++ b/scripts/link-vmlinux.sh
> @@ -89,8 +89,9 @@ vmlinux_link()
>
>         ldflags="${ldflags} ${wl}--script=${objtree}/${KBUILD_LDS}"
>
> -       # The kallsyms linking does not need debug symbols included.
> -       if [ "$output" != "${output#.tmp_vmlinux.kallsyms}" ] ; then
> +       # The kallsyms linking does not need debug symbols included, unless the KALLSYMS_ALIAS.
> +       if [ ! is_enabled CONFIG_KALLSYMS_ALIAS ] && \


I observed this error message:


scripts/link-vmlinux.sh: 93: [: is_enabled: unexpected operator


The correct code is this:


        if ! is_enabled CONFIG_KALLSYMS_ALIAS && \










> +           [ "$output" != "${output#.tmp_vmlinux.kallsyms}" ] ; then
>                 ldflags="${ldflags} ${wl}--strip-debug"
>         fi
>
> @@ -161,7 +162,11 @@ kallsyms()
>         fi
>
>         info KSYMS ${2}
> -       scripts/kallsyms ${kallsymopt} ${1} > ${2}
> +       if is_enabled CONFIG_KALLSYMS_ALIAS; then
> +               ALIAS=".alias"
> +               scripts/kas_alias/kas_alias ${1} >${1}${ALIAS}
> +               fi
> +       scripts/kallsyms ${kallsymopt} ${1}${ALIAS} > ${2}
>  }
>
>  # Perform one step in kallsyms generation, including temporary linking of
> --
> 2.34.1
>


--
Best Regards



Masahiro Yamada

^ permalink raw reply

* Re: [RFC PATCH 1/1] smp: Change function signatures to use call_single_data_t
From: Leonardo Bras Soares Passos @ 2023-09-01  6:17 UTC (permalink / raw)
  To: Guo Ren
  Cc: Steven Rostedt, Masami Hiramatsu, Peter Zijlstra, Josh Poimboeuf,
	Valentin Schneider, Paul E. McKenney, Juergen Gross, Yury Norov,
	Imran Khan, linux-kernel, linux-trace-kernel
In-Reply-To: <CAJF2gTSo5opLeRrOevU_g5C9iJZO1O4=+F7vsrwbBrY_KqaYQA@mail.gmail.com>

On Thu, Aug 31, 2023 at 3:58 AM Guo Ren <guoren@kernel.org> wrote:
>
> On Thu, Aug 31, 2023 at 2:31 PM Leonardo Bras <leobras@redhat.com> wrote:
> >
> > call_single_data_t is a size-aligned typedef of struct __call_single_data.
> >
> > This alignment is desirable in order to have smp_call_function*() avoid
> > bouncing an extra cacheline in case of an unaligned csd, given this
> > would hurt performance.
> >
> > Since the removal of struct request->csd in commit 660e802c76c8
> > ("blk-mq: use percpu csd to remote complete instead of per-rq csd") there
> > are no current users of smp_call_function*() with unaligned csd.
> >
> > Change every 'struct __call_single_data' function parameter to
> > 'call_single_data_t', so we have warnings if any new code tries to
> > introduce an smp_call_function*() call with unaligned csd.
> I agree to prevent __call_single_data usage.
>
> Reviewed-by: Guo Ren <guoren@kernel.org>

Thanks for reviewing!


>
> /*
>  * structure shares (partial) layout with struct irq_work
>  */
> struct __call_single_data {
>         struct __call_single_node node;
>         smp_call_func_t func;
>         void *info;
> };
>
> #define CSD_INIT(_func, _info) \
>         (struct __call_single_data){ .func = (_func), .info = (_info), }
>
> /* Use __aligned() to avoid to use 2 cache lines for 1 csd */
> typedef struct __call_single_data call_single_data_t
>         __aligned(sizeof(struct __call_single_data));
>
> >
> > Signed-off-by: Leonardo Bras <leobras@redhat.com>
> > ---
> >  include/linux/smp.h        |  2 +-
> >  include/trace/events/csd.h |  8 ++++----
> >  kernel/smp.c               | 26 +++++++++++++-------------
> >  kernel/up.c                |  2 +-
> >  4 files changed, 19 insertions(+), 19 deletions(-)
> >
> > diff --git a/include/linux/smp.h b/include/linux/smp.h
> > index 91ea4a67f8ca..e87520dc2959 100644
> > --- a/include/linux/smp.h
> > +++ b/include/linux/smp.h
> > @@ -53,7 +53,7 @@ int smp_call_function_single(int cpuid, smp_call_func_t func, void *info,
> >  void on_each_cpu_cond_mask(smp_cond_func_t cond_func, smp_call_func_t func,
> >                            void *info, bool wait, const struct cpumask *mask);
> >
> > -int smp_call_function_single_async(int cpu, struct __call_single_data *csd);
> > +int smp_call_function_single_async(int cpu, call_single_data_t *csd);
> >
> >  /*
> >   * Cpus stopping functions in panic. All have default weak definitions.
> > diff --git a/include/trace/events/csd.h b/include/trace/events/csd.h
> > index 67e9d01f80c2..58cc83b99c34 100644
> > --- a/include/trace/events/csd.h
> > +++ b/include/trace/events/csd.h
> > @@ -12,7 +12,7 @@ TRACE_EVENT(csd_queue_cpu,
> >         TP_PROTO(const unsigned int cpu,
> >                 unsigned long callsite,
> >                 smp_call_func_t func,
> > -               struct __call_single_data *csd),
> > +               call_single_data_t *csd),
> >
> >         TP_ARGS(cpu, callsite, func, csd),
> >
> > @@ -39,7 +39,7 @@ TRACE_EVENT(csd_queue_cpu,
> >   */
> >  DECLARE_EVENT_CLASS(csd_function,
> >
> > -       TP_PROTO(smp_call_func_t func, struct __call_single_data *csd),
> > +       TP_PROTO(smp_call_func_t func, call_single_data_t *csd),
> >
> >         TP_ARGS(func, csd),
> >
> > @@ -57,12 +57,12 @@ DECLARE_EVENT_CLASS(csd_function,
> >  );
> >
> >  DEFINE_EVENT(csd_function, csd_function_entry,
> > -       TP_PROTO(smp_call_func_t func, struct __call_single_data *csd),
> > +       TP_PROTO(smp_call_func_t func, call_single_data_t *csd),
> >         TP_ARGS(func, csd)
> >  );
> >
> >  DEFINE_EVENT(csd_function, csd_function_exit,
> > -       TP_PROTO(smp_call_func_t func, struct __call_single_data *csd),
> > +       TP_PROTO(smp_call_func_t func, call_single_data_t *csd),
> >         TP_ARGS(func, csd)
> >  );
> >
> > diff --git a/kernel/smp.c b/kernel/smp.c
> > index 8455a53465af..8c714583786b 100644
> > --- a/kernel/smp.c
> > +++ b/kernel/smp.c
> > @@ -127,7 +127,7 @@ send_call_function_ipi_mask(struct cpumask *mask)
> >  }
> >
> >  static __always_inline void
> > -csd_do_func(smp_call_func_t func, void *info, struct __call_single_data *csd)
> > +csd_do_func(smp_call_func_t func, void *info, call_single_data_t *csd)
> >  {
> >         trace_csd_function_entry(func, csd);
> >         func(info);
> > @@ -174,7 +174,7 @@ module_param(csd_lock_timeout, ulong, 0444);
> >  static atomic_t csd_bug_count = ATOMIC_INIT(0);
> >
> >  /* Record current CSD work for current CPU, NULL to erase. */
> > -static void __csd_lock_record(struct __call_single_data *csd)
> > +static void __csd_lock_record(call_single_data_t *csd)
> >  {
> >         if (!csd) {
> >                 smp_mb(); /* NULL cur_csd after unlock. */
> > @@ -189,13 +189,13 @@ static void __csd_lock_record(struct __call_single_data *csd)
> >                   /* Or before unlock, as the case may be. */
> >  }
> >
> > -static __always_inline void csd_lock_record(struct __call_single_data *csd)
> > +static __always_inline void csd_lock_record(call_single_data_t *csd)
> >  {
> >         if (static_branch_unlikely(&csdlock_debug_enabled))
> >                 __csd_lock_record(csd);
> >  }
> >
> > -static int csd_lock_wait_getcpu(struct __call_single_data *csd)
> > +static int csd_lock_wait_getcpu(call_single_data_t *csd)
> >  {
> >         unsigned int csd_type;
> >
> > @@ -210,7 +210,7 @@ static int csd_lock_wait_getcpu(struct __call_single_data *csd)
> >   * the CSD_TYPE_SYNC/ASYNC types provide the destination CPU,
> >   * so waiting on other types gets much less information.
> >   */
> > -static bool csd_lock_wait_toolong(struct __call_single_data *csd, u64 ts0, u64 *ts1, int *bug_id)
> > +static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id)
> >  {
> >         int cpu = -1;
> >         int cpux;
> > @@ -276,7 +276,7 @@ static bool csd_lock_wait_toolong(struct __call_single_data *csd, u64 ts0, u64 *
> >   * previous function call. For multi-cpu calls its even more interesting
> >   * as we'll have to ensure no other cpu is observing our csd.
> >   */
> > -static void __csd_lock_wait(struct __call_single_data *csd)
> > +static void __csd_lock_wait(call_single_data_t *csd)
> >  {
> >         int bug_id = 0;
> >         u64 ts0, ts1;
> > @@ -290,7 +290,7 @@ static void __csd_lock_wait(struct __call_single_data *csd)
> >         smp_acquire__after_ctrl_dep();
> >  }
> >
> > -static __always_inline void csd_lock_wait(struct __call_single_data *csd)
> > +static __always_inline void csd_lock_wait(call_single_data_t *csd)
> >  {
> >         if (static_branch_unlikely(&csdlock_debug_enabled)) {
> >                 __csd_lock_wait(csd);
> > @@ -300,17 +300,17 @@ static __always_inline void csd_lock_wait(struct __call_single_data *csd)
> >         smp_cond_load_acquire(&csd->node.u_flags, !(VAL & CSD_FLAG_LOCK));
> >  }
> >  #else
> > -static void csd_lock_record(struct __call_single_data *csd)
> > +static void csd_lock_record(call_single_data_t *csd)
> >  {
> >  }
> >
> > -static __always_inline void csd_lock_wait(struct __call_single_data *csd)
> > +static __always_inline void csd_lock_wait(call_single_data_t *csd)
> >  {
> >         smp_cond_load_acquire(&csd->node.u_flags, !(VAL & CSD_FLAG_LOCK));
> >  }
> >  #endif
> >
> > -static __always_inline void csd_lock(struct __call_single_data *csd)
> > +static __always_inline void csd_lock(call_single_data_t *csd)
> >  {
> >         csd_lock_wait(csd);
> >         csd->node.u_flags |= CSD_FLAG_LOCK;
> > @@ -323,7 +323,7 @@ static __always_inline void csd_lock(struct __call_single_data *csd)
> >         smp_wmb();
> >  }
> >
> > -static __always_inline void csd_unlock(struct __call_single_data *csd)
> > +static __always_inline void csd_unlock(call_single_data_t *csd)
> >  {
> >         WARN_ON(!(csd->node.u_flags & CSD_FLAG_LOCK));
> >
> > @@ -376,7 +376,7 @@ void __smp_call_single_queue(int cpu, struct llist_node *node)
> >   * for execution on the given CPU. data must already have
> >   * ->func, ->info, and ->flags set.
> >   */
> > -static int generic_exec_single(int cpu, struct __call_single_data *csd)
> > +static int generic_exec_single(int cpu, call_single_data_t *csd)
> >  {
> >         if (cpu == smp_processor_id()) {
> >                 smp_call_func_t func = csd->func;
> > @@ -667,7 +667,7 @@ EXPORT_SYMBOL(smp_call_function_single);
> >   *
> >   * Return: %0 on success or negative errno value on error
> >   */
> > -int smp_call_function_single_async(int cpu, struct __call_single_data *csd)
> > +int smp_call_function_single_async(int cpu, call_single_data_t *csd)
> >  {
> >         int err = 0;
> >
> > diff --git a/kernel/up.c b/kernel/up.c
> > index a38b8b095251..df50828cc2f0 100644
> > --- a/kernel/up.c
> > +++ b/kernel/up.c
> > @@ -25,7 +25,7 @@ int smp_call_function_single(int cpu, void (*func) (void *info), void *info,
> >  }
> >  EXPORT_SYMBOL(smp_call_function_single);
> >
> > -int smp_call_function_single_async(int cpu, struct __call_single_data *csd)
> > +int smp_call_function_single_async(int cpu, call_single_data_t *csd)
> >  {
> >         unsigned long flags;
> >
> > --
> > 2.42.0
> >
>
>
> --
> Best Regards
>  Guo Ren
>


^ permalink raw reply

* [PATCH] tracing: Fix the unusable problem caused by non-empty pipe_cpumask
From: Chuang Wang @ 2023-09-01  7:26 UTC (permalink / raw)
  Cc: Chuang Wang, stable, Steven Rostedt, Masami Hiramatsu,
	Zheng Yejian, linux-kernel, linux-trace-kernel

There is a problem in the linux-6.5 when I execute the following
command:

 $ perf ftrace -t irqsoff
 failed to open trace_pipe

At the same time, when I open this file, the same error occurs.

Therefore, after carefully looking at the code, the open function of
'trace_pipe' returns -EBUSY in open_pipe_on_cpu() because no clearing
operation was performed when 'trace_array->pipe_cpumask' was allocated.

With this patch, Use zalloc_cpumask_var() to ensure that
'trace_array->pipe_cpumask' is reset to 0 when allocated.

Cc: stable@vger.kernel.org
Fixes: c2489bb7e6be ("tracing: Introduce pipe_cpumask to avoid race on trace_pipes")
Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
---
 kernel/trace/trace.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index d8233d34b5a0..c0b8a72f3466 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -9461,7 +9461,7 @@ static struct trace_array *trace_array_create(const char *name)
 	if (!alloc_cpumask_var(&tr->tracing_cpumask, GFP_KERNEL))
 		goto out_free_tr;
 
-	if (!alloc_cpumask_var(&tr->pipe_cpumask, GFP_KERNEL))
+	if (!zalloc_cpumask_var(&tr->pipe_cpumask, GFP_KERNEL))
 		goto out_free_tr;
 
 	tr->trace_flags = global_trace.trace_flags & ~ZEROED_TRACE_FLAGS;
@@ -10406,7 +10406,7 @@ __init static int tracer_alloc_buffers(void)
 	if (trace_create_savedcmd() < 0)
 		goto out_free_temp_buffer;
 
-	if (!alloc_cpumask_var(&global_trace.pipe_cpumask, GFP_KERNEL))
+	if (!zalloc_cpumask_var(&global_trace.pipe_cpumask, GFP_KERNEL))
 		goto out_free_savedcmd;
 
 	/* TODO: make the number of buffers hot pluggable with CPUS */
-- 
2.39.2


^ permalink raw reply related

* Re: [PATCH] tracing: Fix the unusable problem caused by non-empty pipe_cpumask
From: Masami Hiramatsu @ 2023-09-01 14:02 UTC (permalink / raw)
  To: Chuang Wang
  Cc: stable, Steven Rostedt, Masami Hiramatsu, Zheng Yejian,
	linux-kernel, linux-trace-kernel
In-Reply-To: <20230901072626.278880-1-nashuiliang@gmail.com>

On Fri,  1 Sep 2023 15:26:26 +0800
Chuang Wang <nashuiliang@gmail.com> wrote:

> There is a problem in the linux-6.5 when I execute the following
> command:
> 
>  $ perf ftrace -t irqsoff
>  failed to open trace_pipe
> 
> At the same time, when I open this file, the same error occurs.
> 
> Therefore, after carefully looking at the code, the open function of
> 'trace_pipe' returns -EBUSY in open_pipe_on_cpu() because no clearing
> operation was performed when 'trace_array->pipe_cpumask' was allocated.
> 
> With this patch, Use zalloc_cpumask_var() to ensure that
> 'trace_array->pipe_cpumask' is reset to 0 when allocated.

Good catch. This looks good to me.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

> 
> Cc: stable@vger.kernel.org
> Fixes: c2489bb7e6be ("tracing: Introduce pipe_cpumask to avoid race on trace_pipes")
> Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
> ---
>  kernel/trace/trace.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index d8233d34b5a0..c0b8a72f3466 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -9461,7 +9461,7 @@ static struct trace_array *trace_array_create(const char *name)
>  	if (!alloc_cpumask_var(&tr->tracing_cpumask, GFP_KERNEL))
>  		goto out_free_tr;
>  
> -	if (!alloc_cpumask_var(&tr->pipe_cpumask, GFP_KERNEL))
> +	if (!zalloc_cpumask_var(&tr->pipe_cpumask, GFP_KERNEL))
>  		goto out_free_tr;
>  
>  	tr->trace_flags = global_trace.trace_flags & ~ZEROED_TRACE_FLAGS;
> @@ -10406,7 +10406,7 @@ __init static int tracer_alloc_buffers(void)
>  	if (trace_create_savedcmd() < 0)
>  		goto out_free_temp_buffer;
>  
> -	if (!alloc_cpumask_var(&global_trace.pipe_cpumask, GFP_KERNEL))
> +	if (!zalloc_cpumask_var(&global_trace.pipe_cpumask, GFP_KERNEL))
>  		goto out_free_savedcmd;
>  
>  	/* TODO: make the number of buffers hot pluggable with CPUS */
> -- 
> 2.39.2
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply

* [PATCH 0/4] Minor v6.6 trace_events_filter fixes
From: Valentin Schneider @ 2023-09-01 15:10 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel; +Cc: Steven Rostedt, Masami Hiramatsu

Hi,

These are small fixes incorporating feedback on the trace filters part of [1].

Based on top of trace-v6.6.

Cheers,
Valentin

[1]: https://lore.kernel.org/all/20230720163056.2564824-1-vschneid@redhat.com/

Valentin Schneider (4):
  tracing/filters: Fix error-handling of cpulist parsing buffer
  tracing/filters: Fix double-free of struct filter_pred.mask
  tracing/filters: Change parse_pred() cpulist ternary into an if block
  tracing/filters: Fix coding style issues

 kernel/trace/trace_events_filter.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

--
2.31.1


^ permalink raw reply

* [PATCH 3/4] tracing/filters: Change parse_pred() cpulist ternary into an if block
From: Valentin Schneider @ 2023-09-01 15:10 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel; +Cc: Steven Rostedt, Masami Hiramatsu
In-Reply-To: <20230901151039.125186-1-vschneid@redhat.com>

Review comments noted that an if block would be clearer than a ternary, so
swap it out.

No change in behaviour intended

Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 kernel/trace/trace_events_filter.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index eb331e8b00b61..09b4733a2933d 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -1782,13 +1782,17 @@ static int parse_pred(const char *str, void *data,
 				FILTER_PRED_FN_CPUMASK;
 		} else if (field->filter_type == FILTER_CPU) {
 			if (single) {
-				pred->op = pred->op == OP_BAND ? OP_EQ : pred->op;
+				if (pred->op == OP_BAND)
+					pred->op = OP_EQ;
+
 				pred->fn_num = FILTER_PRED_FN_CPU;
 			} else {
 				pred->fn_num = FILTER_PRED_FN_CPU_CPUMASK;
 			}
 		} else if (single) {
-			pred->op = pred->op == OP_BAND ? OP_EQ : pred->op;
+			if (pred->op == OP_BAND)
+				pred->op = OP_EQ;
+
 			pred->fn_num = select_comparison_fn(pred->op, field->size, false);
 			if (pred->op == OP_NE)
 				pred->not = 1;
-- 
2.31.1


^ permalink raw reply related

* [PATCH 2/4] tracing/filters: Fix double-free of struct filter_pred.mask
From: Valentin Schneider @ 2023-09-01 15:10 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel; +Cc: Steven Rostedt, Masami Hiramatsu
In-Reply-To: <20230901151039.125186-1-vschneid@redhat.com>

When a cpulist filter is found to contain a single CPU, that CPU is saved
as a scalar and the backing cpumask storage is freed.

Also NULL the mask to avoid a double-free once we get down to
free_predicate().

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 kernel/trace/trace_events_filter.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index c06e1d596f4b9..eb331e8b00b61 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -1773,6 +1773,7 @@ static int parse_pred(const char *str, void *data,
 		if (single) {
 			pred->val = cpumask_first(pred->mask);
 			kfree(pred->mask);
+			pred->mask = NULL;
 		}
 
 		if (field->filter_type == FILTER_CPUMASK) {
-- 
2.31.1


^ permalink raw reply related

* [PATCH 1/4] tracing/filters: Fix error-handling of cpulist parsing buffer
From: Valentin Schneider @ 2023-09-01 15:10 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Steven Rostedt, Josh Poimboeuf, Masami Hiramatsu
In-Reply-To: <20230901151039.125186-1-vschneid@redhat.com>

parse_pred() allocates a string buffer to parse the user-provided cpulist,
but doesn't check the allocation result nor does it free the buffer once it
is no longer needed.

Add an allocation check, and free the buffer as soon as it is no longer
needed.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 kernel/trace/trace_events_filter.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 3a529214a21b7..c06e1d596f4b9 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -1744,17 +1744,23 @@ static int parse_pred(const char *str, void *data,
 
 		/* Copy the cpulist between { and } */
 		tmp = kmalloc((i - maskstart) + 1, GFP_KERNEL);
-		strscpy(tmp, str + maskstart, (i - maskstart) + 1);
+		if (!tmp)
+			goto err_mem;
 
+		strscpy(tmp, str + maskstart, (i - maskstart) + 1);
 		pred->mask = kzalloc(cpumask_size(), GFP_KERNEL);
-		if (!pred->mask)
+		if (!pred->mask) {
+			kfree(tmp);
 			goto err_mem;
+		}
 
 		/* Now parse it */
 		if (cpulist_parse(tmp, pred->mask)) {
+			kfree(tmp);
 			parse_error(pe, FILT_ERR_INVALID_CPULIST, pos + i);
 			goto err_free;
 		}
+		kfree(tmp);
 
 		/* Move along */
 		i++;
-- 
2.31.1


^ permalink raw reply related

* [PATCH 4/4] tracing/filters: Fix coding style issues
From: Valentin Schneider @ 2023-09-01 15:10 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel; +Cc: Steven Rostedt, Masami Hiramatsu
In-Reply-To: <20230901151039.125186-1-vschneid@redhat.com>

Recent commits have introduced some coding style issues, fix those up.

Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 kernel/trace/trace_events_filter.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 09b4733a2933d..33264e510d161 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -1360,7 +1360,7 @@ int filter_assign_type(const char *type)
 			return FILTER_DYN_STRING;
 		if (strstr(type, "cpumask_t"))
 			return FILTER_CPUMASK;
-		}
+	}
 
 	if (strstr(type, "__rel_loc") && strstr(type, "char"))
 		return FILTER_RDYN_STRING;
@@ -1731,7 +1731,9 @@ static int parse_pred(const char *str, void *data,
 		maskstart = i;
 
 		/* Walk the cpulist until closing } */
-		for (; str[i] && str[i] != '}'; i++);
+		for (; str[i] && str[i] != '}'; i++)
+			;
+
 		if (str[i] != '}') {
 			parse_error(pe, FILT_ERR_MISSING_BRACE_CLOSE, pos + i);
 			goto err_free;
-- 
2.31.1


^ permalink raw reply related

* [PATCH 1/3] tracing/user_events: Allow events to persist for perfmon_capable users
From: Beau Belgrave @ 2023-09-01 20:43 UTC (permalink / raw)
  To: rostedt, mhiramat; +Cc: linux-kernel, linux-trace-kernel, ast, dcook
In-Reply-To: <20230901204332.159-1-beaub@linux.microsoft.com>

There are several scenarios that have come up where having a user_event
persist even if the process that registered it exits. The main one is
having a daemon create events on bootup that shouldn't get deleted if
the daemon has to exit or reload. Another is within OpenTelemetry
exporters, they wish to potentially check if a user_event exists on the
system to determine if exporting the data out should occur. The
user_event in this case must exist even in the absence of the owning
process running (such as the above daemon case).

Expose the previously internal flag USER_EVENT_REG_PERSIST to user
processes. Upon register or delete of events with this flag, ensure the
user is perfmon_capable to prevent random user processes with access to
tracefs from creating events that persist after exit.

Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
---
 include/uapi/linux/user_events.h | 11 ++++++++++-
 kernel/trace/trace_events_user.c | 28 ++++++++++++++--------------
 2 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/include/uapi/linux/user_events.h b/include/uapi/linux/user_events.h
index 2984aae4a2b4..f74f3aedd49c 100644
--- a/include/uapi/linux/user_events.h
+++ b/include/uapi/linux/user_events.h
@@ -17,6 +17,15 @@
 /* Create dynamic location entry within a 32-bit value */
 #define DYN_LOC(offset, size) ((size) << 16 | (offset))
 
+/* List of supported registration flags */
+enum user_reg_flag {
+	/* Event will not delete upon last reference closing */
+	USER_EVENT_REG_PERSIST		= 1U << 0,
+
+	/* This value or above is currently non-ABI */
+	USER_EVENT_REG_MAX		= 1U << 1,
+};
+
 /*
  * Describes an event registration and stores the results of the registration.
  * This structure is passed to the DIAG_IOCSREG ioctl, callers at a minimum
@@ -33,7 +42,7 @@ struct user_reg {
 	/* Input: Enable size in bytes at address */
 	__u8	enable_size;
 
-	/* Input: Flags for future use, set to 0 */
+	/* Input: Flags to use, if any */
 	__u16	flags;
 
 	/* Input: Address to update when enabled */
diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
index 6f046650e527..af663de93492 100644
--- a/kernel/trace/trace_events_user.c
+++ b/kernel/trace/trace_events_user.c
@@ -49,18 +49,6 @@
 #define EVENT_STATUS_PERF BIT(1)
 #define EVENT_STATUS_OTHER BIT(7)
 
-/*
- * User register flags are not allowed yet, keep them here until we are
- * ready to expose them out to the user ABI.
- */
-enum user_reg_flag {
-	/* Event will not delete upon last reference closing */
-	USER_EVENT_REG_PERSIST		= 1U << 0,
-
-	/* This value or above is currently non-ABI */
-	USER_EVENT_REG_MAX		= 1U << 1,
-};
-
 /*
  * Stores the system name, tables, and locks for a group of events. This
  * allows isolation for events by various means.
@@ -1888,10 +1876,16 @@ static int user_event_parse(struct user_event_group *group, char *name,
 	int argc = 0;
 	char **argv;
 
-	/* User register flags are not ready yet */
-	if (reg_flags != 0 || flags != NULL)
+	/* Currently don't support any text based flags */
+	if (flags != NULL)
 		return -EINVAL;
 
+	/* Persistent events require CAP_PERFMON / CAP_SYS_ADMIN */
+	if (reg_flags & USER_EVENT_REG_PERSIST) {
+		if (!perfmon_capable())
+			return -EPERM;
+	}
+
 	/* Prevent dyn_event from racing */
 	mutex_lock(&event_mutex);
 	user = find_user_event(group, name, &key);
@@ -2024,6 +2018,12 @@ static int delete_user_event(struct user_event_group *group, char *name)
 	if (!user_event_last_ref(user))
 		return -EBUSY;
 
+	/* Persistent events require CAP_PERFMON / CAP_SYS_ADMIN */
+	if (user->reg_flags & USER_EVENT_REG_PERSIST) {
+		if (!perfmon_capable())
+			return -EPERM;
+	}
+
 	return destroy_user_event(user);
 }
 
-- 
2.34.1


^ permalink raw reply related

* [PATCH 0/3] tracing/user_events: Allow events to persist for perfmon_capable users
From: Beau Belgrave @ 2023-09-01 20:43 UTC (permalink / raw)
  To: rostedt, mhiramat; +Cc: linux-kernel, linux-trace-kernel, ast, dcook

There are several scenarios that have come up where having a user_event
persist even if the process that registered it exits. The main one is
having a daemon create events on bootup that shouldn't get deleted if
the daemon has to exit or reload. Another is within OpenTelemetry
exporters, they wish to potentially check if a user_event exists on the
system to determine if exporting the data out should occur. The
user_event in this case must exist even in the absence of the owning
process running (such as the above daemon case).

Since persistent events aren't automatically cleaned up, we want to ensure
only trusted users are allowed to do this. It seems reasonable to use
CAP_PERFMON as that boundary, since those users can already do many things
via perf_event_open without requiring full CAP_SYS_ADMIN.

This patchset brings back the ability to use /sys/kernel/tracing/dynamic_events
to create user_events, as persist is now back to being supported. Both the
register and delete of events that persist require CAP_PERFMON, which prevents
a non-perfmon user from making an event go away that a perfmon user decided
should persist.

Beau Belgrave (3):
  tracing/user_events: Allow events to persist for perfmon_capable users
  selftests/user_events: Test persist flag cases
  tracing/user_events: Document persist event flags

 Documentation/trace/user_events.rst           | 21 ++++++-
 include/uapi/linux/user_events.h              | 11 +++-
 kernel/trace/trace_events_user.c              | 28 +++++-----
 .../testing/selftests/user_events/abi_test.c  | 55 ++++++++++++++++++-
 .../testing/selftests/user_events/dyn_test.c  | 54 +++++++++++++++++-
 5 files changed, 149 insertions(+), 20 deletions(-)


base-commit: f940e482b0f889e697372a22b6c15da87aa1f63a
-- 
2.34.1


^ permalink raw reply

* [PATCH 3/3] tracing/user_events: Document persist event flags
From: Beau Belgrave @ 2023-09-01 20:43 UTC (permalink / raw)
  To: rostedt, mhiramat; +Cc: linux-kernel, linux-trace-kernel, ast, dcook
In-Reply-To: <20230901204332.159-1-beaub@linux.microsoft.com>

Users need to know how to make events persist, now that we allow for
that. We also now allow the dynamic_events file to create events by
utilizing the persist flag during event register.

Add back in to documentation how /sys/kernel/tracing/dynamic_events can
be used to create peristent user_events. Add a section under registering
for the currently supported flags (USER_EVENT_REG_PERSIST) and the
required permissions. Add a note under deleting that deleting a
persistent event also requires sufficient permission.

Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
---
 Documentation/trace/user_events.rst | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/Documentation/trace/user_events.rst b/Documentation/trace/user_events.rst
index e7b07313550a..c5388a47376f 100644
--- a/Documentation/trace/user_events.rst
+++ b/Documentation/trace/user_events.rst
@@ -14,6 +14,11 @@ Programs can view status of the events via
 /sys/kernel/tracing/user_events_status and can both register and write
 data out via /sys/kernel/tracing/user_events_data.
 
+Programs can also use /sys/kernel/tracing/dynamic_events to register and
+delete user based events via the u: prefix. The format of the command to
+dynamic_events is the same as the ioctl with the u: prefix applied. This
+requires CAP_PERFMON due to the event persisting, otherwise -EPERM is returned.
+
 Typically programs will register a set of events that they wish to expose to
 tools that can read trace_events (such as ftrace and perf). The registration
 process tells the kernel which address and bit to reflect if any tool has
@@ -45,7 +50,7 @@ This command takes a packed struct user_reg as an argument::
         /* Input: Enable size in bytes at address */
         __u8 enable_size;
 
-        /* Input: Flags for future use, set to 0 */
+        /* Input: Flags to be used, if any */
         __u16 flags;
 
         /* Input: Address to update when enabled */
@@ -69,7 +74,7 @@ The struct user_reg requires all the above inputs to be set appropriately.
   This must be 4 (32-bit) or 8 (64-bit). 64-bit values are only allowed to be
   used on 64-bit kernels, however, 32-bit can be used on all kernels.
 
-+ flags: The flags to use, if any. For the initial version this must be 0.
++ flags: The flags to use, if any.
   Callers should first attempt to use flags and retry without flags to ensure
   support for lower versions of the kernel. If a flag is not supported -EINVAL
   is returned.
@@ -80,6 +85,13 @@ The struct user_reg requires all the above inputs to be set appropriately.
 + name_args: The name and arguments to describe the event, see command format
   for details.
 
+The following flags are currently supported.
+
++ USER_EVENT_REG_PERSIST: The event will not delete upon the last reference
+  closing. Callers may use this if an event should exist even after the
+  process closes or unregisters the event. Requires CAP_PERFMON otherwise
+  -EPERM is returned.
+
 Upon successful registration the following is set.
 
 + write_index: The index to use for this file descriptor that represents this
@@ -141,7 +153,10 @@ event (in both user and kernel space). User programs should use a separate file
 to request deletes than the one used for registration due to this.
 
 **NOTE:** By default events will auto-delete when there are no references left
-to the event. Flags in the future may change this logic.
+to the event. If programs do not want auto-delete, they must use the
+USER_EVENT_REG_PERSIST flag when registering the event. Once that flag is used
+the event exists until DIAG_IOCSDEL is invoked. Both register and delete of an
+event that persists requires CAP_PERFMON, otherwise -EPERM is returned.
 
 Unregistering
 -------------
-- 
2.34.1


^ permalink raw reply related

* [PATCH 2/3] selftests/user_events: Test persist flag cases
From: Beau Belgrave @ 2023-09-01 20:43 UTC (permalink / raw)
  To: rostedt, mhiramat; +Cc: linux-kernel, linux-trace-kernel, ast, dcook
In-Reply-To: <20230901204332.159-1-beaub@linux.microsoft.com>

Now that we have exposed USER_EVENT_REG_PERSIST events can persist both
via the ABI and in the /sys/kernel/tracing/dynamic_events file.

Ensure both the ABI and DYN cases work by calling both during the parse
tests. Add new flags test that ensures only USER_EVENT_REG_PERSIST is
honored and any other flag is invalid.

Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
---
 .../testing/selftests/user_events/abi_test.c  | 55 ++++++++++++++++++-
 .../testing/selftests/user_events/dyn_test.c  | 54 +++++++++++++++++-
 2 files changed, 107 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/user_events/abi_test.c b/tools/testing/selftests/user_events/abi_test.c
index 5125c42efe65..b95fc15496ba 100644
--- a/tools/testing/selftests/user_events/abi_test.c
+++ b/tools/testing/selftests/user_events/abi_test.c
@@ -23,6 +23,18 @@
 const char *data_file = "/sys/kernel/tracing/user_events_data";
 const char *enable_file = "/sys/kernel/tracing/events/user_events/__abi_event/enable";
 
+static bool event_exists(void)
+{
+	int fd = open(enable_file, O_RDWR);
+
+	if (fd < 0)
+		return false;
+
+	close(fd);
+
+	return true;
+}
+
 static int change_event(bool enable)
 {
 	int fd = open(enable_file, O_RDWR);
@@ -46,7 +58,22 @@ static int change_event(bool enable)
 	return ret;
 }
 
-static int reg_enable(long *enable, int size, int bit)
+static int event_delete(void)
+{
+	int fd = open(data_file, O_RDWR);
+	int ret;
+
+	if (fd < 0)
+		return -1;
+
+	ret = ioctl(fd, DIAG_IOCSDEL, "__abi_event");
+
+	close(fd);
+
+	return ret;
+}
+
+static int reg_enable_flags(long *enable, int size, int bit, int flags)
 {
 	struct user_reg reg = {0};
 	int fd = open(data_file, O_RDWR);
@@ -57,6 +84,7 @@ static int reg_enable(long *enable, int size, int bit)
 
 	reg.size = sizeof(reg);
 	reg.name_args = (__u64)"__abi_event";
+	reg.flags = flags;
 	reg.enable_bit = bit;
 	reg.enable_addr = (__u64)enable;
 	reg.enable_size = size;
@@ -68,6 +96,11 @@ static int reg_enable(long *enable, int size, int bit)
 	return ret;
 }
 
+static int reg_enable(long *enable, int size, int bit)
+{
+	return reg_enable_flags(enable, size, bit, 0);
+}
+
 static int reg_disable(long *enable, int bit)
 {
 	struct user_unreg reg = {0};
@@ -121,6 +154,26 @@ TEST_F(user, enablement) {
 	ASSERT_EQ(0, change_event(false));
 }
 
+TEST_F(user, flags) {
+	/* USER_EVENT_REG_PERSIST is allowed */
+	ASSERT_EQ(0, reg_enable_flags(&self->check, sizeof(int), 0,
+				      USER_EVENT_REG_PERSIST));
+	ASSERT_EQ(0, reg_disable(&self->check, 0));
+
+	/* Ensure it exists after close and disable */
+	ASSERT_TRUE(event_exists());
+
+	/* Ensure we can delete it */
+	ASSERT_EQ(0, event_delete());
+
+	/* USER_EVENT_REG_MAX or above is not allowed */
+	ASSERT_EQ(-1, reg_enable_flags(&self->check, sizeof(int), 0,
+				       USER_EVENT_REG_MAX));
+
+	/* Ensure it does not exist after invalid flags */
+	ASSERT_FALSE(event_exists());
+}
+
 TEST_F(user, bit_sizes) {
 	/* Allow 0-31 bits for 32-bit */
 	ASSERT_EQ(0, reg_enable(&self->check, sizeof(int), 0));
diff --git a/tools/testing/selftests/user_events/dyn_test.c b/tools/testing/selftests/user_events/dyn_test.c
index 91a4444ad42b..f2a41bcb5ad8 100644
--- a/tools/testing/selftests/user_events/dyn_test.c
+++ b/tools/testing/selftests/user_events/dyn_test.c
@@ -16,9 +16,25 @@
 
 #include "../kselftest_harness.h"
 
+const char *dyn_file = "/sys/kernel/tracing/dynamic_events";
 const char *abi_file = "/sys/kernel/tracing/user_events_data";
 const char *enable_file = "/sys/kernel/tracing/events/user_events/__test_event/enable";
 
+static int event_delete(void)
+{
+	int fd = open(abi_file, O_RDWR);
+	int ret;
+
+	if (fd < 0)
+		return -1;
+
+	ret = ioctl(fd, DIAG_IOCSDEL, "__test_event");
+
+	close(fd);
+
+	return ret;
+}
+
 static bool wait_for_delete(void)
 {
 	int i;
@@ -63,7 +79,31 @@ static int unreg_event(int fd, int *check, int bit)
 	return ioctl(fd, DIAG_IOCSUNREG, &unreg);
 }
 
-static int parse(int *check, const char *value)
+static int parse_dyn(const char *value)
+{
+	int fd = open(dyn_file, O_RDWR | O_APPEND);
+	int len = strlen(value);
+	int ret;
+
+	if (fd == -1)
+		return -1;
+
+	ret = write(fd, value, len);
+
+	if (ret == len)
+		ret = 0;
+	else
+		ret = -1;
+
+	close(fd);
+
+	if (ret == 0)
+		event_delete();
+
+	return ret;
+}
+
+static int parse_abi(int *check, const char *value)
 {
 	int fd = open(abi_file, O_RDWR);
 	int ret;
@@ -89,6 +129,18 @@ static int parse(int *check, const char *value)
 	return ret;
 }
 
+static int parse(int *check, const char *value)
+{
+	int abi_ret = parse_abi(check, value);
+	int dyn_ret = parse_dyn(value);
+
+	/* Ensure both ABI and DYN parse the same way */
+	if (dyn_ret != abi_ret)
+		return -1;
+
+	return dyn_ret;
+}
+
 static int check_match(int *check, const char *first, const char *second, bool *match)
 {
 	int fd = open(abi_file, O_RDWR);
-- 
2.34.1


^ permalink raw reply related

* Re: [PATCH 0/3] tracing/user_events: Allow events to persist for perfmon_capable users
From: Steven Rostedt @ 2023-09-01 20:57 UTC (permalink / raw)
  To: Beau Belgrave; +Cc: mhiramat, linux-kernel, linux-trace-kernel, ast, dcook
In-Reply-To: <20230901204332.159-1-beaub@linux.microsoft.com>


Hi Beau,

Just an FYI, except for fixes, it is never a good ideal to send out patches
while the merge window is open. They will likely be ignored for the
entirety of the merge window.

-- Steve


On Fri,  1 Sep 2023 20:43:29 +0000
Beau Belgrave <beaub@linux.microsoft.com> wrote:

> There are several scenarios that have come up where having a user_event
> persist even if the process that registered it exits. The main one is
> having a daemon create events on bootup that shouldn't get deleted if
> the daemon has to exit or reload. Another is within OpenTelemetry
> exporters, they wish to potentially check if a user_event exists on the
> system to determine if exporting the data out should occur. The
> user_event in this case must exist even in the absence of the owning
> process running (such as the above daemon case).
> 
> Since persistent events aren't automatically cleaned up, we want to ensure
> only trusted users are allowed to do this. It seems reasonable to use
> CAP_PERFMON as that boundary, since those users can already do many things
> via perf_event_open without requiring full CAP_SYS_ADMIN.
> 
> This patchset brings back the ability to use /sys/kernel/tracing/dynamic_events
> to create user_events, as persist is now back to being supported. Both the
> register and delete of events that persist require CAP_PERFMON, which prevents
> a non-perfmon user from making an event go away that a perfmon user decided
> should persist.
> 
> Beau Belgrave (3):
>   tracing/user_events: Allow events to persist for perfmon_capable users
>   selftests/user_events: Test persist flag cases
>   tracing/user_events: Document persist event flags
> 
>  Documentation/trace/user_events.rst           | 21 ++++++-
>  include/uapi/linux/user_events.h              | 11 +++-
>  kernel/trace/trace_events_user.c              | 28 +++++-----
>  .../testing/selftests/user_events/abi_test.c  | 55 ++++++++++++++++++-
>  .../testing/selftests/user_events/dyn_test.c  | 54 +++++++++++++++++-
>  5 files changed, 149 insertions(+), 20 deletions(-)
> 
> 
> base-commit: f940e482b0f889e697372a22b6c15da87aa1f63a


^ permalink raw reply

* Re: [PATCH] tracing: Fix the unusable problem caused by non-empty pipe_cpumask
From: Steven Rostedt @ 2023-09-02  1:26 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Chuang Wang, stable, Zheng Yejian, linux-kernel,
	linux-trace-kernel, Brian Foster
In-Reply-To: <20230901230206.9bf0250291acd7bfbde46b53@kernel.org>

On Fri, 1 Sep 2023 23:02:06 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

> On Fri,  1 Sep 2023 15:26:26 +0800
> Chuang Wang <nashuiliang@gmail.com> wrote:
> 
> > There is a problem in the linux-6.5 when I execute the following
> > command:
> > 
> >  $ perf ftrace -t irqsoff
> >  failed to open trace_pipe
> > 
> > At the same time, when I open this file, the same error occurs.
> > 
> > Therefore, after carefully looking at the code, the open function of
> > 'trace_pipe' returns -EBUSY in open_pipe_on_cpu() because no clearing
> > operation was performed when 'trace_array->pipe_cpumask' was allocated.
> > 
> > With this patch, Use zalloc_cpumask_var() to ensure that
> > 'trace_array->pipe_cpumask' is reset to 0 when allocated.  


Brian Foster beat you to it.

  https://lore.kernel.org/linux-trace-kernel/20230831125500.986862-1-bfoster@redhat.com


> 
> Good catch. This looks good to me.
> 
> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 

Masami, I'll add your Reviewed-by to that commit.

Thanks!

-- Steve

> > 
> > Cc: stable@vger.kernel.org
> > Fixes: c2489bb7e6be ("tracing: Introduce pipe_cpumask to avoid race on trace_pipes")
> > Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
> > ---
> >  kernel/trace/trace.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> > index d8233d34b5a0..c0b8a72f3466 100644
> > --- a/kernel/trace/trace.c
> > +++ b/kernel/trace/trace.c
> > @@ -9461,7 +9461,7 @@ static struct trace_array *trace_array_create(const char *name)
> >  	if (!alloc_cpumask_var(&tr->tracing_cpumask, GFP_KERNEL))
> >  		goto out_free_tr;
> >  
> > -	if (!alloc_cpumask_var(&tr->pipe_cpumask, GFP_KERNEL))
> > +	if (!zalloc_cpumask_var(&tr->pipe_cpumask, GFP_KERNEL))
> >  		goto out_free_tr;
> >  
> >  	tr->trace_flags = global_trace.trace_flags & ~ZEROED_TRACE_FLAGS;
> > @@ -10406,7 +10406,7 @@ __init static int tracer_alloc_buffers(void)
> >  	if (trace_create_savedcmd() < 0)
> >  		goto out_free_temp_buffer;
> >  
> > -	if (!alloc_cpumask_var(&global_trace.pipe_cpumask, GFP_KERNEL))
> > +	if (!zalloc_cpumask_var(&global_trace.pipe_cpumask, GFP_KERNEL))
> >  		goto out_free_savedcmd;
> >  
> >  	/* TODO: make the number of buffers hot pluggable with CPUS */
> > -- 
> > 2.39.2
> >   
> 
> 


^ permalink raw reply

* Re: [PATCH v3] scripts/link-vmlinux.sh: Add alias to duplicate symbols for kallsyms
From: Masahiro Yamada @ 2023-09-02  6:36 UTC (permalink / raw)
  To: Alessandro Carminati (Red Hat)
  Cc: Masami Hiramatsu, Steven Rostedt, Daniel Bristot de Oliveira,
	Josh Poimboeuf, Luis Chamberlain, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Alexander Lobakin, Nick Alcock,
	Kris Van Hees, Eugene Loh, Francis Laniel, Viktor Malik,
	linux-kbuild, linux-kernel, linux-trace-kernel, live-patching
In-Reply-To: <20230828080423.3539686-1-alessandro.carminati@gmail.com>

On Mon, Aug 28, 2023 at 8:45 PM Alessandro Carminati (Red Hat)
<alessandro.carminati@gmail.com> wrote:
>
> From: Alessandro Carminati <alessandro.carminati@gmail.com>
>
> It is not uncommon for drivers or modules related to similar peripherals
> to have symbols with the exact same name.
> While this is not a problem for the kernel's binary itself, it becomes an
> issue when attempting to trace or probe specific functions using
> infrastructure like ftrace or kprobe.
>
> The tracing subsystem relies on the `nm -n vmlinux` output, which provides
> symbol information from the kernel's ELF binary. However, when multiple
> symbols share the same name, the standard nm output does not differentiate
> between them. This can lead to confusion and difficulty when trying to
> probe the intended symbol.
>
>  ~ # cat /proc/kallsyms | grep " name_show"
>  ffffffff8c4f76d0 t name_show
>  ffffffff8c9cccb0 t name_show
>  ffffffff8cb0ac20 t name_show
>  ffffffff8cc728c0 t name_show
>  ffffffff8ce0efd0 t name_show
>  ffffffff8ce126c0 t name_show
>  ffffffff8ce1dd20 t name_show
>  ffffffff8ce24e70 t name_show
>  ffffffff8d1104c0 t name_show
>  ffffffff8d1fe480 t name_show
>
> **kas_alias** addresses this challenge by extending the symbol names with
> unique suffixes during the kernel build process.
> The newly created aliases for these duplicated symbols are unique names
> that can be fed to the ftracefs interface. By doing so, it enables
> previously unreachable symbols to be probed.
>
>  ~ # cat /proc/kallsyms | grep " name_show"
>  ffffffff974f76d0 t name_show
>  ffffffff974f76d0 t name_show__alias__6340
>  ffffffff979cccb0 t name_show
>  ffffffff979cccb0 t name_show__alias__6341
>  ffffffff97b0ac20 t name_show
>  ffffffff97b0ac20 t name_show__alias__6342
>  ffffffff97c728c0 t name_show
>  ffffffff97c728c0 t name_show__alias__6343
>  ffffffff97e0efd0 t name_show
>  ffffffff97e0efd0 t name_show__alias__6344
>  ffffffff97e126c0 t name_show
>  ffffffff97e126c0 t name_show__alias__6345
>  ffffffff97e1dd20 t name_show
>  ffffffff97e1dd20 t name_show__alias__6346
>  ffffffff97e24e70 t name_show
>  ffffffff97e24e70 t name_show__alias__6347
>  ffffffff981104c0 t name_show
>  ffffffff981104c0 t name_show__alias__6348
>  ffffffff981fe480 t name_show
>  ffffffff981fe480 t name_show__alias__6349
>  ~ # echo "p:kprobes/evnt1 name_show__alias__6349" \
>  > >/sys/kernel/tracing/kprobe_events
>  ~ # cat /sys/kernel/tracing/kprobe_events
>  p:kprobes/evnt1 name_show__alias__6349
>
> Changes from v1:
> - Integrated changes requested by Masami to exclude symbols with prefixes
>   "_cfi" and "_pfx".
> - Introduced a small framework to handle patterns that need to be excluded
>   from the alias production.
> - Excluded other symbols using the framework.
> - Introduced the ability to discriminate between text and data symbols.
> - Added two new config symbols in this version: CONFIG_KALLSYMS_ALIAS_DATA,
>   which allows data for data, and CONFIG_KALLSYMS_ALIAS_DATA_ALL, which
>   excludes all filters and provides an alias for each duplicated symbol.
>
> https://lore.kernel.org/all/20230711151925.1092080-1-alessandro.carminati@gmail.com/
>
> Changes from v2:
> - Alias tags are created by querying DWARF information from the vmlinux.
> - The filename + line number is normalized and appended to the original name.
> - The tag begins with '@' to indicate the symbol source.
> - Not a change, but worth mentioning, since the alias is added to the existing
>   list, the old duplicated name is preserved, and the livepatch way of dealing
>   with duplicates is maintained.
> - Acknowledging the existence of scenarios where inlined functions declared in
>   header files may result in multiple copies due to compiler behavior, though
>    it is not actionable as it does not pose an operational issue.
> - Highlighting a single exception where the same name refers to different
>   functions: the case of "compat_binfmt_elf.c," which directly includes
>   "binfmt_elf.c" producing identical function copies in two separate
>   modules.
>
> sample from new v3
>
>  ~ # cat /proc/kallsyms | grep gic_mask_irq
>  ffffd0b03c04dae4 t gic_mask_irq
>  ffffd0b03c04dae4 t gic_mask_irq@_drivers_irqchip_irq-gic_c_167
>  ffffd0b03c050960 t gic_mask_irq
>  ffffd0b03c050960 t gic_mask_irq@_drivers_irqchip_irq-gic-v3_c_404
>  ~ #
>
> https://lore.kernel.org/all/20230714150326.1152359-1-alessandro.carminati@gmail.com/
>
> Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati@gmail.com>
> ---
>  init/Kconfig                        |  36 ++++
>  scripts/Makefile                    |   4 +
>  scripts/kas_alias/Makefile          |   4 +
>  scripts/kas_alias/a2l.c             | 268 ++++++++++++++++++++++++++++
>  scripts/kas_alias/a2l.h             |  32 ++++
>  scripts/kas_alias/duplicates_list.c |  70 ++++++++
>  scripts/kas_alias/duplicates_list.h |  15 ++
>  scripts/kas_alias/item_list.c       | 230 ++++++++++++++++++++++++
>  scripts/kas_alias/item_list.h       |  26 +++
>  scripts/kas_alias/kas_alias.c       | 217 ++++++++++++++++++++++
>  scripts/link-vmlinux.sh             |  11 +-
>  11 files changed, 910 insertions(+), 3 deletions(-)


I added some review comments in another thread, but
one of the biggest concerns might be "910 insertions".


What this program does is quite simple,
"find duplicated names, and call addr2line".



You wrote a lot of code to self-implement these:

 - sort function
 - parse PATH env variable to find addr2line
 - fork addr2line to establish pipe communications



Have you considered writing the code in Python (or Perl)?
Is it too slow?

Most of the functions you implemented are already
available in script languages.



I am not sure if "@<file-path>" is a good solution,
but the amount of the added code looks too much to me.




--
Best Regards
Masahiro Yamada

^ permalink raw reply

* Re: [PATCH v3] scripts/link-vmlinux.sh: Add alias to duplicate symbols for kallsyms
From: Alessandro Carminati @ 2023-09-02  7:26 UTC (permalink / raw)
  To: Francis Laniel
  Cc: Masami Hiramatsu, Steven Rostedt, Daniel Bristot de Oliveira,
	Josh Poimboeuf, Masahiro Yamada, Luis Chamberlain,
	Nathan Chancellor, Nick Desaulniers, Nicolas Schier,
	Alexander Lobakin, Nick Alcock, Kris Van Hees, Eugene Loh,
	Viktor Malik, linux-kbuild, linux-kernel, linux-trace-kernel,
	live-patching
In-Reply-To: <5966520.lOV4Wx5bFT@pwmachine>

Hi Francis,

I want to express my gratitude for your review and the time you took
to provide it.

Il giorno mar 29 ago 2023 alle ore 16:51 Francis Laniel
<flaniel@linux.microsoft.com> ha scritto:
>
> Hi.
>
> Le lundi 28 août 2023, 10:04:23 CEST Alessandro Carminati (Red Hat) a écrit :
> > From: Alessandro Carminati <alessandro.carminati@gmail.com>
> >
> > It is not uncommon for drivers or modules related to similar peripherals
> > to have symbols with the exact same name.
> > While this is not a problem for the kernel's binary itself, it becomes an
> > issue when attempting to trace or probe specific functions using
> > infrastructure like ftrace or kprobe.
>
> Thank you for the patch!
> I tested it and it works fine:
> root@vm-amd64:~# grep -m 6 ' name_show' /proc/kallsyms
> ffffffff81108330 t name_show
> ffffffff81108330 t name_show@_kernel_irq_irqdesc_c_264
> ffffffff816d2dd0 t name_show
> ffffffff816d2dd0 t name_show@_drivers_pnp_card_c_186
> ffffffff81814990 t name_show
> ffffffff81814990 t name_show@_drivers_gpu_drm_i915_gt_sysfs_engines_c_26
>
> On the paper, it combines well with my other patch forbidding tracing non
> unique symbols.
> I will nonetheless need to try yours on top of mine and I will let you know
> the result, even though everything should go well.
>
> Regarding the code itself, from my review your implementation is good.
> I was just wondering if we cannot avoid the first sort BY_NAME by modifying
> find_duplicates() and the returned structured, but the index you are using
> later should reduce the loopthrough time and compensate the first sort.
> I have some specific comments but they are mainly nits:
>
> > The tracing subsystem relies on the `nm -n vmlinux` output, which provides
> > symbol information from the kernel's ELF binary. However, when multiple
> > symbols share the same name, the standard nm output does not differentiate
> > between them. This can lead to confusion and difficulty when trying to
> > probe the intended symbol.
> >
> >  ~ # cat /proc/kallsyms | grep " name_show"
> >  ffffffff8c4f76d0 t name_show
> >  ffffffff8c9cccb0 t name_show
> >  ffffffff8cb0ac20 t name_show
> >  ffffffff8cc728c0 t name_show
> >  ffffffff8ce0efd0 t name_show
> >  ffffffff8ce126c0 t name_show
> >  ffffffff8ce1dd20 t name_show
> >  ffffffff8ce24e70 t name_show
> >  ffffffff8d1104c0 t name_show
> >  ffffffff8d1fe480 t name_show
> >
> > **kas_alias** addresses this challenge by extending the symbol names with
> > unique suffixes during the kernel build process.
> > The newly created aliases for these duplicated symbols are unique names
> > that can be fed to the ftracefs interface. By doing so, it enables
> > previously unreachable symbols to be probed.
> >
> >  ~ # cat /proc/kallsyms | grep " name_show"
> >  ffffffff974f76d0 t name_show
> >  ffffffff974f76d0 t name_show__alias__6340
>
> This output is outdated.
> This is not a big problem but it would be better with the updated one as the
> "@_file_line" is better.
>
> >  ffffffff979cccb0 t name_show
> >  ffffffff979cccb0 t name_show__alias__6341
> >  ffffffff97b0ac20 t name_show
> >  ffffffff97b0ac20 t name_show__alias__6342
> >  ffffffff97c728c0 t name_show
> >  ffffffff97c728c0 t name_show__alias__6343
> >  ffffffff97e0efd0 t name_show
> >  ffffffff97e0efd0 t name_show__alias__6344
> >  ffffffff97e126c0 t name_show
> >  ffffffff97e126c0 t name_show__alias__6345
> >  ffffffff97e1dd20 t name_show
> >  ffffffff97e1dd20 t name_show__alias__6346
> >  ffffffff97e24e70 t name_show
> >  ffffffff97e24e70 t name_show__alias__6347
> >  ffffffff981104c0 t name_show
> >  ffffffff981104c0 t name_show__alias__6348
> >  ffffffff981fe480 t name_show
> >  ffffffff981fe480 t name_show__alias__6349
> >  ~ # echo "p:kprobes/evnt1 name_show__alias__6349" \
> >
> >  > >/sys/kernel/tracing/kprobe_events
> >
> >  ~ # cat /sys/kernel/tracing/kprobe_events
> >  p:kprobes/evnt1 name_show__alias__6349
> >
> > Changes from v1:
> > - Integrated changes requested by Masami to exclude symbols with prefixes
> >   "_cfi" and "_pfx".
> > - Introduced a small framework to handle patterns that need to be excluded
> >   from the alias production.
> > - Excluded other symbols using the framework.
> > - Introduced the ability to discriminate between text and data symbols.
> > - Added two new config symbols in this version: CONFIG_KALLSYMS_ALIAS_DATA,
> >   which allows data for data, and CONFIG_KALLSYMS_ALIAS_DATA_ALL, which
> >   excludes all filters and provides an alias for each duplicated symbol.
> >
> > https://lore.kernel.org/all/20230711151925.1092080-1-alessandro.carminati@gm
> > ail.com/
> >
> > Changes from v2:
> > - Alias tags are created by querying DWARF information from the vmlinux.
> > - The filename + line number is normalized and appended to the original
> > name. - The tag begins with '@' to indicate the symbol source.
> > - Not a change, but worth mentioning, since the alias is added to the
> > existing list, the old duplicated name is preserved, and the livepatch way
> > of dealing with duplicates is maintained.
> > - Acknowledging the existence of scenarios where inlined functions declared
> > in header files may result in multiple copies due to compiler behavior,
> > though it is not actionable as it does not pose an operational issue.
> > - Highlighting a single exception where the same name refers to different
> >   functions: the case of "compat_binfmt_elf.c," which directly includes
> >   "binfmt_elf.c" producing identical function copies in two separate
> >   modules.
> >
> > sample from new v3
> >
> >  ~ # cat /proc/kallsyms | grep gic_mask_irq
> >  ffffd0b03c04dae4 t gic_mask_irq
> >  ffffd0b03c04dae4 t gic_mask_irq@_drivers_irqchip_irq-gic_c_167
> >  ffffd0b03c050960 t gic_mask_irq
> >  ffffd0b03c050960 t gic_mask_irq@_drivers_irqchip_irq-gic-v3_c_404
> >  ~ #
> >
> > https://lore.kernel.org/all/20230714150326.1152359-1-alessandro.carminati@gm
> > ail.com/
> >
> > Signed-off-by: Alessandro Carminati (Red Hat)
> > <alessandro.carminati@gmail.com> ---
> >  init/Kconfig                        |  36 ++++
> >  scripts/Makefile                    |   4 +
> >  scripts/kas_alias/Makefile          |   4 +
> >  scripts/kas_alias/a2l.c             | 268 ++++++++++++++++++++++++++++
> >  scripts/kas_alias/a2l.h             |  32 ++++
> >  scripts/kas_alias/duplicates_list.c |  70 ++++++++
> >  scripts/kas_alias/duplicates_list.h |  15 ++
> >  scripts/kas_alias/item_list.c       | 230 ++++++++++++++++++++++++
> >  scripts/kas_alias/item_list.h       |  26 +++
> >  scripts/kas_alias/kas_alias.c       | 217 ++++++++++++++++++++++
> >  scripts/link-vmlinux.sh             |  11 +-
> >  11 files changed, 910 insertions(+), 3 deletions(-)
> >  create mode 100644 scripts/kas_alias/Makefile
> >  create mode 100644 scripts/kas_alias/a2l.c
> >  create mode 100644 scripts/kas_alias/a2l.h
> >  create mode 100644 scripts/kas_alias/duplicates_list.c
> >  create mode 100644 scripts/kas_alias/duplicates_list.h
> >  create mode 100644 scripts/kas_alias/item_list.c
> >  create mode 100644 scripts/kas_alias/item_list.h
> >  create mode 100644 scripts/kas_alias/kas_alias.c
> >
> > diff --git a/init/Kconfig b/init/Kconfig
> > index f7f65af4ee12..bc69fcd9cbc8 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -1737,6 +1737,42 @@ config KALLSYMS_BASE_RELATIVE
> >         time constants, and no relocation pass is required at runtime to fix
> >         up the entries based on the runtime load address of the kernel.
> >
> > +config KALLSYMS_ALIAS
> > +     bool "Produces alias for duplicated symbols" if EXPERT
> > +     depends on KALLSYMS && (DEBUG_INFO_DWARF4 || DEBUG_INFO_DWARF5)
> > +     help
> > +       It is not uncommon for drivers or modules related to similar
> > +       peripherals to have symbols with the exact same name.
> > +       While this is not a problem for the kernel's binary itself, it
> > +       becomes an issue when attempting to trace or probe specific
> > +       functions using infrastructure like ftrace or kprobe.
> > +
> > +       This option addresses this challenge by extending the symbol names
> > +       with unique suffixes during the kernel build process.
> > +       The newly created aliases for these duplicated symbols are unique
> > +       names that can be fed to the ftrace sysfs interface. By doing so, it
> > +       enables previously unreachable symbols to be probed.
> > +
> > +config CONFIG_KALLSYMS_ALIAS_DATA
> > +     bool "Produces alias also for data"
> > +     depends on KALLSYMS_ALIAS
> > +     help
> > +       Sometimes it can be useful to refer to data. In live patch
> scenarios,
> > +       you may find yourself needing to use symbols that are shared with
> > +       other functions. Since symbols face the same issue as functions,
> this
> > +       option allows you to create aliases for data as well.
> > +
> > +config CONFIG_KALLSYMS_ALIAS_DATA_ALL
> > +     bool "Removes all filter when producing data alias"
> > +     depends on CONFIG_KALLSYMS_ALIAS_DATA
> > +     help
> > +       When selecting data aliases, not all symbols are included in the set
> > +       This is because many symbols are unlikely to be used. If you choose
> > +       to have an alias for all data symbols, be aware that it will
> > +       significantly increase the size.
> > +
> > +       If unsure, say N.
> > +
> >  # end of the "standard kernel features (expert users)" menu
> >
> >  # syscall, maps, verifier
> > diff --git a/scripts/Makefile b/scripts/Makefile
> > index 32b6ba722728..65fafe17cfe5 100644
> > --- a/scripts/Makefile
> > +++ b/scripts/Makefile
> > @@ -49,3 +49,7 @@ subdir-$(CONFIG_SECURITY_SELINUX) += selinux
> >
> >  # Let clean descend into subdirs
> >  subdir-      += basic dtc gdb kconfig mod
> > +
> > +# KALLSyms alias
> > +subdir-$(CONFIG_KALLSYMS_ALIAS) += kas_alias
> > +
> > diff --git a/scripts/kas_alias/Makefile b/scripts/kas_alias/Makefile
> > new file mode 100644
> > index 000000000000..e1fde69232b4
> > --- /dev/null
> > +++ b/scripts/kas_alias/Makefile
> > @@ -0,0 +1,4 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +hostprogs-always-$(CONFIG_KALLSYMS_ALIAS)    += kas_alias
> > +
> > +kas_alias-objs        := duplicates_list.o item_list.o kas_alias.o a2l.o
> > diff --git a/scripts/kas_alias/a2l.c b/scripts/kas_alias/a2l.c
> > new file mode 100644
> > index 000000000000..a9692ac30180
> > --- /dev/null
> > +++ b/scripts/kas_alias/a2l.c
> > @@ -0,0 +1,268 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <string.h>
> > +#include <unistd.h>
> > +#include <sys/types.h>
> > +#include <sys/wait.h>
> > +#include <string.h>
> > +#include <stdint.h>
> > +#include <stdbool.h>
> > +
> > +#include "a2l.h"
> > +
> > +int addr2line_pid = -1;
> > +int a2l_in[2];
> > +int a2l_out[2];
> > +char line[MAX_BUF];
> > +char vmlinux_path[MAX_BUF];
> > +char addr2line_cmd[MAX_CMD_LEN];
> > +FILE *a2l_stdin, *a2l_stdout;
> > +
> > +static char *normalize_path(const char *input_path, char *output_path)
> > +{
> > +     char *prev_token = NULL;
> > +     char *delimiter = "/";
> > +     char inbuf[MAX_BUF];
>
> Rather than using memset below you can initialize it like this above:
> char inbuf[MAX_BUF] = {0};
>
> > +     char *token;
> > +     char *pos;
> > +
> > +     memset(inbuf, 0, MAX_BUF);
> > +     *output_path = '\0';
> > +     strncpy(inbuf, input_path, MAX_BUF);
> > +     if (!input_path || !output_path || strlen(input_path) == 0)
> > +             return NULL;
> > +
> > +     token = strtok(inbuf, delimiter);
> > +     while (token) {
> > +             if (strcmp(token, "..") == 0 && prev_token) {
> > +                     pos = strrchr(output_path, '/');
>
> > +                     if (pos)
> > +                             *pos = '\0';
> > +
> > +             } else if (strcmp(token, ".") != 0) {
> > +                     strcat(output_path, "/");
> > +                     strcat(output_path, token);
> > +             }
> > +
> > +             prev_token = token;
> > +             token = strtok(NULL, delimiter);
> > +     }
> > +
> > +     return output_path;
> > +}
> > +
> > +static void path_of(const char *full_path, char *path)
> > +{
> > +     const char *last_slash = strrchr(full_path, '/');
> > +     size_t path_length;
> > +     char cwd[MAX_BUF];
> > +
> > +     if (!last_slash) {
> > +             if (getcwd(cwd, sizeof(cwd)))
> > +                     strcpy(path, cwd);
> > +             else
> > +                     strcpy(path, ".");
> > +     } else {
> > +             path_length = last_slash - full_path;
> > +             strncpy(path, full_path, path_length);
> > +             path[path_length] = '\0';
> > +     }
> > +}
> > +
> > +static bool file_exists(const char *file_path)
> > +{
> > +     FILE *file;
> > +
> > +     file = fopen(file_path, "r");
> > +     if (file) {
> > +             fclose(file);
> > +             return true;
> > +     }
> > +     return false;
> > +}
> > +
> > +int addr2line_init(const char *cmd, const char *vmlinux)
> > +{
> > +     if ((!file_exists(cmd)) || (!file_exists(vmlinux))) {
> > +             printf("file not found\n");
> > +             return 0;
> > +             }
> > +
> > +     path_of(vmlinux, vmlinux_path);
> > +     if (pipe(a2l_in) == -1) {
> > +             printf("Failed to create pipe\n");
> > +             return 0;
> > +     }
> > +
> > +     if (pipe(a2l_out) == -1) {
> > +             printf("Failed to create pipe\n");
> > +             return 0;
> > +     }
> > +
> > +     addr2line_pid = fork();
> > +     if (addr2line_pid == -1) {
> > +             printf("Failed to fork process\n");
> > +             close(a2l_in[P_READ]);
> > +             close(a2l_in[P_WRITE]);
> > +             close(a2l_out[P_READ]);
> > +             close(a2l_out[P_WRITE]);
> > +             return 0;
> > +     }
> > +
> > +     if (addr2line_pid == 0) {
> > +             dup2(a2l_in[P_READ], 0);
> > +             dup2(a2l_out[P_WRITE], 1);
> > +             close(a2l_in[P_WRITE]);
> > +             close(a2l_out[P_READ]);
> > +
> > +             execlp(cmd, cmd, ADDR2LINE_ARGS, vmlinux, NULL);
> > +
> > +             printf("Failed to execute addr2line command\n");
> > +             exit(1);
> > +     } else {
> > +             close(a2l_in[P_READ]);
> > +             close(a2l_out[P_WRITE]);
> > +     }
> > +
> > +     a2l_stdin = fdopen(a2l_in[P_WRITE], "w");
> > +     if (!a2l_stdin) {
> > +             printf("Failed to open pipe a2l_in\n");
> > +             return 0;
> > +     }
> > +
> > +     a2l_stdout = fdopen(a2l_out[P_READ], "r");
> > +     if (!a2l_stdout) {
> > +             printf("Failed to open pipe a2l_out\n");
> > +             fclose(a2l_stdin);
> > +             return 0;
> > +     }
> > +
> > +     return 1;
> > +}
> > +
> > +const char *remove_subdir(const char *home, const char *f_path)
> > +{
> > +     int i = 0;
> > +
> > +     while (*(home + i) == *(f_path + i))
>
> Using home[i] and f_path[i] should do the trick too.
>
> > +             i++;
> > +
> > +     return (strlen(home) != i) ? NULL : f_path + i;
> > +}
> > +
> > +char *addr2line_get_lines(uint64_t address)
> > +{
> > +     char buf[MAX_BUF];
> > +
> > +     fprintf(a2l_stdin, "%08lx\n", address);
> > +     fflush(a2l_stdin);
> > +
> > +     if (!fgets(line, sizeof(line), a2l_stdout)) {
> > +             printf("Failed to read lines from addr2line\n");
> > +             return NULL;
> > +     }
> > +
> > +     if (!fgets(line, sizeof(line), a2l_stdout)) {
> > +             printf("Failed to read lines from addr2line\n");
> > +             return NULL;
> > +     }
> > +     line[strcspn(line, "\n")] = '\0';
> > +     strncpy(buf, line, MAX_BUF);
> > +     return normalize_path(buf, line);
> > +}
> > +
> > +int addr2line_cleanup(void)
> > +{
> > +     int status;
> > +
> > +     if (addr2line_pid != -1) {
> > +             kill(addr2line_pid, SIGKILL);
> > +             waitpid(addr2line_pid, &status, 0);
> > +             fclose(a2l_stdin);
> > +             fclose(a2l_stdout);
> > +             addr2line_pid = -1;
> > +     }
> > +
> > +     return 1;
> > +}
> > +
> > +static char *find_executable(const char *command)
>
> Nice function, I would rather have tried to execve() the command.
>
> > +{
> > +     char *path_env = getenv("PATH");
> > +     char *executable_path;
> > +     char *path_copy;
> > +     char *path;
> > +     int n;
> > +
> > +     if (!path_env)
> > +             return NULL;
> > +
> > +     path_copy = strdup(path_env);
> > +     if (!path_copy)
> > +             return NULL;
> > +
> > +     path = strtok(path_copy, ":");
> > +     while (path) {
> > +             n = snprintf(0, 0, "%s/%s", path, command);
> > +             executable_path = (char *)malloc(n + 1);
> > +             snprintf(executable_path, n + 1, "%s/%s", path, command);
> > +             if (access(executable_path, X_OK) == 0) {
> > +                     free(path_copy);
> > +                     return executable_path;
> > +             }
> > +
> > +     path = strtok(NULL, ":");
> > +     free(executable_path);
> > +     executable_path = NULL;
>
> > +     }
> > +
> > +     free(path_copy);
> > +     if (executable_path)
> > +             free(executable_path);
>
> If you arrive here, executable_path should be NULL, right?
>
> > +     return NULL;
> > +}
> > +
> > +const char *get_addr2line(int mode)
> > +{
> > +     char *buf = "";
> > +
> > +     switch (mode) {
> > +     case A2L_CROSS:
> > +             buf = getenv("CROSS_COMPILE");
> > +             memcpy(addr2line_cmd, buf, strlen(buf));
> > +     case A2L_DEFAULT:
> > +             memcpy(addr2line_cmd + strlen(buf), ADDR2LINE,
> strlen(ADDR2LINE));
> > +             buf = find_executable(addr2line_cmd);
> > +             if (buf) {
> > +                     memcpy(addr2line_cmd, buf, strlen(buf));
> > +                     free(buf);
> > +             }
> > +             return addr2line_cmd;
> > +     case A2L_LLVM:
> > +     default:
> > +             return NULL;
> > +     }
> > +}
> > +
> > +char *get_vmlinux(char *input)
> > +{
> > +     const char *match_string1 = ".syms";
> > +     const char *match_string2 = ".tmp_vmlinux.kallsyms";
> > +     char *result = NULL;
> > +     char *match_pos;
> > +
> > +     match_pos = strstr(input, match_string1);
> > +     if (!match_pos)
> > +             return NULL;
> > +
> > +     match_pos = strstr(input, match_string2);
> > +     if (!match_pos)
> > +             return NULL;
> > +
> > +     result = strdup(input);
> > +     match_pos = strstr(result, match_string1);
> > +     *match_pos = '\0';
> > +     return result;
> > +}
> > diff --git a/scripts/kas_alias/a2l.h b/scripts/kas_alias/a2l.h
> > new file mode 100644
> > index 000000000000..ca6419229dde
> > --- /dev/null
> > +++ b/scripts/kas_alias/a2l.h
> > @@ -0,0 +1,32 @@
> > +/* SPDX-License-Identifier: GPL-2.0-or-later */
> > +#ifndef A2L_H
> > +#define A2L_H
> > +#include <stdint.h>
> > +
> > +#define ADDR2LINE "addr2line"
> > +#define ADDR2LINE_ARGS "-fe"
> > +//#define VMLINUX "vmlinux"
> > +#define MAX_BUF 4096
> > +#define MAX_CMD_LEN 256
> > +#define P_READ 0
> > +#define P_WRITE 1
> > +#define A2L_DEFAULT 1
> > +#define A2L_CROSS 2
> > +#define A2L_LLVM 3
> > +#define A2L_MAKE_VALUE 2
> > +
> > +extern int addr2line_pid;
> > +extern int a2l_in[2];
> > +extern int a2l_out[2];
> > +extern char line[MAX_BUF];
> > +extern char vmlinux_path[MAX_BUF];
> > +extern char addr2line_cmd[MAX_CMD_LEN];
> > +
> > +int addr2line_init(const char *cmd, const char *vmlinux);
> > +char *addr2line_get_lines(uint64_t address);
> > +int addr2line_cleanup(void);
> > +const char *remove_subdir(const char *home, const char *f_path);
> > +const char *get_addr2line(int mode);
> > +char *get_vmlinux(char *input);
> > +
> > +#endif
> > diff --git a/scripts/kas_alias/duplicates_list.c
> > b/scripts/kas_alias/duplicates_list.c new file mode 100644
> > index 000000000000..e7a3d2917937
> > --- /dev/null
> > +++ b/scripts/kas_alias/duplicates_list.c
> > @@ -0,0 +1,70 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +#include <stdint.h>
> > +#include <stdio.h>
> > +#include <string.h>
> > +#include <stdlib.h>
> > +#include <stdbool.h>
> > +
> > +#include "item_list.h"
> > +#include "duplicates_list.h"
> > +
> > +struct duplicate_item *find_duplicates(struct item *list)
> > +{
> > +     struct duplicate_item *current_duplicate = NULL;
> > +     struct duplicate_item *duplicates = NULL;
> > +     struct duplicate_item *new_duplicate;
> > +     struct item *current_item = list;
> > +     bool prev_was_duplicate = false;
> > +     struct item *prev_item = NULL;
> > +
> > +     while (current_item) {
> > +             if ((prev_item && (strcmp(current_item->symb_name, prev_item-
> >symb_name)
> > == 0)) || +               prev_was_duplicate) {
> > +                     if (!duplicates) {
> > +                             duplicates = malloc(sizeof(struct
> duplicate_item));
> > +                             if (!duplicates)
> > +                                     return NULL;
> > +
> > +                             duplicates->original_item = prev_item;
> > +                             duplicates->next = NULL;
> > +                             current_duplicate = duplicates;
> > +                     } else {
> > +                             new_duplicate = malloc(sizeof(struct
> duplicate_item));
> > +                             if (!new_duplicate) {
> > +                                     free_duplicates(&duplicates);
> > +                                     return NULL;
> > +                             }
> > +
> > +                             new_duplicate->original_item = prev_item;
> > +                             new_duplicate->next = NULL;
> > +                             current_duplicate->next = new_duplicate;
> > +                             current_duplicate = new_duplicate;
> > +
> > +                             if ((strcmp(current_item->symb_name, prev_item-
> >symb_name) != 0) &&
> > +                                 (prev_was_duplicate))
> > +                                     prev_was_duplicate = false;
> > +                             else
> > +                                     prev_was_duplicate = true;
>
> You can remove the branch and set prev_was_duplicate to:
> prev_was_duplicate = strcmp(current_item->symb_name, prev_item->symb_name) !=
> 0 && prev_was_duplicate;
 I agree with all your points, except for one.
While your statement is undoubtedly more concise than mine, in the v4,
I actually use its negation.
>
> > +                     }
> > +             }
> > +
> > +             prev_item = current_item;
> > +             current_item = current_item->next;
> > +     }
> > +
> > +     return duplicates;
> > +}
> > +
> > +void free_duplicates(struct duplicate_item **duplicates)
> > +{
> > +     struct duplicate_item *duplicates_iterator = *duplicates;
> > +     struct duplicate_item *app;
> > +
> > +     while (duplicates_iterator) {
> > +             app = duplicates_iterator;
> > +             duplicates_iterator = duplicates_iterator->next;
> > +             free(app);
> > +     }
> > +
> > +     *duplicates = NULL;
> > +}
> > diff --git a/scripts/kas_alias/duplicates_list.h
> > b/scripts/kas_alias/duplicates_list.h new file mode 100644
> > index 000000000000..76aa73e584bc
> > --- /dev/null
> > +++ b/scripts/kas_alias/duplicates_list.h
> > @@ -0,0 +1,15 @@
> > +/* SPDX-License-Identifier: GPL-2.0-or-later */
> > +#ifndef DUPLICATES_LIST_H
> > +#define DUPLICATES_LIST_H
> > +
> > +#include "item_list.h"
> > +
> > +struct duplicate_item {
> > +     struct item *original_item;
> > +     struct duplicate_item *next;
> > +};
> > +
> > +struct duplicate_item *find_duplicates(struct item *list);
> > +void free_duplicates(struct duplicate_item **duplicates);
> > +
> > +#endif
> > diff --git a/scripts/kas_alias/item_list.c b/scripts/kas_alias/item_list.c
> > new file mode 100644
> > index 000000000000..48f2e525592a
> > --- /dev/null
> > +++ b/scripts/kas_alias/item_list.c
> > @@ -0,0 +1,230 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <stdint.h>
> > +#include <string.h>
> > +#include <stdbool.h>
> > +#include <assert.h>
> > +#include "item_list.h"
> > +
> > +#define CHECK_ORDER_BY_ADDRESS(sort_by, current, temp, op) \
> > +     ((sort_by) == BY_ADDRESS && (current)->addr op (temp)->addr)
> > +#define CHECK_ORDER_BY_NAME(sort_by, current, temp, op) \
> > +     ((sort_by) == BY_NAME && strcmp((current)->symb_name, (temp)-
> >symb_name)
> > op 0) +
> > +struct item *list_index[96] = {0};
> > +
> > +void build_index(struct item *list)
> > +{
> > +     char current_first_letter = ' ';
> > +     struct item *current = list;
> > +
> > +     while (current) {
> > +             if (current->symb_name[0] != current_first_letter) {
> > +                     current_first_letter = current->symb_name[0];
> > +                     list_index[current_first_letter - 32] = current;
>
> > +             }
> > +             current = current->next;
> > +     }
> > +}
> > +
> > +struct item *add_item(struct item **list, const char *name, char stype,
> > uint64_t addr) +{
> > +     struct item *new_item;
> > +     struct item *current;
> > +
> > +     new_item = malloc(sizeof(struct item));
> > +     if (!new_item)
> > +             return NULL;
> > +
> > +     strncpy(new_item->symb_name, name, MAX_NAME_SIZE);
> > +     new_item->symb_name[MAX_NAME_SIZE - 1] = '\0';
> > +     new_item->addr = addr;
> > +     new_item->stype = stype;
> > +     new_item->next = NULL;
> > +
> > +     if (!(*list)) {
> > +             *list = new_item;
> > +     } else {
> > +             current = *list;
> > +             while (current->next)
> > +                     current = current->next;
> > +
> > +             current->next = new_item;
> > +     }
> > +     return new_item;
> > +}
> > +
> > +void sort_list(struct item **list, int sort_by)
> > +{
> > +     struct item *current = *list;
> > +     struct item *sorted = NULL;
> > +     struct item *next_item;
> > +     struct item *temp;
> > +
> > +     if (!(*list) || !((*list)->next))
> > +             return;
> > +
> > +     while (current) {
> > +             next_item = current->next;
> > +             if (!sorted ||
> > +                 (CHECK_ORDER_BY_ADDRESS(sort_by, current, sorted, <) ||
> > +                 CHECK_ORDER_BY_NAME(sort_by, current, sorted, >=))) {
> > +                     current->next = sorted;
> > +                     sorted = current;
> > +             } else {
> > +                     temp = sorted;
> > +                     while (temp->next &&
> > +                            (CHECK_ORDER_BY_ADDRESS(sort_by, current, temp-
> >next, >=) ||
> > +                            CHECK_ORDER_BY_NAME(sort_by, current, temp->next,
> >=)))
> > +                             temp = temp->next;
> > +
> > +                     current->next = temp->next;
> > +                     temp->next = current;
> > +             }
> > +             current = next_item;
> > +     }
> > +
> > +     *list = sorted;
> > +}
> > +
> > +struct item *merge(struct item *left, struct item *right, int sort_by)
>
> > +{
> > +     struct item *current = NULL;
> > +     struct item *result = NULL;
> > +
> > +     if (!left)
> > +             return right;
> > +     if (!right)
> > +             return left;
> > +
> > +     if (sort_by == BY_NAME) {
> > +             if (strcmp(left->symb_name, right->symb_name) <= 0) {
> > +                     result = left;
> > +                     left = left->next;
> > +             } else {
> > +                     result = right;
> > +                     right = right->next;
> > +             }
> > +     } else {
> > +             if (sort_by == BY_ADDRESS) {
> > +                     if (left->addr <= right->addr) {
> > +                             result = left;
> > +                             left = left->next;
> > +                     } else {
> > +                             result = right;
> > +                             right = right->next;
> > +                     }
> > +             }
> > +     }
> > +
> > +     current = result;
> > +
> > +     while (left && right) {
> > +             if (sort_by == BY_NAME) {
> > +                     if (strcmp(left->symb_name, right->symb_name) <= 0) {
> > +                             current->next = left;
> > +                             left = left->next;
> > +                     } else {
> > +                             current->next = right;
> > +                             right = right->next;
> > +                     }
> > +             } else {
> > +                     if (sort_by == BY_ADDRESS) {
> > +                             if (left->addr <= right->addr) {
> > +                                     current->next = left;
> > +                                     left = left->next;
> > +                             } else {
> > +                                     current->next = right;
> > +                                     right = right->next;
> > +                             }
> > +                     }
> > +             }
> > +
> > +             current = current->next;
> > +     }
> > +
> > +     if (left) {
> > +             current->next = left;
> > +     } else {
> > +             if (right)
>
> > +                     current->next = right;
> > +     }
> > +
> > +     return result;
> > +}
> > +
> > +struct item *merge_sort(struct item *head, int sort_by)
> > +{
> > +     struct item *right;
> > +     struct item *slow;
> > +     struct item *fast;
> > +     struct item *left;
> > +
> > +     if (!head || !head->next)
> > +             return head;
> > +
> > +     slow = head;
> > +     fast = head->next;
> > +
> > +     while (fast && fast->next) {
> > +             slow = slow->next;
> > +             fast = fast->next->next;
> > +     }
> > +
> > +     left = head;
> > +     right = slow->next;
> > +     slow->next = NULL;
> > +
> > +     left = merge_sort(left, sort_by);
> > +     right = merge_sort(right, sort_by);
> > +
> > +     return merge(left, right, sort_by);
> > +}
> > +
> > +void sort_list_m(struct item **head, int sort_by)
> > +{
> > +     if (!(*head) || !((*head)->next))
> > +             return;
> > +
> > +     *head = merge_sort(*head, sort_by);
> > +}
> > +
> > +int insert_after(struct item *list, const uint64_t search_addr,
> > +              const char *name, uint64_t addr, char stype)
> > +{
> > +     struct item *new_item;
> > +     struct item *current;
> > +     int ret = 0;
> > +
> > +     current = (list_index[name[0] - 32]) ? list_index[name[0] - 32] :
> list;
> > +     while (current) {
> > +             if (current->addr == search_addr) {
> > +                     new_item = malloc(sizeof(struct item));
> > +                     if (!new_item)
> > +                             return ret;
> > +                     strncpy(new_item->symb_name, name, MAX_NAME_SIZE);
> > +                     new_item->symb_name[MAX_NAME_SIZE - 1] = '\0';
> > +                     new_item->addr = addr;
> > +                     new_item->stype = stype;
> > +                     new_item->next = current->next;
> > +                     current->next = new_item;
> > +                     ret = 1;
> > +                     break;
>
> > +             }
> > +             current = current->next;
> > +     }
> > +     return ret;
> > +}
> > +
> > +void free_items(struct item **head)
> > +{
> > +     struct item *app, *item_iterator = *head;
> > +
> > +     while (item_iterator) {
> > +             app = item_iterator;
> > +             item_iterator = item_iterator->next;
> > +             free(app);
> > +     }
> > +     *head = NULL;
> > +}
> > diff --git a/scripts/kas_alias/item_list.h b/scripts/kas_alias/item_list.h
> > new file mode 100644
> > index 000000000000..b4891cb088ee
> > --- /dev/null
> > +++ b/scripts/kas_alias/item_list.h
> > @@ -0,0 +1,26 @@
> > +/* SPDX-License-Identifier: GPL-2.0-or-later */
> > +#ifndef ITEM_LIST_H
> > +#define ITEM_LIST_H
> > +#include <stdint.h>
> > +
> > +#define MAX_NAME_SIZE 256
> > +#define BY_ADDRESS 1
> > +#define BY_NAME 2
> > +
> > +struct item {
> > +     char            symb_name[MAX_NAME_SIZE];
> > +     uint64_t        addr;
> > +     char            stype;
> > +     struct item     *next;
> > +};
> > +
> > +void build_index(struct item *list);
> > +struct item *add_item(struct item **list, const char *name, char stype,
> > uint64_t addr); +void sort_list(struct item **list, int sort_by);
> > +struct item *merge(struct item *left, struct item *right, int sort_by);
> > +struct item *merge_sort(struct item *head, int sort_by);
> > +void sort_list_m(struct item **head, int sort_by);
> > +int insert_after(struct item *list, const uint64_t search_addr,
> > +              const char *name, uint64_t addr, char stype);
> > +void free_items(struct item **head);
> > +#endif
> > diff --git a/scripts/kas_alias/kas_alias.c b/scripts/kas_alias/kas_alias.c
> > new file mode 100644
> > index 000000000000..532aeb39f851
> > --- /dev/null
> > +++ b/scripts/kas_alias/kas_alias.c
> > @@ -0,0 +1,217 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <stdint.h>
> > +#include <unistd.h>
> > +#include <string.h>
> > +#include <stdbool.h>
> > +#include <stdarg.h>
> > +#include <regex.h>
> > +
> > +#include "item_list.h"
> > +#include "duplicates_list.h"
> > +#include "a2l.h"
> > +
> > +#define SYMB_IS_TEXT(s) ((((s)->stype) == 't') ||  (((s)->stype) == 'T'))
> > +#define SYMB_IS_DATA(s) ((((s)->stype) == 'b') ||  (((s)->stype) == 'B') ||
> > \ +                    (((s)->stype) == 'd') ||  (((s)->stype) == 'D') || \
> > +                      (((s)->stype) == 'r') ||  (((s)->stype) == 'R'))
> > +#ifdef CONFIG_KALLSYMS_ALIAS_DATA
> > +#define SYMB_NEEDS_ALIAS(s) (SYMB_IS_TEXT(s) || SYMB_IS_DATA(s))
> > +#else
> > +#define SYMB_NEEDS_ALIAS(s) SYMB_IS_TEXT(s)
> > +#endif
> > +#define FNOMATCH 0
> > +#define FMATCH 1
> > +#define EREGEX 2
> > +
> > +const char *ignore_regex[] = {
> > +     "^__cfi_.*$",                           // __cfi_ preamble
> > +#ifndef CONFIG_KALLSYMS_ALIAS_DATA_ALL
> > +     "^_*TRACE_SYSTEM.*$",
> > +     "^__already_done\\.[0-9]+$",            // Call a function once data
> > +     "^___tp_str\\.[0-9]+$",
> > +     "^___done\\.[0-9]+$",
> > +     "^__print_once\\.[0-9]+$",
> > +     "^_rs\\.[0-9]+$",
> > +     "^__compound_literal\\.[0-9]+$",
> > +     "^___once_key\\.[0-9]+$",
> > +     "^__func__\\.[0-9]+$",
> > +     "^__msg\\.[0-9]+$",
> > +     "^CSWTCH\\.[0-9]+$",
> > +     "^__flags\\.[0-9]+$",
> > +     "^__wkey.*$",
> > +     "^__mkey.*$",
> > +     "^__key.*$",
> > +#endif
> > +     "^__pfx_.*$"                            // NOP-padding
> > +};
> > +
> > +int suffix_serial;
> > +
> > +static inline void verbose_msg(bool verbose, const char *fmt, ...)
> > +{
> > +     va_list args;
> > +
> > +     va_start(args, fmt);
> > +     if (verbose)
> > +             printf(fmt, args);
> > +
> > +     va_end(args);
> > +}
> > +
> > +static void create_suffix(const char *name, char *output_suffix)
> > +{
> > +     sprintf(output_suffix, "%s__alias__%d", name, suffix_serial++);
> > +}
> > +
> > +static void create_file_suffix(const char *name, uint64_t address, char
> > *output_suffix, char *cwd) +{
> > +     const char *f_path;
> > +     char *buf;
> > +     int i = 0;
> > +
> > +     buf = addr2line_get_lines(address);
> > +     f_path = remove_subdir(cwd, buf);
> > +     if (f_path) {
> > +             sprintf(output_suffix, "%s@%s", name, f_path);
> > +             while (*(output_suffix + i) != '\0') {
> > +                     switch (*(output_suffix + i)) {
> > +                     case '/':
> > +                     case ':':
> > +                     case '.':
> > +                             *(output_suffix + i) = '_';
> > +                             break;
> > +                     default:
> > +                     }
> > +             i++;
> > +             }
> > +     } else {
> > +             create_suffix(name, output_suffix);
> > +     }
> > +}
> > +
> > +static int filter_symbols(char *symbol, const char **ignore_list, int
> > regex_no) +{
> > +     regex_t regex;
> > +     int res, i;
> > +
> > +     for (i = 0; i < regex_no; i++) {
> > +             res = regcomp(&regex, ignore_list[i], REG_EXTENDED);
> > +             if (res)
> > +                     return -EREGEX;
> > +
> > +             res = regexec(&regex, symbol, 0, NULL, 0);
> > +             regfree(&regex);
> > +             switch (res) {
> > +             case 0:
> > +                     return FMATCH;
> > +             case REG_NOMATCH:
> > +                     break;
> > +             default:
> > +                     return -EREGEX;
> > +             }
> > +     }
> > +
> > +     return FNOMATCH;
> > +}
> > +
> > +int main(int argc, char *argv[])
> > +{
> > +     char t, sym_name[MAX_NAME_SIZE], new_name[MAX_NAME_SIZE + 15];
> > +     struct duplicate_item  *duplicate_iterator;
> > +     struct duplicate_item *duplicate;
> > +     struct item *head = {NULL};
> > +     bool need_2_process = true;
> > +     struct item *last = {NULL};
> > +     struct item  *current;
> > +     int verbose_mode = 0;
> > +     uint64_t address;
> > +     FILE *fp;
> > +     int res;
> > +
> > +     if (argc < 2 || argc > 3) {
> > +             printf("Usage: %s <nmfile> [-verbose]\n", argv[0]);
> > +             return 1;
> > +     }
> > +
> > +     if (argc == 3 && strcmp(argv[2], "-verbose") == 0)
> > +             verbose_mode = 1;
> > +
> > +     verbose_msg(verbose_mode, "Scanning nm data(%s)\n", argv[1]);
> > +
> > +     fp = fopen(argv[1], "r");
> > +     if (!fp) {
> > +             printf("Can't open input file.\n");
> > +             return 1;
> > +     }
> > +
> > +     if (!addr2line_init(get_addr2line(A2L_DEFAULT), get_vmlinux(argv[1])))
> > +             return 1;
> > +
> > +     while (fscanf(fp, "%lx %c %99s\n", &address, &t, sym_name) == 3) {
> > +             if (strstr(sym_name, "@_")) {
> > +                     if (verbose_mode && need_2_process)
> > +                             printf("Already processed\n");
> > +                     need_2_process = false;
> > +                     }
> > +             last = add_item(&last, sym_name, t, address);
> > +             if (!last) {
> > +                     printf("Error in allocate memory\n");
> > +                     free_items(&head);
> > +                     return 1;
> > +             }
> > +
> > +             if (!head)
> > +                     head = last;
> > +     }
> > +
> > +     fclose(fp);
> > +
> > +     if (need_2_process) {
> > +             verbose_msg(verbose_mode, "Sorting nm data\n");
> > +             sort_list_m(&head, BY_NAME);
> > +             verbose_msg(verbose_mode, "Scanning nm data for duplicates\n");
> > +             duplicate = find_duplicates(head);
> > +             if (!duplicate) {
> > +                     printf("Error in duplicates list\n");
> > +                     return 1;
> > +             }
> > +
> > +             verbose_msg(verbose_mode, "Applying suffixes\n");
> > +             build_index(head);
> > +             duplicate_iterator = duplicate;
> > +             while (duplicate_iterator) {
> > +                     res = filter_symbols(duplicate_iterator->original_item-
> >symb_name,
> > +                                          ignore_regex, sizeof(ignore_regex) /
> > +                                          sizeof(ignore_regex[0]));
> > +                     if (res != FMATCH &&
> > +                         SYMB_NEEDS_ALIAS(duplicate_iterator->original_item))
> {
> > +                             if (res < 0)
> > +                                     return 1;
> > +
> > +                             create_file_suffix(duplicate_iterator-
> >original_item->symb_name,
> > +                                                duplicate_iterator-
> >original_item->addr,
> > +                                                new_name, vmlinux_path);
> > +                             if (!insert_after(head, duplicate_iterator-
> >original_item->addr,
> > +                                               new_name, duplicate_iterator-
> >original_item->addr,
> > +                                               duplicate_iterator-
> >original_item->stype))
> > +                                     return 1;
> > +                     }
> > +
> > +                     duplicate_iterator = duplicate_iterator->next;
> > +             }
> > +
> > +             sort_list_m(&head, BY_ADDRESS);
> > +     }
> > +     current = head;
> > +     while (current) {
> > +             printf("%08lx %c %s\n", current->addr, current->stype,
> > current->symb_name); +                current = current->next;
> > +     }
> > +
> > +     free_items(&head);
> > +     free_duplicates(&duplicate);
> > +     addr2line_cleanup();
> > +     return 0;
> > +}
> > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> > index a432b171be82..cacf60b597ce 100755
> > --- a/scripts/link-vmlinux.sh
> > +++ b/scripts/link-vmlinux.sh
> > @@ -89,8 +89,9 @@ vmlinux_link()
> >
> >       ldflags="${ldflags} ${wl}--script=${objtree}/${KBUILD_LDS}"
> >
> > -     # The kallsyms linking does not need debug symbols included.
> > -     if [ "$output" != "${output#.tmp_vmlinux.kallsyms}" ] ; then
> > +     # The kallsyms linking does not need debug symbols included, unless
> the
> > KALLSYMS_ALIAS. +     if [ ! is_enabled CONFIG_KALLSYMS_ALIAS ] && \
> > +         [ "$output" != "${output#.tmp_vmlinux.kallsyms}" ] ; then
> >               ldflags="${ldflags} ${wl}--strip-debug"
> >       fi
> >
> > @@ -161,7 +162,11 @@ kallsyms()
> >       fi
> >
> >       info KSYMS ${2}
> > -     scripts/kallsyms ${kallsymopt} ${1} > ${2}
> > +     if is_enabled CONFIG_KALLSYMS_ALIAS; then
> > +             ALIAS=".alias"
> > +             scripts/kas_alias/kas_alias ${1} >${1}${ALIAS}
> > +             fi
> > +     scripts/kallsyms ${kallsymopt} ${1}${ALIAS} > ${2}
> >  }
> >
> >  # Perform one step in kallsyms generation, including temporary linking of
>
> Best regards.
>
>
Cheers

^ permalink raw reply

* Re: [PATCH v3] scripts/link-vmlinux.sh: Add alias to duplicate symbols for kallsyms
From: Alessandro Carminati @ 2023-09-02  7:26 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Steven Rostedt, Daniel Bristot de Oliveira, Josh Poimboeuf,
	Masahiro Yamada, Luis Chamberlain, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Alexander Lobakin, Nick Alcock,
	Kris Van Hees, Eugene Loh, Francis Laniel, Viktor Malik,
	linux-kbuild, linux-kernel, linux-trace-kernel, live-patching
In-Reply-To: <20230830150027.57bf46382eef3206c78de6d2@kernel.org>

Hey Masami,
Thanks a lot for your review and for your time.

Il giorno mer 30 ago 2023 alle ore 08:00 Masami Hiramatsu
<mhiramat@kernel.org> ha scritto:
>
> Hi Alessandro,
>
> Thanks for your update!
>
> On Mon, 28 Aug 2023 08:04:23 +0000
> "Alessandro Carminati (Red Hat)" <alessandro.carminati@gmail.com> wrote:
>
> > From: Alessandro Carminati <alessandro.carminati@gmail.com>
> >
> > It is not uncommon for drivers or modules related to similar peripherals
> > to have symbols with the exact same name.
> > While this is not a problem for the kernel's binary itself, it becomes an
> > issue when attempting to trace or probe specific functions using
> > infrastructure like ftrace or kprobe.
> >
> > The tracing subsystem relies on the `nm -n vmlinux` output, which provides
> > symbol information from the kernel's ELF binary. However, when multiple
> > symbols share the same name, the standard nm output does not differentiate
> > between them. This can lead to confusion and difficulty when trying to
> > probe the intended symbol.
> >
> >  ~ # cat /proc/kallsyms | grep " name_show"
> >  ffffffff8c4f76d0 t name_show
> >  ffffffff8c9cccb0 t name_show
> >  ffffffff8cb0ac20 t name_show
> >  ffffffff8cc728c0 t name_show
> >  ffffffff8ce0efd0 t name_show
> >  ffffffff8ce126c0 t name_show
> >  ffffffff8ce1dd20 t name_show
> >  ffffffff8ce24e70 t name_show
> >  ffffffff8d1104c0 t name_show
> >  ffffffff8d1fe480 t name_show
> >
> > **kas_alias** addresses this challenge by extending the symbol names with
> > unique suffixes during the kernel build process.
> > The newly created aliases for these duplicated symbols are unique names
> > that can be fed to the ftracefs interface. By doing so, it enables
> > previously unreachable symbols to be probed.
> >
> >  ~ # cat /proc/kallsyms | grep " name_show"
> >  ffffffff974f76d0 t name_show
> >  ffffffff974f76d0 t name_show__alias__6340
> >  ffffffff979cccb0 t name_show
> >  ffffffff979cccb0 t name_show__alias__6341
> >  ffffffff97b0ac20 t name_show
> >  ffffffff97b0ac20 t name_show__alias__6342
> >  ffffffff97c728c0 t name_show
> >  ffffffff97c728c0 t name_show__alias__6343
> >  ffffffff97e0efd0 t name_show
> >  ffffffff97e0efd0 t name_show__alias__6344
> >  ffffffff97e126c0 t name_show
> >  ffffffff97e126c0 t name_show__alias__6345
> >  ffffffff97e1dd20 t name_show
> >  ffffffff97e1dd20 t name_show__alias__6346
> >  ffffffff97e24e70 t name_show
> >  ffffffff97e24e70 t name_show__alias__6347
> >  ffffffff981104c0 t name_show
> >  ffffffff981104c0 t name_show__alias__6348
> >  ffffffff981fe480 t name_show
> >  ffffffff981fe480 t name_show__alias__6349
> >  ~ # echo "p:kprobes/evnt1 name_show__alias__6349" \
> >  > >/sys/kernel/tracing/kprobe_events
> >  ~ # cat /sys/kernel/tracing/kprobe_events
> >  p:kprobes/evnt1 name_show__alias__6349
> >
> > Changes from v1:
> > - Integrated changes requested by Masami to exclude symbols with prefixes
> >   "_cfi" and "_pfx".
> > - Introduced a small framework to handle patterns that need to be excluded
> >   from the alias production.
> > - Excluded other symbols using the framework.
> > - Introduced the ability to discriminate between text and data symbols.
> > - Added two new config symbols in this version: CONFIG_KALLSYMS_ALIAS_DATA,
> >   which allows data for data, and CONFIG_KALLSYMS_ALIAS_DATA_ALL, which
> >   excludes all filters and provides an alias for each duplicated symbol.
> >
> > https://lore.kernel.org/all/20230711151925.1092080-1-alessandro.carminati@gmail.com/
> >
> > Changes from v2:
> > - Alias tags are created by querying DWARF information from the vmlinux.
> > - The filename + line number is normalized and appended to the original name.
> > - The tag begins with '@' to indicate the symbol source.
> > - Not a change, but worth mentioning, since the alias is added to the existing
> >   list, the old duplicated name is preserved, and the livepatch way of dealing
> >   with duplicates is maintained.
> > - Acknowledging the existence of scenarios where inlined functions declared in
> >   header files may result in multiple copies due to compiler behavior, though
> >    it is not actionable as it does not pose an operational issue.
> > - Highlighting a single exception where the same name refers to different
> >   functions: the case of "compat_binfmt_elf.c," which directly includes
> >   "binfmt_elf.c" producing identical function copies in two separate
> >   modules.
> >
> > sample from new v3
> >
> >  ~ # cat /proc/kallsyms | grep gic_mask_irq
> >  ffffd0b03c04dae4 t gic_mask_irq
> >  ffffd0b03c04dae4 t gic_mask_irq@_drivers_irqchip_irq-gic_c_167
> >  ffffd0b03c050960 t gic_mask_irq
> >  ffffd0b03c050960 t gic_mask_irq@_drivers_irqchip_irq-gic-v3_c_404
>
> Could you also update the sample output in the patch description?
> And I can not see this line information but a serial number aliases,
> is there any way to enable this file-name aliases? (LLVM is not supported?)
> I think we'd better to have a new CONFIG_KALLSYMS_ALIAS_SRCLINE and
> clarify what is required.
In this version of the patch, there's still a fallback action that
produces the old "__alias__<seqnum>" if the addr2line action fails.
In the v3 patch, addr2line is default to the native option.
The CROSS_COMPILE setting remains a manual selection. While in my tests,
I didn't encounter any issues, and the native addr2line worked well on
all the architectures I tested, there is still a possibility of failure
leading to the fallback action. Somthing I will change in the v4.

Additionally, if CONFIG_KALLSYMS_ALIAS_DATA is set to 'Y,' data names
included in the KALLSYMS receive an old-style alias.
Regarding this matter, I'd like to discuss with you the relevance of
CONFIG_KALLSYMS_ALIAS_DATA and CONFIG_KALLSYMS_ALIAS_DATA_ALL.
Both of these settings aim to provide an alias for variable names as
well. In the past, I had to "livepatch" a function for an architecture
that had no support, and I thought it would be beneficial to have
unique variable names as well. What are your thoughts on this?
>
> >  ~ #
> >
> > https://lore.kernel.org/all/20230714150326.1152359-1-alessandro.carminati@gmail.com/
> >
> > Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati@gmail.com>
> > ---
> >  init/Kconfig                        |  36 ++++
> >  scripts/Makefile                    |   4 +
> >  scripts/kas_alias/Makefile          |   4 +
> >  scripts/kas_alias/a2l.c             | 268 ++++++++++++++++++++++++++++
> >  scripts/kas_alias/a2l.h             |  32 ++++
> >  scripts/kas_alias/duplicates_list.c |  70 ++++++++
> >  scripts/kas_alias/duplicates_list.h |  15 ++
> >  scripts/kas_alias/item_list.c       | 230 ++++++++++++++++++++++++
> >  scripts/kas_alias/item_list.h       |  26 +++
> >  scripts/kas_alias/kas_alias.c       | 217 ++++++++++++++++++++++
> >  scripts/link-vmlinux.sh             |  11 +-
> >  11 files changed, 910 insertions(+), 3 deletions(-)
> >  create mode 100644 scripts/kas_alias/Makefile
> >  create mode 100644 scripts/kas_alias/a2l.c
> >  create mode 100644 scripts/kas_alias/a2l.h
> >  create mode 100644 scripts/kas_alias/duplicates_list.c
> >  create mode 100644 scripts/kas_alias/duplicates_list.h
> >  create mode 100644 scripts/kas_alias/item_list.c
> >  create mode 100644 scripts/kas_alias/item_list.h
> >  create mode 100644 scripts/kas_alias/kas_alias.c
> >
> > diff --git a/init/Kconfig b/init/Kconfig
> > index f7f65af4ee12..bc69fcd9cbc8 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -1737,6 +1737,42 @@ config KALLSYMS_BASE_RELATIVE
> >         time constants, and no relocation pass is required at runtime to fix
> >         up the entries based on the runtime load address of the kernel.
> >
> > +config KALLSYMS_ALIAS
> > +     bool "Produces alias for duplicated symbols" if EXPERT
> > +     depends on KALLSYMS && (DEBUG_INFO_DWARF4 || DEBUG_INFO_DWARF5)
> > +     help
> > +       It is not uncommon for drivers or modules related to similar
> > +       peripherals to have symbols with the exact same name.
> > +       While this is not a problem for the kernel's binary itself, it
> > +       becomes an issue when attempting to trace or probe specific
> > +       functions using infrastructure like ftrace or kprobe.
> > +
> > +       This option addresses this challenge by extending the symbol names
> > +       with unique suffixes during the kernel build process.
> > +       The newly created aliases for these duplicated symbols are unique
> > +       names that can be fed to the ftrace sysfs interface. By doing so, it
> > +       enables previously unreachable symbols to be probed.
> > +
> > +config CONFIG_KALLSYMS_ALIAS_DATA
>
> You don't need 'CONFIG_' prefix here.
>
> > +     bool "Produces alias also for data"
> > +     depends on KALLSYMS_ALIAS
> > +     help
> > +       Sometimes it can be useful to refer to data. In live patch scenarios,
> > +       you may find yourself needing to use symbols that are shared with
> > +       other functions. Since symbols face the same issue as functions, this
> > +       option allows you to create aliases for data as well.
> > +
> > +config CONFIG_KALLSYMS_ALIAS_DATA_ALL
> > +     bool "Removes all filter when producing data alias"
> > +     depends on CONFIG_KALLSYMS_ALIAS_DATA
> > +     help
> > +       When selecting data aliases, not all symbols are included in the set
> > +       This is because many symbols are unlikely to be used. If you choose
> > +       to have an alias for all data symbols, be aware that it will
> > +       significantly increase the size.
> > +
> > +       If unsure, say N.
> > +
> >  # end of the "standard kernel features (expert users)" menu
> >
> >  # syscall, maps, verifier
> > diff --git a/scripts/Makefile b/scripts/Makefile
> > index 32b6ba722728..65fafe17cfe5 100644
> > --- a/scripts/Makefile
> > +++ b/scripts/Makefile
> > @@ -49,3 +49,7 @@ subdir-$(CONFIG_SECURITY_SELINUX) += selinux
> >
> >  # Let clean descend into subdirs
> >  subdir-      += basic dtc gdb kconfig mod
> > +
> > +# KALLSyms alias
> > +subdir-$(CONFIG_KALLSYMS_ALIAS) += kas_alias
> > +
> > diff --git a/scripts/kas_alias/Makefile b/scripts/kas_alias/Makefile
> > new file mode 100644
> > index 000000000000..e1fde69232b4
> > --- /dev/null
> > +++ b/scripts/kas_alias/Makefile
> > @@ -0,0 +1,4 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +hostprogs-always-$(CONFIG_KALLSYMS_ALIAS)    += kas_alias
> > +
> > +kas_alias-objs        := duplicates_list.o item_list.o kas_alias.o a2l.o
> > diff --git a/scripts/kas_alias/a2l.c b/scripts/kas_alias/a2l.c
> > new file mode 100644
> > index 000000000000..a9692ac30180
> > --- /dev/null
> > +++ b/scripts/kas_alias/a2l.c
> > @@ -0,0 +1,268 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <string.h>
> > +#include <unistd.h>
> > +#include <sys/types.h>
> > +#include <sys/wait.h>
> > +#include <string.h>
> > +#include <stdint.h>
> > +#include <stdbool.h>
> > +
> > +#include "a2l.h"
> > +
> > +int addr2line_pid = -1;
> > +int a2l_in[2];
> > +int a2l_out[2];
> > +char line[MAX_BUF];
> > +char vmlinux_path[MAX_BUF];
> > +char addr2line_cmd[MAX_CMD_LEN];
> > +FILE *a2l_stdin, *a2l_stdout;
> > +
> > +static char *normalize_path(const char *input_path, char *output_path)
> > +{
> > +     char *prev_token = NULL;
> > +     char *delimiter = "/";
> > +     char inbuf[MAX_BUF];
> > +     char *token;
> > +     char *pos;
> > +
> > +     memset(inbuf, 0, MAX_BUF);
> > +     *output_path = '\0';
> > +     strncpy(inbuf, input_path, MAX_BUF);
> > +     if (!input_path || !output_path || strlen(input_path) == 0)
> > +             return NULL;
> > +
> > +     token = strtok(inbuf, delimiter);
> > +     while (token) {
> > +             if (strcmp(token, "..") == 0 && prev_token) {
> > +                     pos = strrchr(output_path, '/');
> > +                     if (pos)
> > +                             *pos = '\0';
> > +
> > +             } else if (strcmp(token, ".") != 0) {
> > +                     strcat(output_path, "/");
> > +                     strcat(output_path, token);
> > +             }
> > +
> > +             prev_token = token;
> > +             token = strtok(NULL, delimiter);
> > +     }
> > +
> > +     return output_path;
> > +}
> > +
> > +static void path_of(const char *full_path, char *path)
> > +{
> > +     const char *last_slash = strrchr(full_path, '/');
> > +     size_t path_length;
> > +     char cwd[MAX_BUF];
> > +
> > +     if (!last_slash) {
> > +             if (getcwd(cwd, sizeof(cwd)))
> > +                     strcpy(path, cwd);
> > +             else
> > +                     strcpy(path, ".");
> > +     } else {
> > +             path_length = last_slash - full_path;
> > +             strncpy(path, full_path, path_length);
> > +             path[path_length] = '\0';
> > +     }
> > +}
> > +
> > +static bool file_exists(const char *file_path)
> > +{
> > +     FILE *file;
> > +
> > +     file = fopen(file_path, "r");
> > +     if (file) {
> > +             fclose(file);
> > +             return true;
> > +     }
> > +     return false;
> > +}
> > +
> > +int addr2line_init(const char *cmd, const char *vmlinux)
> > +{
> > +     if ((!file_exists(cmd)) || (!file_exists(vmlinux))) {
> > +             printf("file not found\n");
> > +             return 0;
> > +             }
>
> nit: mis-indentation?
>
> > +
> > +     path_of(vmlinux, vmlinux_path);
> > +     if (pipe(a2l_in) == -1) {
> > +             printf("Failed to create pipe\n");
> > +             return 0;
> > +     }
> > +
> > +     if (pipe(a2l_out) == -1) {
> > +             printf("Failed to create pipe\n");
> > +             return 0;
> > +     }
> > +
> > +     addr2line_pid = fork();
> > +     if (addr2line_pid == -1) {
> > +             printf("Failed to fork process\n");
> > +             close(a2l_in[P_READ]);
> > +             close(a2l_in[P_WRITE]);
> > +             close(a2l_out[P_READ]);
> > +             close(a2l_out[P_WRITE]);
> > +             return 0;
> > +     }
> > +
> > +     if (addr2line_pid == 0) {
> > +             dup2(a2l_in[P_READ], 0);
> > +             dup2(a2l_out[P_WRITE], 1);
> > +             close(a2l_in[P_WRITE]);
> > +             close(a2l_out[P_READ]);
> > +
> > +             execlp(cmd, cmd, ADDR2LINE_ARGS, vmlinux, NULL);
> > +
> > +             printf("Failed to execute addr2line command\n");
> > +             exit(1);
> > +     } else {
> > +             close(a2l_in[P_READ]);
> > +             close(a2l_out[P_WRITE]);
> > +     }
> > +
> > +     a2l_stdin = fdopen(a2l_in[P_WRITE], "w");
> > +     if (!a2l_stdin) {
> > +             printf("Failed to open pipe a2l_in\n");
> > +             return 0;
> > +     }
> > +
> > +     a2l_stdout = fdopen(a2l_out[P_READ], "r");
> > +     if (!a2l_stdout) {
> > +             printf("Failed to open pipe a2l_out\n");
> > +             fclose(a2l_stdin);
> > +             return 0;
> > +     }
> > +
> > +     return 1;
> > +}
> > +
> > +const char *remove_subdir(const char *home, const char *f_path)
> > +{
> > +     int i = 0;
> > +
> > +     while (*(home + i) == *(f_path + i))
> > +             i++;
> > +
> > +     return (strlen(home) != i) ? NULL : f_path + i;
> > +}
> > +
> > +char *addr2line_get_lines(uint64_t address)
> > +{
> > +     char buf[MAX_BUF];
> > +
> > +     fprintf(a2l_stdin, "%08lx\n", address);
> > +     fflush(a2l_stdin);
> > +
> > +     if (!fgets(line, sizeof(line), a2l_stdout)) {
> > +             printf("Failed to read lines from addr2line\n");
> > +             return NULL;
> > +     }
> > +
> > +     if (!fgets(line, sizeof(line), a2l_stdout)) {
> > +             printf("Failed to read lines from addr2line\n");
> > +             return NULL;
> > +     }
> > +
> > +     line[strcspn(line, "\n")] = '\0';
> > +     strncpy(buf, line, MAX_BUF);
> > +     return normalize_path(buf, line);
> > +}
> > +
> > +int addr2line_cleanup(void)
> > +{
> > +     int status;
> > +
> > +     if (addr2line_pid != -1) {
> > +             kill(addr2line_pid, SIGKILL);
> > +             waitpid(addr2line_pid, &status, 0);
> > +             fclose(a2l_stdin);
> > +             fclose(a2l_stdout);
> > +             addr2line_pid = -1;
> > +     }
> > +
> > +     return 1;
> > +}
> > +
> > +static char *find_executable(const char *command)
> > +{
> > +     char *path_env = getenv("PATH");
> > +     char *executable_path;
> > +     char *path_copy;
> > +     char *path;
> > +     int n;
> > +
> > +     if (!path_env)
> > +             return NULL;
> > +
> > +     path_copy = strdup(path_env);
> > +     if (!path_copy)
> > +             return NULL;
> > +
> > +     path = strtok(path_copy, ":");
> > +     while (path) {
> > +             n = snprintf(0, 0, "%s/%s", path, command);
> > +             executable_path = (char *)malloc(n + 1);
> > +             snprintf(executable_path, n + 1, "%s/%s", path, command);
> > +             if (access(executable_path, X_OK) == 0) {
> > +                     free(path_copy);
> > +                     return executable_path;
> > +             }
> > +
> > +     path = strtok(NULL, ":");
> > +     free(executable_path);
> > +     executable_path = NULL;
>
> nit: mis-indentation?
>
> > +     }
> > +
> > +     free(path_copy);
> > +     if (executable_path)
> > +             free(executable_path);
> > +     return NULL;
> > +}
> > +
> > +const char *get_addr2line(int mode)
> > +{
> > +     char *buf = "";
> > +
> > +     switch (mode) {
> > +     case A2L_CROSS:
> > +             buf = getenv("CROSS_COMPILE");
> > +             memcpy(addr2line_cmd, buf, strlen(buf));
> > +     case A2L_DEFAULT:
> > +             memcpy(addr2line_cmd + strlen(buf), ADDR2LINE, strlen(ADDR2LINE));
> > +             buf = find_executable(addr2line_cmd);
> > +             if (buf) {
> > +                     memcpy(addr2line_cmd, buf, strlen(buf));
> > +                     free(buf);
> > +             }
> > +             return addr2line_cmd;
> > +     case A2L_LLVM:
> > +     default:
> > +             return NULL;
> > +     }
> > +}
> > +
> > +char *get_vmlinux(char *input)
> > +{
> > +     const char *match_string1 = ".syms";
> > +     const char *match_string2 = ".tmp_vmlinux.kallsyms";
> > +     char *result = NULL;
> > +     char *match_pos;
> > +
> > +     match_pos = strstr(input, match_string1);
> > +     if (!match_pos)
> > +             return NULL;
> > +
> > +     match_pos = strstr(input, match_string2);
> > +     if (!match_pos)
> > +             return NULL;
> > +
> > +     result = strdup(input);
> > +     match_pos = strstr(result, match_string1);
> > +     *match_pos = '\0';
> > +     return result;
> > +}
> > diff --git a/scripts/kas_alias/a2l.h b/scripts/kas_alias/a2l.h
> > new file mode 100644
> > index 000000000000..ca6419229dde
> > --- /dev/null
> > +++ b/scripts/kas_alias/a2l.h
> > @@ -0,0 +1,32 @@
> > +/* SPDX-License-Identifier: GPL-2.0-or-later */
> > +#ifndef A2L_H
> > +#define A2L_H
> > +#include <stdint.h>
> > +
> > +#define ADDR2LINE "addr2line"
> > +#define ADDR2LINE_ARGS "-fe"
> > +//#define VMLINUX "vmlinux"
>
> Nit: this should be removed if it is not used.
>
> > +#define MAX_BUF 4096
> > +#define MAX_CMD_LEN 256
> > +#define P_READ 0
> > +#define P_WRITE 1
> > +#define A2L_DEFAULT 1
> > +#define A2L_CROSS 2
> > +#define A2L_LLVM 3
> > +#define A2L_MAKE_VALUE 2
> > +
> > +extern int addr2line_pid;
> > +extern int a2l_in[2];
> > +extern int a2l_out[2];
> > +extern char line[MAX_BUF];
> > +extern char vmlinux_path[MAX_BUF];
> > +extern char addr2line_cmd[MAX_CMD_LEN];
> > +
> > +int addr2line_init(const char *cmd, const char *vmlinux);
> > +char *addr2line_get_lines(uint64_t address);
> > +int addr2line_cleanup(void);
> > +const char *remove_subdir(const char *home, const char *f_path);
> > +const char *get_addr2line(int mode);
> > +char *get_vmlinux(char *input);
> > +
> > +#endif
> > diff --git a/scripts/kas_alias/duplicates_list.c b/scripts/kas_alias/duplicates_list.c
> > new file mode 100644
> > index 000000000000..e7a3d2917937
> > --- /dev/null
> > +++ b/scripts/kas_alias/duplicates_list.c
> > @@ -0,0 +1,70 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +#include <stdint.h>
> > +#include <stdio.h>
> > +#include <string.h>
> > +#include <stdlib.h>
> > +#include <stdbool.h>
> > +
> > +#include "item_list.h"
> > +#include "duplicates_list.h"
> > +
>
> /* The item list must be sorted. */
>
> I think we can find the duplicated symbol without sorting the list,
> but such implementation level optimization will be done later.

I think I will handle this aspect in the v4.

>
> > +struct duplicate_item *find_duplicates(struct item *list)
> > +{
> > +     struct duplicate_item *current_duplicate = NULL;
> > +     struct duplicate_item *duplicates = NULL;
> > +     struct duplicate_item *new_duplicate;
> > +     struct item *current_item = list;
> > +     bool prev_was_duplicate = false;
> > +     struct item *prev_item = NULL;
> > +
> > +     while (current_item) {
> > +             if ((prev_item && (strcmp(current_item->symb_name, prev_item->symb_name) == 0)) ||
> > +                 prev_was_duplicate) {
> > +                     if (!duplicates) {
> > +                             duplicates = malloc(sizeof(struct duplicate_item));
> > +                             if (!duplicates)
> > +                                     return NULL;
> > +
> > +                             duplicates->original_item = prev_item;
> > +                             duplicates->next = NULL;
> > +                             current_duplicate = duplicates;
> > +                     } else {
> > +                             new_duplicate = malloc(sizeof(struct duplicate_item));
> > +                             if (!new_duplicate) {
> > +                                     free_duplicates(&duplicates);
> > +                                     return NULL;
> > +                             }
> > +
> > +                             new_duplicate->original_item = prev_item;
> > +                             new_duplicate->next = NULL;
> > +                             current_duplicate->next = new_duplicate;
> > +                             current_duplicate = new_duplicate;
> > +
> > +                             if ((strcmp(current_item->symb_name, prev_item->symb_name) != 0) &&
> > +                                 (prev_was_duplicate))
> > +                                     prev_was_duplicate = false;
> > +                             else
> > +                                     prev_was_duplicate = true;
> > +                     }
> > +             }
> > +
> > +             prev_item = current_item;
> > +             current_item = current_item->next;
> > +     }
> > +
> > +     return duplicates;
> > +}
> > +
> > +void free_duplicates(struct duplicate_item **duplicates)
> > +{
> > +     struct duplicate_item *duplicates_iterator = *duplicates;
> > +     struct duplicate_item *app;
> > +
> > +     while (duplicates_iterator) {
> > +             app = duplicates_iterator;
> > +             duplicates_iterator = duplicates_iterator->next;
> > +             free(app);
> > +     }
> > +
> > +     *duplicates = NULL;
> > +}
> > diff --git a/scripts/kas_alias/duplicates_list.h b/scripts/kas_alias/duplicates_list.h
> > new file mode 100644
> > index 000000000000..76aa73e584bc
> > --- /dev/null
> > +++ b/scripts/kas_alias/duplicates_list.h
> > @@ -0,0 +1,15 @@
> > +/* SPDX-License-Identifier: GPL-2.0-or-later */
> > +#ifndef DUPLICATES_LIST_H
> > +#define DUPLICATES_LIST_H
> > +
> > +#include "item_list.h"
> > +
> > +struct duplicate_item {
> > +     struct item *original_item;
> > +     struct duplicate_item *next;
> > +};
> > +
> > +struct duplicate_item *find_duplicates(struct item *list);
> > +void free_duplicates(struct duplicate_item **duplicates);
> > +
> > +#endif
> > diff --git a/scripts/kas_alias/item_list.c b/scripts/kas_alias/item_list.c
> > new file mode 100644
> > index 000000000000..48f2e525592a
> > --- /dev/null
> > +++ b/scripts/kas_alias/item_list.c
> > @@ -0,0 +1,230 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <stdint.h>
> > +#include <string.h>
> > +#include <stdbool.h>
> > +#include <assert.h>
> > +#include "item_list.h"
> > +
> > +#define CHECK_ORDER_BY_ADDRESS(sort_by, current, temp, op) \
> > +     ((sort_by) == BY_ADDRESS && (current)->addr op (temp)->addr)
> > +#define CHECK_ORDER_BY_NAME(sort_by, current, temp, op) \
> > +     ((sort_by) == BY_NAME && strcmp((current)->symb_name, (temp)->symb_name) op 0)
> > +
> > +struct item *list_index[96] = {0};
> > +
> > +void build_index(struct item *list)
> > +{
> > +     char current_first_letter = ' ';
> > +     struct item *current = list;
> > +
> > +     while (current) {
> > +             if (current->symb_name[0] != current_first_letter) {
> > +                     current_first_letter = current->symb_name[0];
> > +                     list_index[current_first_letter - 32] = current;
> > +             }
> > +             current = current->next;
> > +     }
> > +}
> > +
> > +struct item *add_item(struct item **list, const char *name, char stype, uint64_t addr)
> > +{
> > +     struct item *new_item;
> > +     struct item *current;
> > +
> > +     new_item = malloc(sizeof(struct item));
> > +     if (!new_item)
> > +             return NULL;
> > +
> > +     strncpy(new_item->symb_name, name, MAX_NAME_SIZE);
> > +     new_item->symb_name[MAX_NAME_SIZE - 1] = '\0';
> > +     new_item->addr = addr;
> > +     new_item->stype = stype;
> > +     new_item->next = NULL;
> > +
> > +     if (!(*list)) {
> > +             *list = new_item;
> > +     } else {
> > +             current = *list;
> > +             while (current->next)
> > +                     current = current->next;
> > +
> > +             current->next = new_item;
> > +     }
> > +     return new_item;
> > +}
> > +
> > +void sort_list(struct item **list, int sort_by)
> > +{
> > +     struct item *current = *list;
> > +     struct item *sorted = NULL;
> > +     struct item *next_item;
> > +     struct item *temp;
> > +
> > +     if (!(*list) || !((*list)->next))
> > +             return;
> > +
> > +     while (current) {
> > +             next_item = current->next;
> > +             if (!sorted ||
> > +                 (CHECK_ORDER_BY_ADDRESS(sort_by, current, sorted, <) ||
> > +                 CHECK_ORDER_BY_NAME(sort_by, current, sorted, >=))) {
> > +                     current->next = sorted;
> > +                     sorted = current;
> > +             } else {
> > +                     temp = sorted;
> > +                     while (temp->next &&
> > +                            (CHECK_ORDER_BY_ADDRESS(sort_by, current, temp->next, >=) ||
> > +                            CHECK_ORDER_BY_NAME(sort_by, current, temp->next, >=)))
> > +                             temp = temp->next;
> > +
> > +                     current->next = temp->next;
> > +                     temp->next = current;
> > +             }
> > +             current = next_item;
> > +     }
> > +
> > +     *list = sorted;
> > +}
> > +
> > +struct item *merge(struct item *left, struct item *right, int sort_by)
> > +{
> > +     struct item *current = NULL;
> > +     struct item *result = NULL;
> > +
> > +     if (!left)
> > +             return right;
> > +     if (!right)
> > +             return left;
> > +
> > +     if (sort_by == BY_NAME) {
> > +             if (strcmp(left->symb_name, right->symb_name) <= 0) {
> > +                     result = left;
> > +                     left = left->next;
> > +             } else {
> > +                     result = right;
> > +                     right = right->next;
> > +             }
> > +     } else {
> > +             if (sort_by == BY_ADDRESS) {
> > +                     if (left->addr <= right->addr) {
> > +                             result = left;
> > +                             left = left->next;
> > +                     } else {
> > +                             result = right;
> > +                             right = right->next;
> > +                     }
> > +             }
> > +     }
> > +
> > +     current = result;
> > +
> > +     while (left && right) {
> > +             if (sort_by == BY_NAME) {
> > +                     if (strcmp(left->symb_name, right->symb_name) <= 0) {
> > +                             current->next = left;
> > +                             left = left->next;
> > +                     } else {
> > +                             current->next = right;
> > +                             right = right->next;
> > +                     }
> > +             } else {
> > +                     if (sort_by == BY_ADDRESS) {
> > +                             if (left->addr <= right->addr) {
> > +                                     current->next = left;
> > +                                     left = left->next;
> > +                             } else {
> > +                                     current->next = right;
> > +                                     right = right->next;
> > +                             }
> > +                     }
> > +             }
> > +
> > +             current = current->next;
> > +     }
> > +
> > +     if (left) {
> > +             current->next = left;
> > +     } else {
> > +             if (right)
> > +                     current->next = right;
> > +     }
> > +
> > +     return result;
> > +}
> > +
> > +struct item *merge_sort(struct item *head, int sort_by)
> > +{
> > +     struct item *right;
> > +     struct item *slow;
> > +     struct item *fast;
> > +     struct item *left;
> > +
> > +     if (!head || !head->next)
> > +             return head;
> > +
> > +     slow = head;
> > +     fast = head->next;
> > +
> > +     while (fast && fast->next) {
> > +             slow = slow->next;
> > +             fast = fast->next->next;
> > +     }
> > +
> > +     left = head;
> > +     right = slow->next;
> > +     slow->next = NULL;
> > +
> > +     left = merge_sort(left, sort_by);
> > +     right = merge_sort(right, sort_by);
> > +
> > +     return merge(left, right, sort_by);
> > +}
> > +
> > +void sort_list_m(struct item **head, int sort_by)
> > +{
> > +     if (!(*head) || !((*head)->next))
> > +             return;
> > +
> > +     *head = merge_sort(*head, sort_by);
> > +}
> > +
> > +int insert_after(struct item *list, const uint64_t search_addr,
> > +              const char *name, uint64_t addr, char stype)
> > +{
> > +     struct item *new_item;
> > +     struct item *current;
> > +     int ret = 0;
> > +
> > +     current = (list_index[name[0] - 32]) ? list_index[name[0] - 32] : list;
> > +     while (current) {
> > +             if (current->addr == search_addr) {
> > +                     new_item = malloc(sizeof(struct item));
> > +                     if (!new_item)
> > +                             return ret;
> > +                     strncpy(new_item->symb_name, name, MAX_NAME_SIZE);
> > +                     new_item->symb_name[MAX_NAME_SIZE - 1] = '\0';
> > +                     new_item->addr = addr;
> > +                     new_item->stype = stype;
> > +                     new_item->next = current->next;
> > +                     current->next = new_item;
> > +                     ret = 1;
> > +                     break;
> > +             }
> > +             current = current->next;
> > +     }
> > +     return ret;
> > +}
> > +
> > +void free_items(struct item **head)
> > +{
> > +     struct item *app, *item_iterator = *head;
> > +
> > +     while (item_iterator) {
> > +             app = item_iterator;
> > +             item_iterator = item_iterator->next;
> > +             free(app);
> > +     }
> > +     *head = NULL;
> > +}
> > diff --git a/scripts/kas_alias/item_list.h b/scripts/kas_alias/item_list.h
> > new file mode 100644
> > index 000000000000..b4891cb088ee
> > --- /dev/null
> > +++ b/scripts/kas_alias/item_list.h
> > @@ -0,0 +1,26 @@
> > +/* SPDX-License-Identifier: GPL-2.0-or-later */
> > +#ifndef ITEM_LIST_H
> > +#define ITEM_LIST_H
> > +#include <stdint.h>
> > +
> > +#define MAX_NAME_SIZE 256
> > +#define BY_ADDRESS 1
> > +#define BY_NAME 2
> > +
> > +struct item {
> > +     char            symb_name[MAX_NAME_SIZE];
> > +     uint64_t        addr;
> > +     char            stype;
> > +     struct item     *next;
> > +};
> > +
> > +void build_index(struct item *list);
> > +struct item *add_item(struct item **list, const char *name, char stype, uint64_t addr);
> > +void sort_list(struct item **list, int sort_by);
> > +struct item *merge(struct item *left, struct item *right, int sort_by);
> > +struct item *merge_sort(struct item *head, int sort_by);
> > +void sort_list_m(struct item **head, int sort_by);
> > +int insert_after(struct item *list, const uint64_t search_addr,
> > +              const char *name, uint64_t addr, char stype);
> > +void free_items(struct item **head);
> > +#endif
> > diff --git a/scripts/kas_alias/kas_alias.c b/scripts/kas_alias/kas_alias.c
> > new file mode 100644
> > index 000000000000..532aeb39f851
> > --- /dev/null
> > +++ b/scripts/kas_alias/kas_alias.c
> > @@ -0,0 +1,217 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <stdint.h>
> > +#include <unistd.h>
> > +#include <string.h>
> > +#include <stdbool.h>
> > +#include <stdarg.h>
> > +#include <regex.h>
> > +
> > +#include "item_list.h"
> > +#include "duplicates_list.h"
> > +#include "a2l.h"
> > +
> > +#define SYMB_IS_TEXT(s) ((((s)->stype) == 't') ||  (((s)->stype) == 'T'))
> > +#define SYMB_IS_DATA(s) ((((s)->stype) == 'b') ||  (((s)->stype) == 'B') || \
> > +                      (((s)->stype) == 'd') ||  (((s)->stype) == 'D') || \
> > +                      (((s)->stype) == 'r') ||  (((s)->stype) == 'R'))
> > +#ifdef CONFIG_KALLSYMS_ALIAS_DATA
> > +#define SYMB_NEEDS_ALIAS(s) (SYMB_IS_TEXT(s) || SYMB_IS_DATA(s))
> > +#else
> > +#define SYMB_NEEDS_ALIAS(s) SYMB_IS_TEXT(s)
> > +#endif
> > +#define FNOMATCH 0
> > +#define FMATCH 1
> > +#define EREGEX 2
> > +
> > +const char *ignore_regex[] = {
> > +     "^__cfi_.*$",                           // __cfi_ preamble
> > +#ifndef CONFIG_KALLSYMS_ALIAS_DATA_ALL
> > +     "^_*TRACE_SYSTEM.*$",
> > +     "^__already_done\\.[0-9]+$",            // Call a function once data
> > +     "^___tp_str\\.[0-9]+$",
> > +     "^___done\\.[0-9]+$",
> > +     "^__print_once\\.[0-9]+$",
> > +     "^_rs\\.[0-9]+$",
> > +     "^__compound_literal\\.[0-9]+$",
> > +     "^___once_key\\.[0-9]+$",
> > +     "^__func__\\.[0-9]+$",
> > +     "^__msg\\.[0-9]+$",
> > +     "^CSWTCH\\.[0-9]+$",
> > +     "^__flags\\.[0-9]+$",
> > +     "^__wkey.*$",
> > +     "^__mkey.*$",
> > +     "^__key.*$",
> > +#endif
> > +     "^__pfx_.*$"                            // NOP-padding
> > +};
> > +
> > +int suffix_serial;
> > +
> > +static inline void verbose_msg(bool verbose, const char *fmt, ...)
> > +{
> > +     va_list args;
> > +
> > +     va_start(args, fmt);
> > +     if (verbose)
> > +             printf(fmt, args);
> > +
> > +     va_end(args);
> > +}
> > +
> > +static void create_suffix(const char *name, char *output_suffix)
> > +{
> > +     sprintf(output_suffix, "%s__alias__%d", name, suffix_serial++);
> > +}
> > +
> > +static void create_file_suffix(const char *name, uint64_t address, char *output_suffix, char *cwd)
> > +{
> > +     const char *f_path;
> > +     char *buf;
> > +     int i = 0;
> > +
> > +     buf = addr2line_get_lines(address);
> > +     f_path = remove_subdir(cwd, buf);
> > +     if (f_path) {
> > +             sprintf(output_suffix, "%s@%s", name, f_path);
> > +             while (*(output_suffix + i) != '\0') {
> > +                     switch (*(output_suffix + i)) {
> > +                     case '/':
> > +                     case ':':
> > +                     case '.':
>
> This is not enough because file path will accept more than that.
> E.g. '+', '&', '-'.
> So I think we should check it by '!isalnum()'. (allow-list method)
>
> > +                             *(output_suffix + i) = '_';
> > +                             break;
> > +                     default:
> > +                     }
> > +             i++;
> > +             }
> > +     } else {
> > +             create_suffix(name, output_suffix);
> > +     }
> > +}
> > +
> > +static int filter_symbols(char *symbol, const char **ignore_list, int regex_no)
> > +{
> > +     regex_t regex;
> > +     int res, i;
> > +
> > +     for (i = 0; i < regex_no; i++) {
> > +             res = regcomp(&regex, ignore_list[i], REG_EXTENDED);
> > +             if (res)
> > +                     return -EREGEX;
> > +
> > +             res = regexec(&regex, symbol, 0, NULL, 0);
> > +             regfree(&regex);
> > +             switch (res) {
> > +             case 0:
> > +                     return FMATCH;
> > +             case REG_NOMATCH:
> > +                     break;
> > +             default:
> > +                     return -EREGEX;
> > +             }
> > +     }
> > +
> > +     return FNOMATCH;
> > +}
> > +
> > +int main(int argc, char *argv[])
> > +{
> > +     char t, sym_name[MAX_NAME_SIZE], new_name[MAX_NAME_SIZE + 15];
> > +     struct duplicate_item  *duplicate_iterator;
> > +     struct duplicate_item *duplicate;
> > +     struct item *head = {NULL};
> > +     bool need_2_process = true;
> > +     struct item *last = {NULL};
> > +     struct item  *current;
> > +     int verbose_mode = 0;
> > +     uint64_t address;
> > +     FILE *fp;
> > +     int res;
> > +
> > +     if (argc < 2 || argc > 3) {
> > +             printf("Usage: %s <nmfile> [-verbose]\n", argv[0]);
> > +             return 1;
> > +     }
> > +
> > +     if (argc == 3 && strcmp(argv[2], "-verbose") == 0)
> > +             verbose_mode = 1;
> > +
> > +     verbose_msg(verbose_mode, "Scanning nm data(%s)\n", argv[1]);
> > +
> > +     fp = fopen(argv[1], "r");
> > +     if (!fp) {
> > +             printf("Can't open input file.\n");
> > +             return 1;
> > +     }
> > +
> > +     if (!addr2line_init(get_addr2line(A2L_DEFAULT), get_vmlinux(argv[1])))
> > +             return 1;
> > +
> > +     while (fscanf(fp, "%lx %c %99s\n", &address, &t, sym_name) == 3) {
> > +             if (strstr(sym_name, "@_")) {
> > +                     if (verbose_mode && need_2_process)
> > +                             printf("Already processed\n");
> > +                     need_2_process = false;
> > +                     }
> > +             last = add_item(&last, sym_name, t, address);
> > +             if (!last) {
> > +                     printf("Error in allocate memory\n");
> > +                     free_items(&head);
> > +                     return 1;
> > +             }
> > +
> > +             if (!head)
> > +                     head = last;
> > +     }
> > +
> > +     fclose(fp);
> > +
> > +     if (need_2_process) {
> > +             verbose_msg(verbose_mode, "Sorting nm data\n");
> > +             sort_list_m(&head, BY_NAME);
> > +             verbose_msg(verbose_mode, "Scanning nm data for duplicates\n");
> > +             duplicate = find_duplicates(head);
> > +             if (!duplicate) {
> > +                     printf("Error in duplicates list\n");
> > +                     return 1;
> > +             }
> > +
> > +             verbose_msg(verbose_mode, "Applying suffixes\n");
> > +             build_index(head);
> > +             duplicate_iterator = duplicate;
> > +             while (duplicate_iterator) {
> > +                     res = filter_symbols(duplicate_iterator->original_item->symb_name,
> > +                                          ignore_regex, sizeof(ignore_regex) /
> > +                                          sizeof(ignore_regex[0]));
> > +                     if (res != FMATCH &&
> > +                         SYMB_NEEDS_ALIAS(duplicate_iterator->original_item)) {
> > +                             if (res < 0)
> > +                                     return 1;
> > +
> > +                             create_file_suffix(duplicate_iterator->original_item->symb_name,
> > +                                                duplicate_iterator->original_item->addr,
> > +                                                new_name, vmlinux_path);
> > +                             if (!insert_after(head, duplicate_iterator->original_item->addr,
> > +                                               new_name, duplicate_iterator->original_item->addr,
> > +                                               duplicate_iterator->original_item->stype))
> > +                                     return 1;
> > +                     }
> > +
> > +                     duplicate_iterator = duplicate_iterator->next;
> > +             }
> > +
> > +             sort_list_m(&head, BY_ADDRESS);
> > +     }
> > +     current = head;
> > +     while (current) {
> > +             printf("%08lx %c %s\n", current->addr, current->stype, current->symb_name);
> > +             current = current->next;
> > +     }
> > +
> > +     free_items(&head);
> > +     free_duplicates(&duplicate);
> > +     addr2line_cleanup();
> > +     return 0;
> > +}
> > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> > index a432b171be82..cacf60b597ce 100755
> > --- a/scripts/link-vmlinux.sh
> > +++ b/scripts/link-vmlinux.sh
> > @@ -89,8 +89,9 @@ vmlinux_link()
> >
> >       ldflags="${ldflags} ${wl}--script=${objtree}/${KBUILD_LDS}"
> >
> > -     # The kallsyms linking does not need debug symbols included.
> > -     if [ "$output" != "${output#.tmp_vmlinux.kallsyms}" ] ; then
> > +     # The kallsyms linking does not need debug symbols included, unless the KALLSYMS_ALIAS.
> > +     if [ ! is_enabled CONFIG_KALLSYMS_ALIAS ] && \
> > +         [ "$output" != "${output#.tmp_vmlinux.kallsyms}" ] ; then
> >               ldflags="${ldflags} ${wl}--strip-debug"
> >       fi
> >
> > @@ -161,7 +162,11 @@ kallsyms()
> >       fi
> >
> >       info KSYMS ${2}
> > -     scripts/kallsyms ${kallsymopt} ${1} > ${2}
> > +     if is_enabled CONFIG_KALLSYMS_ALIAS; then
> > +             ALIAS=".alias"
> > +             scripts/kas_alias/kas_alias ${1} >${1}${ALIAS}
> > +             fi
>
> Nit: wrong indentation.
> And ALIAS is not defined if CONFIG_KALLSYMS_ALIAS=n.

I rely on the fact that ALIAS is undefined if CONFIG_KALLSYMS_ALIAS=n.
The undefined variable expands to an empty string, and I was constructing
my name using this approach. This method functions in this manner with
all the shells I'm familiar with.
However, I acknowledge that it's more of a hack than a clean programming
practice. In version 4, I will provide a cleaner shell programming
solution.

>
>
> > +     scripts/kallsyms ${kallsymopt} ${1}${ALIAS} > ${2}
> >  }
> >
> >  # Perform one step in kallsyms generation, including temporary linking of
> > --
> > 2.34.1
> >
>
>
> Thank you,
>
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox