* [PATCH 0/3] Fix kallsyms with CONFIG_LTO_CLANG
@ 2024-07-30 0:54 Song Liu
2024-07-30 0:54 ` [PATCH 1/3] kallsyms: Do not cleanup .llvm.<hash> suffix before sorting symbols Song Liu
` (3 more replies)
0 siblings, 4 replies; 17+ messages in thread
From: Song Liu @ 2024-07-30 0:54 UTC (permalink / raw)
To: live-patching, linux-kernel, linux-trace-kernel
Cc: jpoimboe, jikos, mbenes, pmladek, joe.lawrence, nathan, morbo,
justinstitt, mcgrof, thunder.leizhen, kees, kernel-team, song,
mmaurer, samitolvanen, mhiramat, rostedt
With CONFIG_LTO_CLANG, the compiler/linker adds .llvm.<hash> suffix to
local symbols to avoid duplications. Existing scripts/kallsyms sorts
symbols without .llvm.<hash> suffix. However, this causes quite some
issues later on. Some users of kallsyms, such as livepatch, have to match
symbols exactly; while other users, such as kprobe, would match symbols
without the suffix.
Address this by sorting full symbols (with .llvm.<hash>) at build time, and
split kallsyms APIs to explicitly match full symbols or without suffix.
Specifically, exiting APIs will match symbols exactly. Two new APIs are
added to match symbols with suffix. Use the new APIs in tracing/kprobes.
Song Liu (3):
kallsyms: Do not cleanup .llvm.<hash> suffix before sorting symbols
kallsyms: Add APIs to match symbol without .llmv.<hash> suffix.
tracing/kprobes: Use APIs that matches symbols with .llvm.<hash>
suffix
include/linux/kallsyms.h | 14 +++++++
kernel/kallsyms.c | 83 ++++++++++++++++++++++++++-----------
kernel/kallsyms_selftest.c | 22 +---------
kernel/trace/trace_kprobe.c | 10 ++++-
scripts/kallsyms.c | 30 +-------------
scripts/link-vmlinux.sh | 4 --
6 files changed, 83 insertions(+), 80 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/3] kallsyms: Do not cleanup .llvm.<hash> suffix before sorting symbols
2024-07-30 0:54 [PATCH 0/3] Fix kallsyms with CONFIG_LTO_CLANG Song Liu
@ 2024-07-30 0:54 ` Song Liu
2024-07-30 0:54 ` [PATCH 2/3] kallsyms: Add APIs to match symbol without .llmv.<hash> suffix Song Liu
` (2 subsequent siblings)
3 siblings, 0 replies; 17+ messages in thread
From: Song Liu @ 2024-07-30 0:54 UTC (permalink / raw)
To: live-patching, linux-kernel, linux-trace-kernel
Cc: jpoimboe, jikos, mbenes, pmladek, joe.lawrence, nathan, morbo,
justinstitt, mcgrof, thunder.leizhen, kees, kernel-team, song,
mmaurer, samitolvanen, mhiramat, rostedt
Cleaning up the symbols causes various issues afterwards. Let's sort
the list based on original name and handle suffix later during the
symbol lookup.
Signed-off-by: Song Liu <song@kernel.org>
---
scripts/kallsyms.c | 31 ++-----------------------------
scripts/link-vmlinux.sh | 4 ----
2 files changed, 2 insertions(+), 33 deletions(-)
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 0ed873491bf5..123dab0572f8 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -5,8 +5,7 @@
* This software may be used and distributed according to the terms
* of the GNU General Public License, incorporated herein by reference.
*
- * Usage: kallsyms [--all-symbols] [--absolute-percpu]
- * [--lto-clang] in.map > out.S
+ * Usage: kallsyms [--all-symbols] [--absolute-percpu] in.map > out.S
*
* Table compression uses all the unused char codes on the symbols and
* maps these to the most used substrings (tokens). For instance, it might
@@ -62,7 +61,6 @@ static struct sym_entry **table;
static unsigned int table_size, table_cnt;
static int all_symbols;
static int absolute_percpu;
-static int lto_clang;
static int token_profit[0x10000];
@@ -73,8 +71,7 @@ static unsigned char best_table_len[256];
static void usage(void)
{
- fprintf(stderr, "Usage: kallsyms [--all-symbols] [--absolute-percpu] "
- "[--lto-clang] in.map > out.S\n");
+ fprintf(stderr, "Usage: kallsyms [--all-symbols] [--absolute-percpu] in.map > out.S\n");
exit(1);
}
@@ -344,25 +341,6 @@ static bool symbol_absolute(const struct sym_entry *s)
return s->percpu_absolute;
}
-static void cleanup_symbol_name(char *s)
-{
- char *p;
-
- /*
- * ASCII[.] = 2e
- * ASCII[0-9] = 30,39
- * ASCII[A-Z] = 41,5a
- * ASCII[_] = 5f
- * ASCII[a-z] = 61,7a
- *
- * As above, replacing the first '.' in ".llvm." with '\0' does not
- * affect the main sorting, but it helps us with subsorting.
- */
- p = strstr(s, ".llvm.");
- if (p)
- *p = '\0';
-}
-
static int compare_names(const void *a, const void *b)
{
int ret;
@@ -526,10 +504,6 @@ static void write_src(void)
output_address(relative_base);
printf("\n");
- if (lto_clang)
- for (i = 0; i < table_cnt; i++)
- cleanup_symbol_name((char *)table[i]->sym);
-
sort_symbols_by_name();
output_label("kallsyms_seqs_of_names");
for (i = 0; i < table_cnt; i++)
@@ -807,7 +781,6 @@ int main(int argc, char **argv)
static const struct option long_options[] = {
{"all-symbols", no_argument, &all_symbols, 1},
{"absolute-percpu", no_argument, &absolute_percpu, 1},
- {"lto-clang", no_argument, <o_clang, 1},
{},
};
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index f7b2503cdba9..22d0bc843986 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -156,10 +156,6 @@ kallsyms()
kallsymopt="${kallsymopt} --absolute-percpu"
fi
- if is_enabled CONFIG_LTO_CLANG; then
- kallsymopt="${kallsymopt} --lto-clang"
- fi
-
info KSYMS "${2}.S"
scripts/kallsyms ${kallsymopt} "${1}" > "${2}.S"
--
2.43.0
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 2/3] kallsyms: Add APIs to match symbol without .llmv.<hash> suffix.
2024-07-30 0:54 [PATCH 0/3] Fix kallsyms with CONFIG_LTO_CLANG Song Liu
2024-07-30 0:54 ` [PATCH 1/3] kallsyms: Do not cleanup .llvm.<hash> suffix before sorting symbols Song Liu
@ 2024-07-30 0:54 ` Song Liu
2024-07-30 13:03 ` Masami Hiramatsu
2024-08-02 17:16 ` Song Liu
2024-07-30 0:54 ` [PATCH 3/3] tracing/kprobes: Use APIs that matches symbols with .llvm.<hash> suffix Song Liu
2024-07-30 5:36 ` [PATCH 0/3] Fix kallsyms with CONFIG_LTO_CLANG Song Liu
3 siblings, 2 replies; 17+ messages in thread
From: Song Liu @ 2024-07-30 0:54 UTC (permalink / raw)
To: live-patching, linux-kernel, linux-trace-kernel
Cc: jpoimboe, jikos, mbenes, pmladek, joe.lawrence, nathan, morbo,
justinstitt, mcgrof, thunder.leizhen, kees, kernel-team, song,
mmaurer, samitolvanen, mhiramat, rostedt
With CONFIG_LTO_CLANG=y, the compiler may add suffix to function names
to avoid duplication. This causes confusion with users of kallsyms.
On one hand, users like livepatch are required to match the symbols
exactly. On the other hand, users like kprobe would like to match to
original function names.
Solve this by splitting kallsyms APIs. Specifically, existing APIs now
should match the symbols exactly. Add two APIs that matches the full
symbol, or only the part without .llvm.suffix. Specifically, the following
two APIs are added:
1. kallsyms_lookup_name_or_prefix()
2. kallsyms_on_each_match_symbol_or_prefix()
These APIs will be used by kprobe.
Also cleanup some code and adjust kallsyms_selftests accordingly.
Signed-off-by: Song Liu <song@kernel.org>
---
include/linux/kallsyms.h | 14 +++++++
kernel/kallsyms.c | 83 ++++++++++++++++++++++++++------------
kernel/kallsyms_selftest.c | 22 +---------
3 files changed, 73 insertions(+), 46 deletions(-)
diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
index c3f075e8f60c..09b2d2099107 100644
--- a/include/linux/kallsyms.h
+++ b/include/linux/kallsyms.h
@@ -74,9 +74,12 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, unsigned long),
void *data);
int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),
const char *name, void *data);
+int kallsyms_on_each_match_symbol_or_prefix(int (*fn)(void *, unsigned long),
+ const char *name, void *data);
/* Lookup the address for a symbol. Returns 0 if not found. */
unsigned long kallsyms_lookup_name(const char *name);
+unsigned long kallsyms_lookup_name_or_prefix(const char *name);
extern int kallsyms_lookup_size_offset(unsigned long addr,
unsigned long *symbolsize,
@@ -104,6 +107,11 @@ static inline unsigned long kallsyms_lookup_name(const char *name)
return 0;
}
+static inline unsigned long kallsyms_lookup_name_or_prefix(const char *name)
+{
+ return 0;
+}
+
static inline int kallsyms_lookup_size_offset(unsigned long addr,
unsigned long *symbolsize,
unsigned long *offset)
@@ -165,6 +173,12 @@ static inline int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long)
{
return -EOPNOTSUPP;
}
+
+static inline int kallsyms_on_each_match_symbol_or_prefix(int (*fn)(void *, unsigned long),
+ const char *name, void *data)
+{
+ return -EOPNOTSUPP;
+}
#endif /*CONFIG_KALLSYMS*/
static inline void print_ip_sym(const char *loglvl, unsigned long ip)
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index fb2c77368d18..4285dd85d814 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -164,9 +164,6 @@ static void cleanup_symbol_name(char *s)
{
char *res;
- if (!IS_ENABLED(CONFIG_LTO_CLANG))
- return;
-
/*
* LLVM appends various suffixes for local functions and variables that
* must be promoted to global scope as part of LTO. This can break
@@ -181,13 +178,13 @@ static void cleanup_symbol_name(char *s)
return;
}
-static int compare_symbol_name(const char *name, char *namebuf)
+static int compare_symbol_name(const char *name, char *namebuf, bool exact_match)
{
- /* The kallsyms_seqs_of_names is sorted based on names after
- * cleanup_symbol_name() (see scripts/kallsyms.c) if clang lto is enabled.
- * To ensure correct bisection in kallsyms_lookup_names(), do
- * cleanup_symbol_name(namebuf) before comparing name and namebuf.
- */
+ int ret = strcmp(name, namebuf);
+
+ if (exact_match || !ret)
+ return ret;
+
cleanup_symbol_name(namebuf);
return strcmp(name, namebuf);
}
@@ -204,13 +201,17 @@ static unsigned int get_symbol_seq(int index)
static int kallsyms_lookup_names(const char *name,
unsigned int *start,
- unsigned int *end)
+ unsigned int *end,
+ bool exact_match)
{
int ret;
int low, mid, high;
unsigned int seq, off;
char namebuf[KSYM_NAME_LEN];
+ if (!IS_ENABLED(CONFIG_LTO_CLANG))
+ exact_match = true;
+
low = 0;
high = kallsyms_num_syms - 1;
@@ -219,7 +220,7 @@ static int kallsyms_lookup_names(const char *name,
seq = get_symbol_seq(mid);
off = get_symbol_offset(seq);
kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
- ret = compare_symbol_name(name, namebuf);
+ ret = compare_symbol_name(name, namebuf, exact_match);
if (ret > 0)
low = mid + 1;
else if (ret < 0)
@@ -236,7 +237,7 @@ static int kallsyms_lookup_names(const char *name,
seq = get_symbol_seq(low - 1);
off = get_symbol_offset(seq);
kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
- if (compare_symbol_name(name, namebuf))
+ if (compare_symbol_name(name, namebuf, exact_match))
break;
low--;
}
@@ -248,7 +249,7 @@ static int kallsyms_lookup_names(const char *name,
seq = get_symbol_seq(high + 1);
off = get_symbol_offset(seq);
kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
- if (compare_symbol_name(name, namebuf))
+ if (compare_symbol_name(name, namebuf, exact_match))
break;
high++;
}
@@ -268,13 +269,35 @@ unsigned long kallsyms_lookup_name(const char *name)
if (!*name)
return 0;
- ret = kallsyms_lookup_names(name, &i, NULL);
+ ret = kallsyms_lookup_names(name, &i, NULL, true);
if (!ret)
return kallsyms_sym_address(get_symbol_seq(i));
return module_kallsyms_lookup_name(name);
}
+/*
+ * Lookup the address for this symbol.
+ *
+ * With CONFIG_LTO_CLANG=y, if there is no exact match, also try lookup
+ * symbol.llvm.<hash>.
+ */
+unsigned long kallsyms_lookup_name_or_prefix(const char *name)
+{
+ unsigned long addr;
+
+ addr = kallsyms_lookup_name(name);
+
+ if (!addr && IS_ENABLED(CONFIG_LTO_CLANG)) {
+ int ret, i;
+
+ ret = kallsyms_lookup_names(name, &i, NULL, false);
+ if (!ret)
+ addr = kallsyms_sym_address(get_symbol_seq(i));
+ }
+ return addr;
+}
+
/*
* Iterate over all symbols in vmlinux. For symbols from modules use
* module_kallsyms_on_each_symbol instead.
@@ -303,7 +326,25 @@ int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),
int ret;
unsigned int i, start, end;
- ret = kallsyms_lookup_names(name, &start, &end);
+ ret = kallsyms_lookup_names(name, &start, &end, true);
+ if (ret)
+ return 0;
+
+ for (i = start; !ret && i <= end; i++) {
+ ret = fn(data, kallsyms_sym_address(get_symbol_seq(i)));
+ cond_resched();
+ }
+
+ return ret;
+}
+
+int kallsyms_on_each_match_symbol_or_prefix(int (*fn)(void *, unsigned long),
+ const char *name, void *data)
+{
+ int ret;
+ unsigned int i, start, end;
+
+ ret = kallsyms_lookup_names(name, &start, &end, false);
if (ret)
return 0;
@@ -450,8 +491,6 @@ const char *kallsyms_lookup(unsigned long addr,
int lookup_symbol_name(unsigned long addr, char *symname)
{
- int res;
-
symname[0] = '\0';
symname[KSYM_NAME_LEN - 1] = '\0';
@@ -462,16 +501,10 @@ int lookup_symbol_name(unsigned long addr, char *symname)
/* Grab name */
kallsyms_expand_symbol(get_symbol_offset(pos),
symname, KSYM_NAME_LEN);
- goto found;
+ return 0;
}
/* See if it's in a module. */
- res = lookup_module_symbol_name(addr, symname);
- if (res)
- return res;
-
-found:
- cleanup_symbol_name(symname);
- return 0;
+ return lookup_module_symbol_name(addr, symname);
}
/* Look up a kernel symbol and return it in a text buffer. */
diff --git a/kernel/kallsyms_selftest.c b/kernel/kallsyms_selftest.c
index 2f84896a7bcb..873f7c445488 100644
--- a/kernel/kallsyms_selftest.c
+++ b/kernel/kallsyms_selftest.c
@@ -187,31 +187,11 @@ static void test_perf_kallsyms_lookup_name(void)
stat.min, stat.max, div_u64(stat.sum, stat.real_cnt));
}
-static bool match_cleanup_name(const char *s, const char *name)
-{
- char *p;
- int len;
-
- if (!IS_ENABLED(CONFIG_LTO_CLANG))
- return false;
-
- p = strstr(s, ".llvm.");
- if (!p)
- return false;
-
- len = strlen(name);
- if (p - s != len)
- return false;
-
- return !strncmp(s, name, len);
-}
-
static int find_symbol(void *data, const char *name, unsigned long addr)
{
struct test_stat *stat = (struct test_stat *)data;
- if (strcmp(name, stat->name) == 0 ||
- (!stat->perf && match_cleanup_name(name, stat->name))) {
+ if (!strcmp(name, stat->name)) {
stat->real_cnt++;
stat->addr = addr;
--
2.43.0
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 3/3] tracing/kprobes: Use APIs that matches symbols with .llvm.<hash> suffix
2024-07-30 0:54 [PATCH 0/3] Fix kallsyms with CONFIG_LTO_CLANG Song Liu
2024-07-30 0:54 ` [PATCH 1/3] kallsyms: Do not cleanup .llvm.<hash> suffix before sorting symbols Song Liu
2024-07-30 0:54 ` [PATCH 2/3] kallsyms: Add APIs to match symbol without .llmv.<hash> suffix Song Liu
@ 2024-07-30 0:54 ` Song Liu
2024-07-30 13:04 ` Masami Hiramatsu
2024-07-30 5:36 ` [PATCH 0/3] Fix kallsyms with CONFIG_LTO_CLANG Song Liu
3 siblings, 1 reply; 17+ messages in thread
From: Song Liu @ 2024-07-30 0:54 UTC (permalink / raw)
To: live-patching, linux-kernel, linux-trace-kernel
Cc: jpoimboe, jikos, mbenes, pmladek, joe.lawrence, nathan, morbo,
justinstitt, mcgrof, thunder.leizhen, kees, kernel-team, song,
mmaurer, samitolvanen, mhiramat, rostedt
Use the new kallsyms APIs that matches symbols name with .llvm.<hash>
suffix. This allows userspace tools to get kprobes on the expected
function name, while the actual symbol has a .llvm.<hash> suffix.
This only effects kernel compared with CONFIG_LTO_CLANG.
Signed-off-by: Song Liu <song@kernel.org>
---
kernel/trace/trace_kprobe.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 61a6da808203..c319382c1a09 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -202,7 +202,8 @@ unsigned long trace_kprobe_address(struct trace_kprobe *tk)
if (tk->symbol) {
addr = (unsigned long)
- kallsyms_lookup_name(trace_kprobe_symbol(tk));
+ kallsyms_lookup_name_or_prefix(trace_kprobe_symbol(tk));
+
if (addr)
addr += tk->rp.kp.offset;
} else {
@@ -766,8 +767,13 @@ static unsigned int number_of_same_symbols(const char *mod, const char *func_nam
{
struct sym_count_ctx ctx = { .count = 0, .name = func_name };
- if (!mod)
+ if (!mod) {
kallsyms_on_each_match_symbol(count_symbols, func_name, &ctx.count);
+ if (IS_ENABLED(CONFIG_LTO_CLANG) && !ctx.count) {
+ kallsyms_on_each_match_symbol_or_prefix(
+ count_symbols, func_name, &ctx.count);
+ }
+ }
module_kallsyms_on_each_symbol(mod, count_mod_symbols, &ctx);
--
2.43.0
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 0/3] Fix kallsyms with CONFIG_LTO_CLANG
2024-07-30 0:54 [PATCH 0/3] Fix kallsyms with CONFIG_LTO_CLANG Song Liu
` (2 preceding siblings ...)
2024-07-30 0:54 ` [PATCH 3/3] tracing/kprobes: Use APIs that matches symbols with .llvm.<hash> suffix Song Liu
@ 2024-07-30 5:36 ` Song Liu
3 siblings, 0 replies; 17+ messages in thread
From: Song Liu @ 2024-07-30 5:36 UTC (permalink / raw)
To: live-patching, linux-kernel, linux-trace-kernel
Cc: jpoimboe, jikos, mbenes, pmladek, joe.lawrence, nathan, morbo,
justinstitt, mcgrof, thunder.leizhen, kees, kernel-team, mmaurer,
samitolvanen, mhiramat, rostedt
On Mon, Jul 29, 2024 at 5:54 PM Song Liu <song@kernel.org> wrote:
>
> With CONFIG_LTO_CLANG, the compiler/linker adds .llvm.<hash> suffix to
> local symbols to avoid duplications. Existing scripts/kallsyms sorts
> symbols without .llvm.<hash> suffix. However, this causes quite some
> issues later on. Some users of kallsyms, such as livepatch, have to match
> symbols exactly; while other users, such as kprobe, would match symbols
> without the suffix.
>
> Address this by sorting full symbols (with .llvm.<hash>) at build time, and
> split kallsyms APIs to explicitly match full symbols or without suffix.
> Specifically, exiting APIs will match symbols exactly. Two new APIs are
> added to match symbols with suffix. Use the new APIs in tracing/kprobes.
Forgot to mention: This is to follow up the discussions in this thread:
https://lore.kernel.org/live-patching/20240605032120.3179157-1-song@kernel.org/T/#u
Thanks,
Song
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] kallsyms: Add APIs to match symbol without .llmv.<hash> suffix.
2024-07-30 0:54 ` [PATCH 2/3] kallsyms: Add APIs to match symbol without .llmv.<hash> suffix Song Liu
@ 2024-07-30 13:03 ` Masami Hiramatsu
2024-07-31 1:00 ` Song Liu
2024-08-02 17:16 ` Song Liu
1 sibling, 1 reply; 17+ messages in thread
From: Masami Hiramatsu @ 2024-07-30 13:03 UTC (permalink / raw)
To: Song Liu
Cc: live-patching, linux-kernel, linux-trace-kernel, jpoimboe, jikos,
mbenes, pmladek, joe.lawrence, nathan, morbo, justinstitt, mcgrof,
thunder.leizhen, kees, kernel-team, mmaurer, samitolvanen,
mhiramat, rostedt
On Mon, 29 Jul 2024 17:54:32 -0700
Song Liu <song@kernel.org> wrote:
> With CONFIG_LTO_CLANG=y, the compiler may add suffix to function names
> to avoid duplication. This causes confusion with users of kallsyms.
> On one hand, users like livepatch are required to match the symbols
> exactly. On the other hand, users like kprobe would like to match to
> original function names.
>
> Solve this by splitting kallsyms APIs. Specifically, existing APIs now
> should match the symbols exactly. Add two APIs that matches the full
> symbol, or only the part without .llvm.suffix. Specifically, the following
> two APIs are added:
>
> 1. kallsyms_lookup_name_or_prefix()
> 2. kallsyms_on_each_match_symbol_or_prefix()
Since this API only removes the suffix, "match prefix" is a bit confusing.
(this sounds like matching "foo" with "foo" and "foo_bar", but in reality,
it only matches "foo" and "foo.llvm.*")
What about the name below?
kallsyms_lookup_name_without_suffix()
kallsyms_on_each_match_symbol_without_suffix()
>
> These APIs will be used by kprobe.
No other user need this?
Thank you,
>
> Also cleanup some code and adjust kallsyms_selftests accordingly.
>
> Signed-off-by: Song Liu <song@kernel.org>
> ---
> include/linux/kallsyms.h | 14 +++++++
> kernel/kallsyms.c | 83 ++++++++++++++++++++++++++------------
> kernel/kallsyms_selftest.c | 22 +---------
> 3 files changed, 73 insertions(+), 46 deletions(-)
>
> diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
> index c3f075e8f60c..09b2d2099107 100644
> --- a/include/linux/kallsyms.h
> +++ b/include/linux/kallsyms.h
> @@ -74,9 +74,12 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, unsigned long),
> void *data);
> int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),
> const char *name, void *data);
> +int kallsyms_on_each_match_symbol_or_prefix(int (*fn)(void *, unsigned long),
> + const char *name, void *data);
>
> /* Lookup the address for a symbol. Returns 0 if not found. */
> unsigned long kallsyms_lookup_name(const char *name);
> +unsigned long kallsyms_lookup_name_or_prefix(const char *name);
>
> extern int kallsyms_lookup_size_offset(unsigned long addr,
> unsigned long *symbolsize,
> @@ -104,6 +107,11 @@ static inline unsigned long kallsyms_lookup_name(const char *name)
> return 0;
> }
>
> +static inline unsigned long kallsyms_lookup_name_or_prefix(const char *name)
> +{
> + return 0;
> +}
> +
> static inline int kallsyms_lookup_size_offset(unsigned long addr,
> unsigned long *symbolsize,
> unsigned long *offset)
> @@ -165,6 +173,12 @@ static inline int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long)
> {
> return -EOPNOTSUPP;
> }
> +
> +static inline int kallsyms_on_each_match_symbol_or_prefix(int (*fn)(void *, unsigned long),
> + const char *name, void *data)
> +{
> + return -EOPNOTSUPP;
> +}
> #endif /*CONFIG_KALLSYMS*/
>
> static inline void print_ip_sym(const char *loglvl, unsigned long ip)
> diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> index fb2c77368d18..4285dd85d814 100644
> --- a/kernel/kallsyms.c
> +++ b/kernel/kallsyms.c
> @@ -164,9 +164,6 @@ static void cleanup_symbol_name(char *s)
> {
> char *res;
>
> - if (!IS_ENABLED(CONFIG_LTO_CLANG))
> - return;
> -
> /*
> * LLVM appends various suffixes for local functions and variables that
> * must be promoted to global scope as part of LTO. This can break
> @@ -181,13 +178,13 @@ static void cleanup_symbol_name(char *s)
> return;
> }
>
> -static int compare_symbol_name(const char *name, char *namebuf)
> +static int compare_symbol_name(const char *name, char *namebuf, bool exact_match)
> {
> - /* The kallsyms_seqs_of_names is sorted based on names after
> - * cleanup_symbol_name() (see scripts/kallsyms.c) if clang lto is enabled.
> - * To ensure correct bisection in kallsyms_lookup_names(), do
> - * cleanup_symbol_name(namebuf) before comparing name and namebuf.
> - */
> + int ret = strcmp(name, namebuf);
> +
> + if (exact_match || !ret)
> + return ret;
> +
> cleanup_symbol_name(namebuf);
> return strcmp(name, namebuf);
> }
> @@ -204,13 +201,17 @@ static unsigned int get_symbol_seq(int index)
>
> static int kallsyms_lookup_names(const char *name,
> unsigned int *start,
> - unsigned int *end)
> + unsigned int *end,
> + bool exact_match)
> {
> int ret;
> int low, mid, high;
> unsigned int seq, off;
> char namebuf[KSYM_NAME_LEN];
>
> + if (!IS_ENABLED(CONFIG_LTO_CLANG))
> + exact_match = true;
> +
> low = 0;
> high = kallsyms_num_syms - 1;
>
> @@ -219,7 +220,7 @@ static int kallsyms_lookup_names(const char *name,
> seq = get_symbol_seq(mid);
> off = get_symbol_offset(seq);
> kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
> - ret = compare_symbol_name(name, namebuf);
> + ret = compare_symbol_name(name, namebuf, exact_match);
> if (ret > 0)
> low = mid + 1;
> else if (ret < 0)
> @@ -236,7 +237,7 @@ static int kallsyms_lookup_names(const char *name,
> seq = get_symbol_seq(low - 1);
> off = get_symbol_offset(seq);
> kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
> - if (compare_symbol_name(name, namebuf))
> + if (compare_symbol_name(name, namebuf, exact_match))
> break;
> low--;
> }
> @@ -248,7 +249,7 @@ static int kallsyms_lookup_names(const char *name,
> seq = get_symbol_seq(high + 1);
> off = get_symbol_offset(seq);
> kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
> - if (compare_symbol_name(name, namebuf))
> + if (compare_symbol_name(name, namebuf, exact_match))
> break;
> high++;
> }
> @@ -268,13 +269,35 @@ unsigned long kallsyms_lookup_name(const char *name)
> if (!*name)
> return 0;
>
> - ret = kallsyms_lookup_names(name, &i, NULL);
> + ret = kallsyms_lookup_names(name, &i, NULL, true);
> if (!ret)
> return kallsyms_sym_address(get_symbol_seq(i));
>
> return module_kallsyms_lookup_name(name);
> }
>
> +/*
> + * Lookup the address for this symbol.
> + *
> + * With CONFIG_LTO_CLANG=y, if there is no exact match, also try lookup
> + * symbol.llvm.<hash>.
> + */
> +unsigned long kallsyms_lookup_name_or_prefix(const char *name)
> +{
> + unsigned long addr;
> +
> + addr = kallsyms_lookup_name(name);
> +
> + if (!addr && IS_ENABLED(CONFIG_LTO_CLANG)) {
> + int ret, i;
> +
> + ret = kallsyms_lookup_names(name, &i, NULL, false);
> + if (!ret)
> + addr = kallsyms_sym_address(get_symbol_seq(i));
> + }
> + return addr;
> +}
> +
> /*
> * Iterate over all symbols in vmlinux. For symbols from modules use
> * module_kallsyms_on_each_symbol instead.
> @@ -303,7 +326,25 @@ int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),
> int ret;
> unsigned int i, start, end;
>
> - ret = kallsyms_lookup_names(name, &start, &end);
> + ret = kallsyms_lookup_names(name, &start, &end, true);
> + if (ret)
> + return 0;
> +
> + for (i = start; !ret && i <= end; i++) {
> + ret = fn(data, kallsyms_sym_address(get_symbol_seq(i)));
> + cond_resched();
> + }
> +
> + return ret;
> +}
> +
> +int kallsyms_on_each_match_symbol_or_prefix(int (*fn)(void *, unsigned long),
> + const char *name, void *data)
> +{
> + int ret;
> + unsigned int i, start, end;
> +
> + ret = kallsyms_lookup_names(name, &start, &end, false);
> if (ret)
> return 0;
>
> @@ -450,8 +491,6 @@ const char *kallsyms_lookup(unsigned long addr,
>
> int lookup_symbol_name(unsigned long addr, char *symname)
> {
> - int res;
> -
> symname[0] = '\0';
> symname[KSYM_NAME_LEN - 1] = '\0';
>
> @@ -462,16 +501,10 @@ int lookup_symbol_name(unsigned long addr, char *symname)
> /* Grab name */
> kallsyms_expand_symbol(get_symbol_offset(pos),
> symname, KSYM_NAME_LEN);
> - goto found;
> + return 0;
> }
> /* See if it's in a module. */
> - res = lookup_module_symbol_name(addr, symname);
> - if (res)
> - return res;
> -
> -found:
> - cleanup_symbol_name(symname);
> - return 0;
> + return lookup_module_symbol_name(addr, symname);
> }
>
> /* Look up a kernel symbol and return it in a text buffer. */
> diff --git a/kernel/kallsyms_selftest.c b/kernel/kallsyms_selftest.c
> index 2f84896a7bcb..873f7c445488 100644
> --- a/kernel/kallsyms_selftest.c
> +++ b/kernel/kallsyms_selftest.c
> @@ -187,31 +187,11 @@ static void test_perf_kallsyms_lookup_name(void)
> stat.min, stat.max, div_u64(stat.sum, stat.real_cnt));
> }
>
> -static bool match_cleanup_name(const char *s, const char *name)
> -{
> - char *p;
> - int len;
> -
> - if (!IS_ENABLED(CONFIG_LTO_CLANG))
> - return false;
> -
> - p = strstr(s, ".llvm.");
> - if (!p)
> - return false;
> -
> - len = strlen(name);
> - if (p - s != len)
> - return false;
> -
> - return !strncmp(s, name, len);
> -}
> -
> static int find_symbol(void *data, const char *name, unsigned long addr)
> {
> struct test_stat *stat = (struct test_stat *)data;
>
> - if (strcmp(name, stat->name) == 0 ||
> - (!stat->perf && match_cleanup_name(name, stat->name))) {
> + if (!strcmp(name, stat->name)) {
> stat->real_cnt++;
> stat->addr = addr;
>
> --
> 2.43.0
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 3/3] tracing/kprobes: Use APIs that matches symbols with .llvm.<hash> suffix
2024-07-30 0:54 ` [PATCH 3/3] tracing/kprobes: Use APIs that matches symbols with .llvm.<hash> suffix Song Liu
@ 2024-07-30 13:04 ` Masami Hiramatsu
2024-07-31 1:09 ` Song Liu
0 siblings, 1 reply; 17+ messages in thread
From: Masami Hiramatsu @ 2024-07-30 13:04 UTC (permalink / raw)
To: Song Liu
Cc: live-patching, linux-kernel, linux-trace-kernel, jpoimboe, jikos,
mbenes, pmladek, joe.lawrence, nathan, morbo, justinstitt, mcgrof,
thunder.leizhen, kees, kernel-team, mmaurer, samitolvanen,
mhiramat, rostedt
On Mon, 29 Jul 2024 17:54:33 -0700
Song Liu <song@kernel.org> wrote:
> Use the new kallsyms APIs that matches symbols name with .llvm.<hash>
> suffix. This allows userspace tools to get kprobes on the expected
> function name, while the actual symbol has a .llvm.<hash> suffix.
>
_kprobe_addr@kernel/kprobes.c may also fail with this change.
Thanks,
> This only effects kernel compared with CONFIG_LTO_CLANG.
>
> Signed-off-by: Song Liu <song@kernel.org>
> ---
> kernel/trace/trace_kprobe.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index 61a6da808203..c319382c1a09 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -202,7 +202,8 @@ unsigned long trace_kprobe_address(struct trace_kprobe *tk)
>
> if (tk->symbol) {
> addr = (unsigned long)
> - kallsyms_lookup_name(trace_kprobe_symbol(tk));
> + kallsyms_lookup_name_or_prefix(trace_kprobe_symbol(tk));
> +
> if (addr)
> addr += tk->rp.kp.offset;
> } else {
> @@ -766,8 +767,13 @@ static unsigned int number_of_same_symbols(const char *mod, const char *func_nam
> {
> struct sym_count_ctx ctx = { .count = 0, .name = func_name };
>
> - if (!mod)
> + if (!mod) {
> kallsyms_on_each_match_symbol(count_symbols, func_name, &ctx.count);
> + if (IS_ENABLED(CONFIG_LTO_CLANG) && !ctx.count) {
> + kallsyms_on_each_match_symbol_or_prefix(
> + count_symbols, func_name, &ctx.count);
> + }
> + }
>
> module_kallsyms_on_each_symbol(mod, count_mod_symbols, &ctx);
>
> --
> 2.43.0
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] kallsyms: Add APIs to match symbol without .llmv.<hash> suffix.
2024-07-30 13:03 ` Masami Hiramatsu
@ 2024-07-31 1:00 ` Song Liu
2024-08-02 1:18 ` Leizhen (ThunderTown)
2024-08-02 15:45 ` Petr Mladek
0 siblings, 2 replies; 17+ messages in thread
From: Song Liu @ 2024-07-31 1:00 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Song Liu, live-patching@vger.kernel.org, LKML,
linux-trace-kernel@vger.kernel.org, Josh Poimboeuf, Jiri Kosina,
Miroslav Benes, Petr Mladek, Joe Lawrence, Nathan Chancellor,
morbo@google.com, Justin Stitt, Luis Chamberlain, Leizhen,
kees@kernel.org, Kernel Team, Matthew Maurer, Sami Tolvanen,
Steven Rostedt
Hi Masami,
> On Jul 30, 2024, at 6:03 AM, Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Mon, 29 Jul 2024 17:54:32 -0700
> Song Liu <song@kernel.org> wrote:
>
>> With CONFIG_LTO_CLANG=y, the compiler may add suffix to function names
>> to avoid duplication. This causes confusion with users of kallsyms.
>> On one hand, users like livepatch are required to match the symbols
>> exactly. On the other hand, users like kprobe would like to match to
>> original function names.
>>
>> Solve this by splitting kallsyms APIs. Specifically, existing APIs now
>> should match the symbols exactly. Add two APIs that matches the full
>> symbol, or only the part without .llvm.suffix. Specifically, the following
>> two APIs are added:
>>
>> 1. kallsyms_lookup_name_or_prefix()
>> 2. kallsyms_on_each_match_symbol_or_prefix()
>
> Since this API only removes the suffix, "match prefix" is a bit confusing.
> (this sounds like matching "foo" with "foo" and "foo_bar", but in reality,
> it only matches "foo" and "foo.llvm.*")
> What about the name below?
>
> kallsyms_lookup_name_without_suffix()
> kallsyms_on_each_match_symbol_without_suffix()
I am open to name suggestions. I named it as xx or prefix to highlight
that these two APIs will try match full name first, and they only match
the symbol without suffix when there is no full name match.
Maybe we can call them:
- kallsyms_lookup_name_or_without_suffix()
- kallsyms_on_each_match_symbol_or_without_suffix()
Again, I am open to any name selections here.
>
>>
>> These APIs will be used by kprobe.
>
> No other user need this?
AFACIT, kprobe is the only use case here. Sami, please correct
me if I missed any users.
More thoughts on this:
I actually hope we don't need these two new APIs, as they are
confusing. Modern compilers can do many things to the code
(inlining, etc.). So when we are tracing a function, we are not
really tracing "function in the source code". Instead, we are
tracing "function in the binary". If a function is inlined, it
will not show up in the binary. If a function is _partially_
inlined (inlined by some callers, but not by others), it will
show up in the binary, but we won't be tracing it as it appears
in the source code. Therefore, tracing functions by their names
in the source code only works under certain assumptions. And
these assumptions may not hold with modern compilers. Ideally,
I think we cannot promise the user can use name "ping_table" to
trace function "ping_table.llvm.15394922576589127018"
Does this make sense?
Thanks,
Song
[...]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 3/3] tracing/kprobes: Use APIs that matches symbols with .llvm.<hash> suffix
2024-07-30 13:04 ` Masami Hiramatsu
@ 2024-07-31 1:09 ` Song Liu
0 siblings, 0 replies; 17+ messages in thread
From: Song Liu @ 2024-07-31 1:09 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Song Liu, live-patching@vger.kernel.org, LKML,
linux-trace-kernel@vger.kernel.org, Josh Poimboeuf,
jikos@kernel.org, mbenes@suse.cz, pmladek@suse.com,
joe.lawrence@redhat.com, nathan@kernel.org, morbo@google.com,
justinstitt@google.com, mcgrof@kernel.org,
thunder.leizhen@huawei.com, kees@kernel.org, Kernel Team,
mmaurer@google.com, samitolvanen@google.com, rostedt@goodmis.org
Hi Masami,
> On Jul 30, 2024, at 6:04 AM, Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Mon, 29 Jul 2024 17:54:33 -0700
> Song Liu <song@kernel.org> wrote:
>
>> Use the new kallsyms APIs that matches symbols name with .llvm.<hash>
>> suffix. This allows userspace tools to get kprobes on the expected
>> function name, while the actual symbol has a .llvm.<hash> suffix.
>>
>
> _kprobe_addr@kernel/kprobes.c may also fail with this change.
>
Thanks for catching this! I will fix this in the next version.
Song
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] kallsyms: Add APIs to match symbol without .llmv.<hash> suffix.
2024-07-31 1:00 ` Song Liu
@ 2024-08-02 1:18 ` Leizhen (ThunderTown)
2024-08-02 3:45 ` Song Liu
2024-08-02 15:45 ` Petr Mladek
1 sibling, 1 reply; 17+ messages in thread
From: Leizhen (ThunderTown) @ 2024-08-02 1:18 UTC (permalink / raw)
To: Song Liu, Masami Hiramatsu
Cc: Song Liu, live-patching@vger.kernel.org, LKML,
linux-trace-kernel@vger.kernel.org, Josh Poimboeuf, Jiri Kosina,
Miroslav Benes, Petr Mladek, Joe Lawrence, Nathan Chancellor,
morbo@google.com, Justin Stitt, Luis Chamberlain, Leizhen,
kees@kernel.org, Kernel Team, Matthew Maurer, Sami Tolvanen,
Steven Rostedt
On 2024/7/31 9:00, Song Liu wrote:
> Hi Masami,
>
>> On Jul 30, 2024, at 6:03 AM, Masami Hiramatsu <mhiramat@kernel.org> wrote:
>>
>> On Mon, 29 Jul 2024 17:54:32 -0700
>> Song Liu <song@kernel.org> wrote:
>>
>>> With CONFIG_LTO_CLANG=y, the compiler may add suffix to function names
>>> to avoid duplication. This causes confusion with users of kallsyms.
>>> On one hand, users like livepatch are required to match the symbols
>>> exactly. On the other hand, users like kprobe would like to match to
>>> original function names.
>>>
>>> Solve this by splitting kallsyms APIs. Specifically, existing APIs now
>>> should match the symbols exactly. Add two APIs that matches the full
>>> symbol, or only the part without .llvm.suffix. Specifically, the following
>>> two APIs are added:
>>>
>>> 1. kallsyms_lookup_name_or_prefix()
>>> 2. kallsyms_on_each_match_symbol_or_prefix()
>>
>> Since this API only removes the suffix, "match prefix" is a bit confusing.
>> (this sounds like matching "foo" with "foo" and "foo_bar", but in reality,
>> it only matches "foo" and "foo.llvm.*")
>> What about the name below?
>>
>> kallsyms_lookup_name_without_suffix()
>> kallsyms_on_each_match_symbol_without_suffix()
>
> I am open to name suggestions. I named it as xx or prefix to highlight
> that these two APIs will try match full name first, and they only match
> the symbol without suffix when there is no full name match.
>
> Maybe we can call them:
> - kallsyms_lookup_name_or_without_suffix()
> - kallsyms_on_each_match_symbol_or_without_suffix()
>
> Again, I am open to any name selections here.
Only static functions have suffixes. In my opinion, explicitly marking static
might be a little clearer.
kallsyms_lookup_static_name()
kallsyms_on_each_match_static_symbol()
>
>>
>>>
>>> These APIs will be used by kprobe.
>>
>> No other user need this?
>
> AFACIT, kprobe is the only use case here. Sami, please correct
> me if I missed any users.
>
>
> More thoughts on this:
>
> I actually hope we don't need these two new APIs, as they are
> confusing. Modern compilers can do many things to the code
> (inlining, etc.). So when we are tracing a function, we are not
> really tracing "function in the source code". Instead, we are
> tracing "function in the binary". If a function is inlined, it
> will not show up in the binary. If a function is _partially_
> inlined (inlined by some callers, but not by others), it will
> show up in the binary, but we won't be tracing it as it appears
> in the source code. Therefore, tracing functions by their names
> in the source code only works under certain assumptions. And
> these assumptions may not hold with modern compilers. Ideally,
> I think we cannot promise the user can use name "ping_table" to
> trace function "ping_table.llvm.15394922576589127018"
>
> Does this make sense?
>
> Thanks,
> Song
>
>
> [...]
>
--
Regards,
Zhen Lei
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] kallsyms: Add APIs to match symbol without .llmv.<hash> suffix.
2024-08-02 1:18 ` Leizhen (ThunderTown)
@ 2024-08-02 3:45 ` Song Liu
2024-08-02 6:53 ` Leizhen (ThunderTown)
2024-08-02 13:04 ` Masami Hiramatsu
0 siblings, 2 replies; 17+ messages in thread
From: Song Liu @ 2024-08-02 3:45 UTC (permalink / raw)
To: Leizhen (ThunderTown)
Cc: Song Liu, Masami Hiramatsu, Song Liu,
live-patching@vger.kernel.org, LKML,
linux-trace-kernel@vger.kernel.org, Josh Poimboeuf, Jiri Kosina,
Miroslav Benes, Petr Mladek, Joe Lawrence, Nathan Chancellor,
morbo@google.com, Justin Stitt, Luis Chamberlain, Leizhen,
kees@kernel.org, Kernel Team, Matthew Maurer, Sami Tolvanen,
Steven Rostedt
> On Aug 1, 2024, at 6:18 PM, Leizhen (ThunderTown) <thunder.leizhen@huaweicloud.com> wrote:
>
> On 2024/7/31 9:00, Song Liu wrote:
>> Hi Masami,
>>
>>> On Jul 30, 2024, at 6:03 AM, Masami Hiramatsu <mhiramat@kernel.org> wrote:
>>>
>>> On Mon, 29 Jul 2024 17:54:32 -0700
>>> Song Liu <song@kernel.org> wrote:
>>>
>>>> With CONFIG_LTO_CLANG=y, the compiler may add suffix to function names
>>>> to avoid duplication. This causes confusion with users of kallsyms.
>>>> On one hand, users like livepatch are required to match the symbols
>>>> exactly. On the other hand, users like kprobe would like to match to
>>>> original function names.
>>>>
>>>> Solve this by splitting kallsyms APIs. Specifically, existing APIs now
>>>> should match the symbols exactly. Add two APIs that matches the full
>>>> symbol, or only the part without .llvm.suffix. Specifically, the following
>>>> two APIs are added:
>>>>
>>>> 1. kallsyms_lookup_name_or_prefix()
>>>> 2. kallsyms_on_each_match_symbol_or_prefix()
>>>
>>> Since this API only removes the suffix, "match prefix" is a bit confusing.
>>> (this sounds like matching "foo" with "foo" and "foo_bar", but in reality,
>>> it only matches "foo" and "foo.llvm.*")
>>> What about the name below?
>>>
>>> kallsyms_lookup_name_without_suffix()
>>> kallsyms_on_each_match_symbol_without_suffix()
>>
>> I am open to name suggestions. I named it as xx or prefix to highlight
>> that these two APIs will try match full name first, and they only match
>> the symbol without suffix when there is no full name match.
>>
>> Maybe we can call them:
>> - kallsyms_lookup_name_or_without_suffix()
>> - kallsyms_on_each_match_symbol_or_without_suffix()
>>
>> Again, I am open to any name selections here.
>
> Only static functions have suffixes. In my opinion, explicitly marking static
> might be a little clearer.
> kallsyms_lookup_static_name()
> kallsyms_on_each_match_static_symbol()
While these names are shorter, I think they are more confusing. Not all
folks know that only static functions can have suffixes.
Maybe we should not hide the "try match full name first first" in the
API, and let the users handle it. Then, we can safely call the new APIs
*_without_suffix(), as Masami suggested.
If there is no objections, I will send v2 based on this direction.
Thanks,
Song
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] kallsyms: Add APIs to match symbol without .llmv.<hash> suffix.
2024-08-02 3:45 ` Song Liu
@ 2024-08-02 6:53 ` Leizhen (ThunderTown)
2024-08-02 13:04 ` Masami Hiramatsu
1 sibling, 0 replies; 17+ messages in thread
From: Leizhen (ThunderTown) @ 2024-08-02 6:53 UTC (permalink / raw)
To: Song Liu
Cc: Masami Hiramatsu, Song Liu, live-patching@vger.kernel.org, LKML,
linux-trace-kernel@vger.kernel.org, Josh Poimboeuf, Jiri Kosina,
Miroslav Benes, Petr Mladek, Joe Lawrence, Nathan Chancellor,
morbo@google.com, Justin Stitt, Luis Chamberlain, Leizhen,
kees@kernel.org, Kernel Team, Matthew Maurer, Sami Tolvanen,
Steven Rostedt
On 2024/8/2 11:45, Song Liu wrote:
>
>
>> On Aug 1, 2024, at 6:18 PM, Leizhen (ThunderTown) <thunder.leizhen@huaweicloud.com> wrote:
>>
>> On 2024/7/31 9:00, Song Liu wrote:
>>> Hi Masami,
>>>
>>>> On Jul 30, 2024, at 6:03 AM, Masami Hiramatsu <mhiramat@kernel.org> wrote:
>>>>
>>>> On Mon, 29 Jul 2024 17:54:32 -0700
>>>> Song Liu <song@kernel.org> wrote:
>>>>
>>>>> With CONFIG_LTO_CLANG=y, the compiler may add suffix to function names
>>>>> to avoid duplication. This causes confusion with users of kallsyms.
>>>>> On one hand, users like livepatch are required to match the symbols
>>>>> exactly. On the other hand, users like kprobe would like to match to
>>>>> original function names.
>>>>>
>>>>> Solve this by splitting kallsyms APIs. Specifically, existing APIs now
>>>>> should match the symbols exactly. Add two APIs that matches the full
>>>>> symbol, or only the part without .llvm.suffix. Specifically, the following
>>>>> two APIs are added:
>>>>>
>>>>> 1. kallsyms_lookup_name_or_prefix()
>>>>> 2. kallsyms_on_each_match_symbol_or_prefix()
>>>>
>>>> Since this API only removes the suffix, "match prefix" is a bit confusing.
>>>> (this sounds like matching "foo" with "foo" and "foo_bar", but in reality,
>>>> it only matches "foo" and "foo.llvm.*")
>>>> What about the name below?
>>>>
>>>> kallsyms_lookup_name_without_suffix()
>>>> kallsyms_on_each_match_symbol_without_suffix()
>>>
>>> I am open to name suggestions. I named it as xx or prefix to highlight
>>> that these two APIs will try match full name first, and they only match
>>> the symbol without suffix when there is no full name match.
>>>
>>> Maybe we can call them:
>>> - kallsyms_lookup_name_or_without_suffix()
>>> - kallsyms_on_each_match_symbol_or_without_suffix()
>>>
>>> Again, I am open to any name selections here.
>>
>> Only static functions have suffixes. In my opinion, explicitly marking static
>> might be a little clearer.
>> kallsyms_lookup_static_name()
>> kallsyms_on_each_match_static_symbol()
>
> While these names are shorter, I think they are more confusing. Not all
> folks know that only static functions can have suffixes.
>
> Maybe we should not hide the "try match full name first first" in the
> API, and let the users handle it. Then, we can safely call the new APIs
> *_without_suffix(), as Masami suggested.
Yes, that would be clearer.
>
> If there is no objections, I will send v2 based on this direction.
>
> Thanks,
> Song
>
--
Regards,
Zhen Lei
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] kallsyms: Add APIs to match symbol without .llmv.<hash> suffix.
2024-08-02 3:45 ` Song Liu
2024-08-02 6:53 ` Leizhen (ThunderTown)
@ 2024-08-02 13:04 ` Masami Hiramatsu
1 sibling, 0 replies; 17+ messages in thread
From: Masami Hiramatsu @ 2024-08-02 13:04 UTC (permalink / raw)
To: Song Liu
Cc: Leizhen (ThunderTown), Masami Hiramatsu, Song Liu,
live-patching@vger.kernel.org, LKML,
linux-trace-kernel@vger.kernel.org, Josh Poimboeuf, Jiri Kosina,
Miroslav Benes, Petr Mladek, Joe Lawrence, Nathan Chancellor,
morbo@google.com, Justin Stitt, Luis Chamberlain, Leizhen,
kees@kernel.org, Kernel Team, Matthew Maurer, Sami Tolvanen,
Steven Rostedt
Hi,
Sorry for late reply.
On Fri, 2 Aug 2024 03:45:42 +0000
Song Liu <songliubraving@meta.com> wrote:
>
>
> > On Aug 1, 2024, at 6:18 PM, Leizhen (ThunderTown) <thunder.leizhen@huaweicloud.com> wrote:
> >
> > On 2024/7/31 9:00, Song Liu wrote:
> >> Hi Masami,
> >>
> >>> On Jul 30, 2024, at 6:03 AM, Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >>>
> >>> On Mon, 29 Jul 2024 17:54:32 -0700
> >>> Song Liu <song@kernel.org> wrote:
> >>>
> >>>> With CONFIG_LTO_CLANG=y, the compiler may add suffix to function names
> >>>> to avoid duplication. This causes confusion with users of kallsyms.
> >>>> On one hand, users like livepatch are required to match the symbols
> >>>> exactly. On the other hand, users like kprobe would like to match to
> >>>> original function names.
> >>>>
> >>>> Solve this by splitting kallsyms APIs. Specifically, existing APIs now
> >>>> should match the symbols exactly. Add two APIs that matches the full
> >>>> symbol, or only the part without .llvm.suffix. Specifically, the following
> >>>> two APIs are added:
> >>>>
> >>>> 1. kallsyms_lookup_name_or_prefix()
> >>>> 2. kallsyms_on_each_match_symbol_or_prefix()
> >>>
> >>> Since this API only removes the suffix, "match prefix" is a bit confusing.
> >>> (this sounds like matching "foo" with "foo" and "foo_bar", but in reality,
> >>> it only matches "foo" and "foo.llvm.*")
> >>> What about the name below?
> >>>
> >>> kallsyms_lookup_name_without_suffix()
> >>> kallsyms_on_each_match_symbol_without_suffix()
> >>
> >> I am open to name suggestions. I named it as xx or prefix to highlight
> >> that these two APIs will try match full name first, and they only match
> >> the symbol without suffix when there is no full name match.
> >>
> >> Maybe we can call them:
> >> - kallsyms_lookup_name_or_without_suffix()
> >> - kallsyms_on_each_match_symbol_or_without_suffix()
No, I think "_or_without_suffix" is a bit complicated. If we have
0x1234 "foo.llvm.XXX"
0x2356 "bar"
and user calls
kallsyms_lookup_name_without_suffix("bar");
This returns "bar"'s address(0x2356). Also,
kallsyms_lookup_name_without_suffix("foo");
this will returns 0x1234. These commonly just ignore the suffix.
The difference of kallsyms_lookup_name() is that point.
> >>
> >> Again, I am open to any name selections here.
> >
> > Only static functions have suffixes. In my opinion, explicitly marking static
> > might be a little clearer.
> > kallsyms_lookup_static_name()
> > kallsyms_on_each_match_static_symbol()
But this is not correctly check the symbol is static or not. If you
check the symbol is really static, it is OK. (return NULL if the symbol
is global.)
Thank you,
>
> While these names are shorter, I think they are more confusing. Not all
> folks know that only static functions can have suffixes.
>
> Maybe we should not hide the "try match full name first first" in the
> API, and let the users handle it. Then, we can safely call the new APIs
> *_without_suffix(), as Masami suggested.
>
> If there is no objections, I will send v2 based on this direction.
>
> Thanks,
> Song
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] kallsyms: Add APIs to match symbol without .llmv.<hash> suffix.
2024-07-31 1:00 ` Song Liu
2024-08-02 1:18 ` Leizhen (ThunderTown)
@ 2024-08-02 15:45 ` Petr Mladek
2024-08-02 17:09 ` Song Liu
1 sibling, 1 reply; 17+ messages in thread
From: Petr Mladek @ 2024-08-02 15:45 UTC (permalink / raw)
To: Song Liu
Cc: Masami Hiramatsu, Song Liu, live-patching@vger.kernel.org, LKML,
linux-trace-kernel@vger.kernel.org, Josh Poimboeuf, Jiri Kosina,
Miroslav Benes, Joe Lawrence, Nathan Chancellor, morbo@google.com,
Justin Stitt, Luis Chamberlain, Leizhen, kees@kernel.org,
Kernel Team, Matthew Maurer, Sami Tolvanen, Steven Rostedt
On Wed 2024-07-31 01:00:34, Song Liu wrote:
> Hi Masami,
>
> > On Jul 30, 2024, at 6:03 AM, Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Mon, 29 Jul 2024 17:54:32 -0700
> > Song Liu <song@kernel.org> wrote:
> >
> >> With CONFIG_LTO_CLANG=y, the compiler may add suffix to function names
> >> to avoid duplication. This causes confusion with users of kallsyms.
> >> On one hand, users like livepatch are required to match the symbols
> >> exactly. On the other hand, users like kprobe would like to match to
> >> original function names.
> >>
> >> Solve this by splitting kallsyms APIs. Specifically, existing APIs now
> >> should match the symbols exactly. Add two APIs that matches the full
> >> symbol, or only the part without .llvm.suffix. Specifically, the following
> >> two APIs are added:
> >>
> >> 1. kallsyms_lookup_name_or_prefix()
> >> 2. kallsyms_on_each_match_symbol_or_prefix()
> >
> > Since this API only removes the suffix, "match prefix" is a bit confusing.
> > (this sounds like matching "foo" with "foo" and "foo_bar", but in reality,
> > it only matches "foo" and "foo.llvm.*")
> > What about the name below?
> >
> > kallsyms_lookup_name_without_suffix()
> > kallsyms_on_each_match_symbol_without_suffix()
This looks like the best variant to me. A reasonable compromise.
> >> These APIs will be used by kprobe.
> >
> > No other user need this?
>
> AFACIT, kprobe is the only use case here. Sami, please correct
> me if I missed any users.
>
>
> More thoughts on this:
>
> I actually hope we don't need these two new APIs, as they are
> confusing. Modern compilers can do many things to the code
> (inlining, etc.). So when we are tracing a function, we are not
> really tracing "function in the source code". Instead, we are
> tracing "function in the binary". If a function is inlined, it
> will not show up in the binary. If a function is _partially_
> inlined (inlined by some callers, but not by others), it will
> show up in the binary, but we won't be tracing it as it appears
> in the source code. Therefore, tracing functions by their names
> in the source code only works under certain assumptions. And
> these assumptions may not hold with modern compilers. Ideally,
> I think we cannot promise the user can use name "ping_table" to
> trace function "ping_table.llvm.15394922576589127018"
>
> Does this make sense?
IMHO, it depends on the use case. Let's keep "ping_table/"
as an example. Why people would want to trace this function?
There might be various reasons, for example:
1. ping_table.llvm.15394922576589127018 appeared in a backtrace
2. ping_table.llvm.15394922576589127018 appeared in a histogram
3. ping_table looks interesting when reading code sources
4. ping_table need to be monitored on all systems because
of security/performance.
The full name "ping_table.llvm.15394922576589127018" is perfectly
fine in the 1st and 2nd scenario. People knew this name already
before they start thinking about tracing.
The short name is more practical in 3rd and 4th scenario. Especially,
when there is only one static symbol with this short name. Otherwise,
the user would need an extra step to find the full name.
The full name is even more problematic for system monitors. These
applications might need to probe particular symbols. They might
have hard times when the symbol is:
<symbol_name_from_sources>.<random_suffix_generated_by_compiler>
They will have to deal with it. But it means that every such tool
would need an extra (non-trivial) code for this. Every tool would
try its own approach => a lot of problems.
IMHO, the two APIs could make the life easier.
Well, even kprobe might need two APIs to allow probing by
full name or without the suffix.
Best Regards,
Petr
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] kallsyms: Add APIs to match symbol without .llmv.<hash> suffix.
2024-08-02 15:45 ` Petr Mladek
@ 2024-08-02 17:09 ` Song Liu
2024-08-05 12:53 ` Masami Hiramatsu
0 siblings, 1 reply; 17+ messages in thread
From: Song Liu @ 2024-08-02 17:09 UTC (permalink / raw)
To: Petr Mladek
Cc: Song Liu, Masami Hiramatsu, Song Liu,
live-patching@vger.kernel.org, LKML,
linux-trace-kernel@vger.kernel.org, Josh Poimboeuf, Jiri Kosina,
Miroslav Benes, Joe Lawrence, Nathan Chancellor, morbo@google.com,
Justin Stitt, Luis Chamberlain, Leizhen, kees@kernel.org,
Kernel Team, Matthew Maurer, Sami Tolvanen, Steven Rostedt
Hi Petr,
> On Aug 2, 2024, at 8:45 AM, Petr Mladek <pmladek@suse.com> wrote:
[...]
>
> IMHO, it depends on the use case. Let's keep "ping_table/"
> as an example. Why people would want to trace this function?
> There might be various reasons, for example:
>
> 1. ping_table.llvm.15394922576589127018 appeared in a backtrace
>
> 2. ping_table.llvm.15394922576589127018 appeared in a histogram
>
> 3. ping_table looks interesting when reading code sources
>
> 4. ping_table need to be monitored on all systems because
> of security/performance.
>
> The full name "ping_table.llvm.15394922576589127018" is perfectly
> fine in the 1st and 2nd scenario. People knew this name already
> before they start thinking about tracing.
>
> The short name is more practical in 3rd and 4th scenario. Especially,
> when there is only one static symbol with this short name. Otherwise,
> the user would need an extra step to find the full name.
>
> The full name is even more problematic for system monitors. These
> applications might need to probe particular symbols. They might
> have hard times when the symbol is:
>
> <symbol_name_from_sources>.<random_suffix_generated_by_compiler>
>
> They will have to deal with it. But it means that every such tool
> would need an extra (non-trivial) code for this. Every tool would
> try its own approach => a lot of problems.
>
> IMHO, the two APIs could make the life easier.
>
> Well, even kprobe might need two APIs to allow probing by
> full name or without the suffix.
The problem is, with potential partial inlining by modern compilers,
tracing "symbol name from sources" is not accurate. In our production
kernels, we have to add some explicit "noline" to make sure we can
trace these functions reliably.
Of course, this issue exists without random suffix: any function
could be partially inlined. However, allowing tracing without the
suffix seems to hint that tracing with "symbol name from sources"
is valid, which is not really true.
At the moment, I have no objections to keep the _without_suffix
APIs. But for long term, I still think we need to set clear
expectations for the users: tracing symbols from sources is not
reliable.
Thanks,
Song
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] kallsyms: Add APIs to match symbol without .llmv.<hash> suffix.
2024-07-30 0:54 ` [PATCH 2/3] kallsyms: Add APIs to match symbol without .llmv.<hash> suffix Song Liu
2024-07-30 13:03 ` Masami Hiramatsu
@ 2024-08-02 17:16 ` Song Liu
1 sibling, 0 replies; 17+ messages in thread
From: Song Liu @ 2024-08-02 17:16 UTC (permalink / raw)
To: Song Liu
Cc: live-patching@vger.kernel.org, LKML,
linux-trace-kernel@vger.kernel.org, Josh Poimboeuf, Jiri Kosina,
Miroslav Benes, pmladek@suse.com, joe.lawrence@redhat.com,
nathan@kernel.org, morbo@google.com, justinstitt@google.com,
mcgrof@kernel.org, thunder.leizhen@huawei.com, kees@kernel.org,
Kernel Team, mmaurer@google.com, samitolvanen@google.com,
mhiramat@kernel.org, rostedt@goodmis.org
> On Jul 29, 2024, at 5:54 PM, Song Liu <song@kernel.org> wrote:
>
> With CONFIG_LTO_CLANG=y, the compiler may add suffix to function names
> to avoid duplication. This causes confusion with users of kallsyms.
> On one hand, users like livepatch are required to match the symbols
> exactly. On the other hand, users like kprobe would like to match to
> original function names.
>
> Solve this by splitting kallsyms APIs. Specifically, existing APIs now
> should match the symbols exactly. Add two APIs that matches the full
> symbol, or only the part without .llvm.suffix. Specifically, the following
> two APIs are added:
>
> 1. kallsyms_lookup_name_or_prefix()
> 2. kallsyms_on_each_match_symbol_or_prefix()
>
> These APIs will be used by kprobe.
>
> Also cleanup some code and adjust kallsyms_selftests accordingly.
>
> Signed-off-by: Song Liu <song@kernel.org>
Actually, if we only remove .llvm.<hash> suffix, but keep other .XXX
suffix, the *_without_suffx APIs will have the same issue Yonghong
tried to fix in commit 33f0467fe06934d5e4ea6e24ce2b9c65ce618e26:
binary search with symbols - .llvm.<hash> suffix is not correct.
(Please see the commit log of 33f0467fe06934d5e4ea6e24ce2b9c65ce618e26
for more details.)
I am updating the code to remove all .XXX suffix. This design will
not have this issue.
Thanks,
Song
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] kallsyms: Add APIs to match symbol without .llmv.<hash> suffix.
2024-08-02 17:09 ` Song Liu
@ 2024-08-05 12:53 ` Masami Hiramatsu
0 siblings, 0 replies; 17+ messages in thread
From: Masami Hiramatsu @ 2024-08-05 12:53 UTC (permalink / raw)
To: Song Liu
Cc: Petr Mladek, Masami Hiramatsu, Song Liu,
live-patching@vger.kernel.org, LKML,
linux-trace-kernel@vger.kernel.org, Josh Poimboeuf, Jiri Kosina,
Miroslav Benes, Joe Lawrence, Nathan Chancellor, morbo@google.com,
Justin Stitt, Luis Chamberlain, Leizhen, kees@kernel.org,
Kernel Team, Matthew Maurer, Sami Tolvanen, Steven Rostedt
On Fri, 2 Aug 2024 17:09:12 +0000
Song Liu <songliubraving@meta.com> wrote:
> Hi Petr,
>
> > On Aug 2, 2024, at 8:45 AM, Petr Mladek <pmladek@suse.com> wrote:
>
> [...]
>
> >
> > IMHO, it depends on the use case. Let's keep "ping_table/"
> > as an example. Why people would want to trace this function?
> > There might be various reasons, for example:
> >
> > 1. ping_table.llvm.15394922576589127018 appeared in a backtrace
> >
> > 2. ping_table.llvm.15394922576589127018 appeared in a histogram
> >
> > 3. ping_table looks interesting when reading code sources
> >
> > 4. ping_table need to be monitored on all systems because
> > of security/performance.
> >
> > The full name "ping_table.llvm.15394922576589127018" is perfectly
> > fine in the 1st and 2nd scenario. People knew this name already
> > before they start thinking about tracing.
> >
> > The short name is more practical in 3rd and 4th scenario. Especially,
> > when there is only one static symbol with this short name. Otherwise,
> > the user would need an extra step to find the full name.
> >
> > The full name is even more problematic for system monitors. These
> > applications might need to probe particular symbols. They might
> > have hard times when the symbol is:
> >
> > <symbol_name_from_sources>.<random_suffix_generated_by_compiler>
> >
> > They will have to deal with it. But it means that every such tool
> > would need an extra (non-trivial) code for this. Every tool would
> > try its own approach => a lot of problems.
> >
> > IMHO, the two APIs could make the life easier.
> >
> > Well, even kprobe might need two APIs to allow probing by
> > full name or without the suffix.
>
> The problem is, with potential partial inlining by modern compilers,
> tracing "symbol name from sources" is not accurate. In our production
> kernels, we have to add some explicit "noline" to make sure we can
> trace these functions reliably.
>
> Of course, this issue exists without random suffix: any function
> could be partially inlined. However, allowing tracing without the
> suffix seems to hint that tracing with "symbol name from sources"
> is valid, which is not really true.
>
> At the moment, I have no objections to keep the _without_suffix
> APIs. But for long term, I still think we need to set clear
> expectations for the users: tracing symbols from sources is not
> reliable.
OK, I understand this part. I agree the problem. Even if the symbol
is unique on kallsyms (without suffix), it may have a suffix and is
not correct function entry.
I think to solve this issue, we need a better DWARF, or add a symbol
suffix like;
https://lkml.org/lkml/2023/12/4/1535
Thank you,
>
> Thanks,
> Song
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2024-08-05 12:53 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-30 0:54 [PATCH 0/3] Fix kallsyms with CONFIG_LTO_CLANG Song Liu
2024-07-30 0:54 ` [PATCH 1/3] kallsyms: Do not cleanup .llvm.<hash> suffix before sorting symbols Song Liu
2024-07-30 0:54 ` [PATCH 2/3] kallsyms: Add APIs to match symbol without .llmv.<hash> suffix Song Liu
2024-07-30 13:03 ` Masami Hiramatsu
2024-07-31 1:00 ` Song Liu
2024-08-02 1:18 ` Leizhen (ThunderTown)
2024-08-02 3:45 ` Song Liu
2024-08-02 6:53 ` Leizhen (ThunderTown)
2024-08-02 13:04 ` Masami Hiramatsu
2024-08-02 15:45 ` Petr Mladek
2024-08-02 17:09 ` Song Liu
2024-08-05 12:53 ` Masami Hiramatsu
2024-08-02 17:16 ` Song Liu
2024-07-30 0:54 ` [PATCH 3/3] tracing/kprobes: Use APIs that matches symbols with .llvm.<hash> suffix Song Liu
2024-07-30 13:04 ` Masami Hiramatsu
2024-07-31 1:09 ` Song Liu
2024-07-30 5:36 ` [PATCH 0/3] Fix kallsyms with CONFIG_LTO_CLANG Song Liu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).