* [RFC dwarves] btf_encoder: Remove duplicates from functions entries @ 2025-07-17 15:25 Jiri Olsa 2025-07-21 11:41 ` Alan Maguire 0 siblings, 1 reply; 12+ messages in thread From: Jiri Olsa @ 2025-07-17 15:25 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Menglong Dong, dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Song Liu, Alan Maguire, Eduard Zingerman, Ihor Solodrai Menglong reported issue where we can have function in BTF which has multiple addresses in kallsysm [1]. Rather than filtering this in runtime, let's teach pahole to remove such functions. Removing duplicate records from functions entries that have more at least one different address. This way btf_encoder__find_function won't find such functions and they won't be added in BTF. In my setup it removed 428 functions out of 77141. [1] https://lore.kernel.org/bpf/20250710070835.260831-1-dongml2@chinatelecom.cn/ Reported-by: Menglong Dong <menglong8.dong@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> --- Alan, I'd like to test this in the pahole CI, is there a way to manualy trigger it? thanks, jirka --- btf_encoder.c | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+) diff --git a/btf_encoder.c b/btf_encoder.c index 16739066caae..a25fe2f8bfb1 100644 --- a/btf_encoder.c +++ b/btf_encoder.c @@ -99,6 +99,7 @@ struct elf_function { size_t prefixlen; bool kfunc; uint32_t kfunc_flags; + unsigned long addr; }; struct elf_secinfo { @@ -1469,6 +1470,7 @@ static void elf_functions__collect_function(struct elf_functions *functions, GEl func = &functions->entries[functions->cnt]; func->name = name; + func->addr = sym->st_value; if (strchr(name, '.')) { const char *suffix = strchr(name, '.'); @@ -2143,6 +2145,40 @@ int btf_encoder__encode(struct btf_encoder *encoder, struct conf_load *conf) return err; } +/* + * Remove name duplicates from functions->entries that have + * at least 2 different addresses. + */ +static void functions_remove_dups(struct elf_functions *functions) +{ + struct elf_function *n = &functions->entries[0]; + bool matched = false, diff = false; + int i, j; + + for (i = 0, j = 1; i < functions->cnt && j < functions->cnt; i++, j++) { + struct elf_function *a = &functions->entries[i]; + struct elf_function *b = &functions->entries[j]; + + if (!strcmp(a->name, b->name)) { + matched = true; + diff |= a->addr != b->addr; + continue; + } + + /* + * Keep only not-matched entries and last one of the matched/duplicates + * ones if all of the matched entries had the same address. + **/ + if (!matched || !diff) + *n++ = *a; + matched = diff = false; + } + + if (!matched || !diff) + *n++ = functions->entries[functions->cnt - 1]; + functions->cnt = n - &functions->entries[0]; +} + static int elf_functions__collect(struct elf_functions *functions) { uint32_t nr_symbols = elf_symtab__nr_symbols(functions->symtab); @@ -2168,6 +2204,7 @@ static int elf_functions__collect(struct elf_functions *functions) if (functions->cnt) { qsort(functions->entries, functions->cnt, sizeof(*functions->entries), functions_cmp); + functions_remove_dups(functions); } else { err = 0; goto out_free; -- 2.50.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [RFC dwarves] btf_encoder: Remove duplicates from functions entries 2025-07-17 15:25 [RFC dwarves] btf_encoder: Remove duplicates from functions entries Jiri Olsa @ 2025-07-21 11:41 ` Alan Maguire 2025-07-21 14:27 ` Jiri Olsa 0 siblings, 1 reply; 12+ messages in thread From: Alan Maguire @ 2025-07-21 11:41 UTC (permalink / raw) To: Jiri Olsa, Arnaldo Carvalho de Melo Cc: Menglong Dong, dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Song Liu, Eduard Zingerman, Ihor Solodrai On 17/07/2025 16:25, Jiri Olsa wrote: > Menglong reported issue where we can have function in BTF which has > multiple addresses in kallsysm [1]. > > Rather than filtering this in runtime, let's teach pahole to remove > such functions. > > Removing duplicate records from functions entries that have more > at least one different address. This way btf_encoder__find_function > won't find such functions and they won't be added in BTF. > > In my setup it removed 428 functions out of 77141. > Is such removal necessary? If the presence of an mcount annotation is the requirement, couldn't we just utilize /sys/kernel/tracing/available_filter_functions_addrs to map name to address safely? It identifies mcount-containing functions and some of these appear to be duplicates, for example there I see ffffffff8376e8b4 acpi_attr_is_visible ffffffff8379b7d4 acpi_attr_is_visible ? > [1] https://lore.kernel.org/bpf/20250710070835.260831-1-dongml2@chinatelecom.cn/ > Reported-by: Menglong Dong <menglong8.dong@gmail.com> > Signed-off-by: Jiri Olsa <jolsa@kernel.org> > --- > > Alan, > I'd like to test this in the pahole CI, is there a way to manualy trigger it? > Easiest way is to base from pahole's next branch and push to a github repo; the tests will run as actions there. I've just merged the function comparison work so that will be available if you base/sync a branch on next from git.kernel.org/pub/scm/devel/pahole/pahole.git/ . Thanks! Alan > thanks, > jirka > > > --- > btf_encoder.c | 37 +++++++++++++++++++++++++++++++++++++ > 1 file changed, 37 insertions(+) > > diff --git a/btf_encoder.c b/btf_encoder.c > index 16739066caae..a25fe2f8bfb1 100644 > --- a/btf_encoder.c > +++ b/btf_encoder.c > @@ -99,6 +99,7 @@ struct elf_function { > size_t prefixlen; > bool kfunc; > uint32_t kfunc_flags; > + unsigned long addr; > }; > > struct elf_secinfo { > @@ -1469,6 +1470,7 @@ static void elf_functions__collect_function(struct elf_functions *functions, GEl > > func = &functions->entries[functions->cnt]; > func->name = name; > + func->addr = sym->st_value; > if (strchr(name, '.')) { > const char *suffix = strchr(name, '.'); > > @@ -2143,6 +2145,40 @@ int btf_encoder__encode(struct btf_encoder *encoder, struct conf_load *conf) > return err; > } > > +/* > + * Remove name duplicates from functions->entries that have > + * at least 2 different addresses. > + */ > +static void functions_remove_dups(struct elf_functions *functions) > +{ > + struct elf_function *n = &functions->entries[0]; > + bool matched = false, diff = false; > + int i, j; > + > + for (i = 0, j = 1; i < functions->cnt && j < functions->cnt; i++, j++) { > + struct elf_function *a = &functions->entries[i]; > + struct elf_function *b = &functions->entries[j]; > + > + if (!strcmp(a->name, b->name)) { > + matched = true; > + diff |= a->addr != b->addr; > + continue; > + } > + > + /* > + * Keep only not-matched entries and last one of the matched/duplicates > + * ones if all of the matched entries had the same address. > + **/ > + if (!matched || !diff) > + *n++ = *a; > + matched = diff = false; > + } > + > + if (!matched || !diff) > + *n++ = functions->entries[functions->cnt - 1]; > + functions->cnt = n - &functions->entries[0]; > +} > + > static int elf_functions__collect(struct elf_functions *functions) > { > uint32_t nr_symbols = elf_symtab__nr_symbols(functions->symtab); > @@ -2168,6 +2204,7 @@ static int elf_functions__collect(struct elf_functions *functions) > > if (functions->cnt) { > qsort(functions->entries, functions->cnt, sizeof(*functions->entries), functions_cmp); > + functions_remove_dups(functions); > } else { > err = 0; > goto out_free; ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC dwarves] btf_encoder: Remove duplicates from functions entries 2025-07-21 11:41 ` Alan Maguire @ 2025-07-21 14:27 ` Jiri Olsa 2025-07-21 14:32 ` Nick Alcock 2025-07-21 23:27 ` Ihor Solodrai 0 siblings, 2 replies; 12+ messages in thread From: Jiri Olsa @ 2025-07-21 14:27 UTC (permalink / raw) To: Alan Maguire Cc: Arnaldo Carvalho de Melo, Menglong Dong, dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Song Liu, Eduard Zingerman, Ihor Solodrai On Mon, Jul 21, 2025 at 12:41:00PM +0100, Alan Maguire wrote: > On 17/07/2025 16:25, Jiri Olsa wrote: > > Menglong reported issue where we can have function in BTF which has > > multiple addresses in kallsysm [1]. > > > > Rather than filtering this in runtime, let's teach pahole to remove > > such functions. > > > > Removing duplicate records from functions entries that have more > > at least one different address. This way btf_encoder__find_function > > won't find such functions and they won't be added in BTF. > > > > In my setup it removed 428 functions out of 77141. > > > > Is such removal necessary? If the presence of an mcount annotation is > the requirement, couldn't we just utilize > > /sys/kernel/tracing/available_filter_functions_addrs > > to map name to address safely? It identifies mcount-containing functions > and some of these appear to be duplicates, for example there I see > > ffffffff8376e8b4 acpi_attr_is_visible > ffffffff8379b7d4 acpi_attr_is_visible for that we'd need new interface for loading fentry/fexit.. programs, right? the current interface to get fentry/fexit.. attach address is: - user specifies function name, that translates to btf_id - in kernel that btf id translates back to function name - kernel uses kallsyms_lookup_name or find_kallsyms_symbol_value to get the address so we don't really know which address user wanted in the first place I think we discussed this issue some time ago, but I'm not sure what the proposal was at the end (function address stored in BTF?) this change is to make sure there are no duplicate functions in BTF that could cause this confusion.. rather than bad result, let's deny the attach for such function jirka > > ? > > > [1] https://lore.kernel.org/bpf/20250710070835.260831-1-dongml2@chinatelecom.cn/ > > Reported-by: Menglong Dong <menglong8.dong@gmail.com> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org> > > --- > > > > Alan, > > I'd like to test this in the pahole CI, is there a way to manualy trigger it? > > > > Easiest way is to base from pahole's next branch and push to a github > repo; the tests will run as actions there. I've just merged the function > comparison work so that will be available if you base/sync a branch on > next from git.kernel.org/pub/scm/devel/pahole/pahole.git/ . Thanks! > > Alan > > > > thanks, > > jirka > > > > > > --- > > btf_encoder.c | 37 +++++++++++++++++++++++++++++++++++++ > > 1 file changed, 37 insertions(+) > > > > diff --git a/btf_encoder.c b/btf_encoder.c > > index 16739066caae..a25fe2f8bfb1 100644 > > --- a/btf_encoder.c > > +++ b/btf_encoder.c > > @@ -99,6 +99,7 @@ struct elf_function { > > size_t prefixlen; > > bool kfunc; > > uint32_t kfunc_flags; > > + unsigned long addr; > > }; > > > > struct elf_secinfo { > > @@ -1469,6 +1470,7 @@ static void elf_functions__collect_function(struct elf_functions *functions, GEl > > > > func = &functions->entries[functions->cnt]; > > func->name = name; > > + func->addr = sym->st_value; > > if (strchr(name, '.')) { > > const char *suffix = strchr(name, '.'); > > > > @@ -2143,6 +2145,40 @@ int btf_encoder__encode(struct btf_encoder *encoder, struct conf_load *conf) > > return err; > > } > > > > +/* > > + * Remove name duplicates from functions->entries that have > > + * at least 2 different addresses. > > + */ > > +static void functions_remove_dups(struct elf_functions *functions) > > +{ > > + struct elf_function *n = &functions->entries[0]; > > + bool matched = false, diff = false; > > + int i, j; > > + > > + for (i = 0, j = 1; i < functions->cnt && j < functions->cnt; i++, j++) { > > + struct elf_function *a = &functions->entries[i]; > > + struct elf_function *b = &functions->entries[j]; > > + > > + if (!strcmp(a->name, b->name)) { > > + matched = true; > > + diff |= a->addr != b->addr; > > + continue; > > + } > > + > > + /* > > + * Keep only not-matched entries and last one of the matched/duplicates > > + * ones if all of the matched entries had the same address. > > + **/ > > + if (!matched || !diff) > > + *n++ = *a; > > + matched = diff = false; > > + } > > + > > + if (!matched || !diff) > > + *n++ = functions->entries[functions->cnt - 1]; > > + functions->cnt = n - &functions->entries[0]; > > +} > > + > > static int elf_functions__collect(struct elf_functions *functions) > > { > > uint32_t nr_symbols = elf_symtab__nr_symbols(functions->symtab); > > @@ -2168,6 +2204,7 @@ static int elf_functions__collect(struct elf_functions *functions) > > > > if (functions->cnt) { > > qsort(functions->entries, functions->cnt, sizeof(*functions->entries), functions_cmp); > > + functions_remove_dups(functions); > > } else { > > err = 0; > > goto out_free; > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC dwarves] btf_encoder: Remove duplicates from functions entries 2025-07-21 14:27 ` Jiri Olsa @ 2025-07-21 14:32 ` Nick Alcock 2025-07-21 23:27 ` Ihor Solodrai 1 sibling, 0 replies; 12+ messages in thread From: Nick Alcock @ 2025-07-21 14:32 UTC (permalink / raw) To: Jiri Olsa Cc: Alan Maguire, Arnaldo Carvalho de Melo, Menglong Dong, dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Song Liu, Eduard Zingerman, Ihor Solodrai On 21 Jul 2025, Jiri Olsa verbalised: > On Mon, Jul 21, 2025 at 12:41:00PM +0100, Alan Maguire wrote: >> On 17/07/2025 16:25, Jiri Olsa wrote: >> > Menglong reported issue where we can have function in BTF which has >> > multiple addresses in kallsysm [1]. >> > >> > Rather than filtering this in runtime, let's teach pahole to remove >> > such functions. >> > >> > Removing duplicate records from functions entries that have more >> > at least one different address. This way btf_encoder__find_function >> > won't find such functions and they won't be added in BTF. >> > >> > In my setup it removed 428 functions out of 77141. >> > >> >> Is such removal necessary? If the presence of an mcount annotation is >> the requirement, couldn't we just utilize >> >> /sys/kernel/tracing/available_filter_functions_addrs >> >> to map name to address safely? It identifies mcount-containing functions >> and some of these appear to be duplicates, for example there I see >> >> ffffffff8376e8b4 acpi_attr_is_visible >> ffffffff8379b7d4 acpi_attr_is_visible > > for that we'd need new interface for loading fentry/fexit.. programs, right? > > the current interface to get fentry/fexit.. attach address is: > - user specifies function name, that translates to btf_id > - in kernel that btf id translates back to function name > - kernel uses kallsyms_lookup_name or find_kallsyms_symbol_value > to get the address > > so we don't really know which address user wanted in the first place > > I think we discussed this issue some time ago, but I'm not sure what > the proposal was at the end (function address stored in BTF?) Function address, translation unit name, *some* disambiguator. Really both seem like they might be useful in different situations. -- NULL && (void) ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC dwarves] btf_encoder: Remove duplicates from functions entries 2025-07-21 14:27 ` Jiri Olsa 2025-07-21 14:32 ` Nick Alcock @ 2025-07-21 23:27 ` Ihor Solodrai 2025-07-22 10:45 ` Alan Maguire 2025-07-22 10:54 ` Jiri Olsa 1 sibling, 2 replies; 12+ messages in thread From: Ihor Solodrai @ 2025-07-21 23:27 UTC (permalink / raw) To: Jiri Olsa, Alan Maguire Cc: Arnaldo Carvalho de Melo, Menglong Dong, dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Song Liu, Eduard Zingerman On 7/21/25 7:27 AM, Jiri Olsa wrote: > On Mon, Jul 21, 2025 at 12:41:00PM +0100, Alan Maguire wrote: >> On 17/07/2025 16:25, Jiri Olsa wrote: >>> Menglong reported issue where we can have function in BTF which has >>> multiple addresses in kallsysm [1]. >>> >>> Rather than filtering this in runtime, let's teach pahole to remove >>> such functions. >>> >>> Removing duplicate records from functions entries that have more >>> at least one different address. This way btf_encoder__find_function >>> won't find such functions and they won't be added in BTF. >>> >>> In my setup it removed 428 functions out of 77141. >>> >> >> Is such removal necessary? If the presence of an mcount annotation is >> the requirement, couldn't we just utilize >> >> /sys/kernel/tracing/available_filter_functions_addrs >> >> to map name to address safely? It identifies mcount-containing functions >> and some of these appear to be duplicates, for example there I see >> >> ffffffff8376e8b4 acpi_attr_is_visible >> ffffffff8379b7d4 acpi_attr_is_visible > > for that we'd need new interface for loading fentry/fexit.. programs, right? > > the current interface to get fentry/fexit.. attach address is: > - user specifies function name, that translates to btf_id > - in kernel that btf id translates back to function name > - kernel uses kallsyms_lookup_name or find_kallsyms_symbol_value > to get the address > > so we don't really know which address user wanted in the first place Hi Jiri, Alan. I stumbled on a bug today which seems to be relevant to this discussion. I tried building the kernel with KASAN and UBSAN, and that resulted in some kfuncs disappearing from vmlinux.h, triggering selftests/bpf compilation errors, for example: CLNG-BPF [test_progs-no_alu32] cgroup_read_xattr.bpf.o progs/cgroup_read_xattr.c:127:13: error: call to undeclared function 'bpf_cgroup_ancestor'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 127 | ancestor = bpf_cgroup_ancestor(cgrp, 1); | ^ Here is a piece of vmlinux.h diff between CONFIG_UBSAN=y/n: --- ./tools/testing/selftests/bpf/tools/include/vmlinux.h 2025-07-21 17:35:14.415733105 +0000 +++ ubsan_vmlinux.h 2025-07-21 17:33:10.455312623 +0000 @@ -117203,7 +117292,6 @@ extern int bpf_arena_reserve_pages(void *p__map, void __attribute__((address_space(1))) *ptr__ign, u32 page_cnt) __weak __ksym; extern __bpf_fastcall void *bpf_cast_to_kern_ctx(void *obj) __weak __ksym; extern struct cgroup *bpf_cgroup_acquire(struct cgroup *cgrp) __weak __ksym; -extern struct cgroup *bpf_cgroup_ancestor(struct cgroup *cgrp, int level) __weak __ksym; extern struct cgroup *bpf_cgroup_from_id(u64 cgid) __weak __ksym; extern int bpf_cgroup_read_xattr(struct cgroup *cgroup, const char *name__str, struct bpf_dynptr *value_p) __weak __ksym; extern void bpf_cgroup_release(struct cgroup *cgrp) __weak __ksym; @@ -117260,7 +117348,6 @@ extern bool bpf_dynptr_is_rdonly(const struct bpf_dynptr *p) __weak __ksym; extern int bpf_dynptr_memset(struct bpf_dynptr *p, u32 offset, u32 size, u8 val) __weak __ksym; extern __u32 bpf_dynptr_size(const struct bpf_dynptr *p) __weak __ksym; -extern void *bpf_dynptr_slice(const struct bpf_dynptr *p, u32 offset, void *buffer__opt, u32 buffer__szk) __weak __ksym; extern void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr *p, u32 offset, void *buffer__opt, u32 buffer__szk) __weak __ksym; extern int bpf_fentry_test1(int a) __weak __ksym; extern int bpf_get_dentry_xattr(struct dentry *dentry, const char *name__str, struct bpf_dynptr *value_p) __weak __ksym; @@ -117287,7 +117374,6 @@ extern int bpf_iter_num_new(struct bpf_iter_num *it, int start, int end) __weak __ksym; extern int *bpf_iter_num_next(struct bpf_iter_num *it) __weak __ksym; extern void bpf_iter_scx_dsq_destroy(struct bpf_iter_scx_dsq *it) __weak __ksym; -extern int bpf_iter_scx_dsq_new(struct bpf_iter_scx_dsq *it, u64 dsq_id, u64 flags) __weak __ksym; extern struct task_struct *bpf_iter_scx_dsq_next(struct bpf_iter_scx_dsq *it) __weak __ksym; extern void bpf_iter_task_destroy(struct bpf_iter_task *it) __weak __ksym; extern int bpf_iter_task_new(struct bpf_iter_task *it, struct task_struct *task__nullable, unsigned int flags) __weak __ksym; @@ -117373,11 +117459,8 @@ extern int bpf_strspn(const char *s__ign, const char *accept__ign) __weak __ksym; extern int bpf_strstr(const char *s1__ign, const char *s2__ign) __weak __ksym; extern struct task_struct *bpf_task_acquire(struct task_struct *p) __weak __ksym; -extern struct task_struct *bpf_task_from_pid(s32 pid) __weak __ksym; -extern struct task_struct *bpf_task_from_vpid(s32 vpid) __weak __ksym; extern struct cgroup *bpf_task_get_cgroup1(struct task_struct *task, int hierarchy_id) __weak __ksym; extern void bpf_task_release(struct task_struct *p) __weak __ksym; -extern long int bpf_task_under_cgroup(struct task_struct *task, struct cgroup *ancestor) __weak __ksym; extern void bpf_throw(u64 cookie) __weak __ksym; extern int bpf_verify_pkcs7_signature(struct bpf_dynptr *data_p, struct bpf_dynptr *sig_p, struct bpf_key *trusted_keyring) __weak __ksym; extern int bpf_wq_init(struct bpf_wq *wq, void *p__map, unsigned int flags) __weak __ksym; @@ -117412,15 +117495,10 @@ extern u32 scx_bpf_cpuperf_cur(s32 cpu) __weak __ksym; extern void scx_bpf_cpuperf_set(s32 cpu, u32 perf) __weak __ksym; extern s32 scx_bpf_create_dsq(u64 dsq_id, s32 node) __weak __ksym; -extern void scx_bpf_destroy_dsq(u64 dsq_id) __weak __ksym; -extern void scx_bpf_dispatch(struct task_struct *p, u64 dsq_id, u64 slice, u64 enq_flags) __weak __ksym; extern void scx_bpf_dispatch_cancel(void) __weak __ksym; extern bool scx_bpf_dispatch_from_dsq(struct bpf_iter_scx_dsq *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak __ksym; -extern void scx_bpf_dispatch_from_dsq_set_slice(struct bpf_iter_scx_dsq *it__iter, u64 slice) __weak __ksym; -extern void scx_bpf_dispatch_from_dsq_set_vtime(struct bpf_iter_scx_dsq *it__iter, u64 vtime) __weak __ksym; extern u32 scx_bpf_dispatch_nr_slots(void) __weak __ksym; extern void scx_bpf_dispatch_vtime(struct task_struct *p, u64 dsq_id, u64 slice, u64 vtime, u64 enq_flags) __weak __ksym; -extern bool scx_bpf_dispatch_vtime_from_dsq(struct bpf_iter_scx_dsq *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak __ksym; extern void scx_bpf_dsq_insert(struct task_struct *p, u64 dsq_id, u64 slice, u64 enq_flags) __weak __ksym; extern void scx_bpf_dsq_insert_vtime(struct task_struct *p, u64 dsq_id, u64 slice, u64 vtime, u64 enq_flags) __weak __ksym; extern bool scx_bpf_dsq_move(struct bpf_iter_scx_dsq *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak __ksym; @@ -117428,10 +117506,8 @@ extern void scx_bpf_dsq_move_set_vtime(struct bpf_iter_scx_dsq *it__iter, u64 vtime) __weak __ksym; extern bool scx_bpf_dsq_move_to_local(u64 dsq_id) __weak __ksym; extern bool scx_bpf_dsq_move_vtime(struct bpf_iter_scx_dsq *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak __ksym; -extern s32 scx_bpf_dsq_nr_queued(u64 dsq_id) __weak __ksym; extern void scx_bpf_dump_bstr(char *fmt, long long unsigned int *data, u32 data__sz) __weak __ksym; extern void scx_bpf_error_bstr(char *fmt, long long unsigned int *data, u32 data__sz) __weak __ksym; -extern void scx_bpf_events(struct scx_event_stats *events, size_t events__sz) __weak __ksym; extern void scx_bpf_exit_bstr(s64 exit_code, char *fmt, long long unsigned int *data, u32 data__sz) __weak __ksym; extern const struct cpumask *scx_bpf_get_idle_cpumask(void) __weak __ksym; extern const struct cpumask *scx_bpf_get_idle_cpumask_node(int node) __weak __ksym; Then I checked the difference between BTFs, and found that there is no DECL_TAG 'bpf_kfunc' produced for the affected functions: $ diff -u vmlinux.btf.out vmlinux_ubsan.btf.out | grep -C5 cgroup_ancestor +[52749] FUNC 'bpf_cgroup_acquire' type_id=52748 linkage=static +[52750] DECL_TAG 'bpf_kfunc' type_id=52749 component_idx=-1 +[52751] FUNC_PROTO '(anon)' ret_type_id=426 vlen=2 'cgrp' type_id=426 'level' type_id=21 -[52681] FUNC 'bpf_cgroup_ancestor' type_id=52680 linkage=static -[52682] DECL_TAG 'bpf_kfunc' type_id=52681 component_idx=-1 -[52683] FUNC_PROTO '(anon)' ret_type_id=3987 vlen=2 +[52752] FUNC 'bpf_cgroup_ancestor' type_id=52751 linkage=static +[52753] FUNC_PROTO '(anon)' ret_type_id=3987 vlen=2 'attach_type' type_id=1767 'attach_btf_id' type_id=34 -[52684] FUNC 'bpf_cgroup_atype_find' type_id=52683 linkage=static -[52685] FUNC_PROTO '(anon)' ret_type_id=0 vlen=2 Which is clearly wrong and suggests a bug. After some debugging, I found that the problem is in btf_encoder__find_function(), and more specifically in the comparator used for bsearch and qsort. static int functions_cmp(const void *_a, const void *_b) { const struct elf_function *a = _a; const struct elf_function *b = _b; /* if search key allows prefix match, verify target has matching * prefix len and prefix matches. */ if (a->prefixlen && a->prefixlen == b->prefixlen) return strncmp(a->name, b->name, b->prefixlen); return strcmp(a->name, b->name); } For this particular vmlinux that I compiled, btf_encoder__find_function("bpf_cgroup_ancestor", prefixlen=0) returns NULL, even though there is an elf_function struct for bpf_cgroup_ancestor in the table. The reason for it is that bsearch() happens to hit "bpf_cgroup_ancestor.cold" in the table first. strcmp("bpf_cgroup_ancestor", "bpf_cgroup_ancestor.cold)") gives a negative value, but bpf_cgroup_ancestor entry is stored in the other direction in the table. This is surprising at the first glance, because we use the same functions_cmp both for search and for qsort. But as it turns we are actually using two different comparators within functions_cmp(): we set key.prefixlen=0 for exact match and when it's non-zero we search for prefix match. When sorting the table, there are no entries with prefixlen=0, so the order of elements is not exactly right for the bsearch(). That's a nasty bug, but as far as I understand, all this complexity is unnecessary in case of '.cold' suffix, because they simply are not supposed to be in the elf_functions table: it's usually just a piece of a target function. There are a couple of ways this particular bug could be fixed (filtering out .cold symbols, for example). But I think this bug and the problem Jiri is trying to solve stems from the fact that one function name, which is an identifier the users care about the most, may be associated with many ELF symbols and/or addresses. What is clear to me in the context of pahole's BTF encoding is that we want elf_functions table to only have a single entry per name (where name is an actual name that might be referred to by users, not an ELF sym name), and have a deterministic mechanism for selecting one (or none) func from many at the time of processing ELF data. The current implementation is just buggy in this regard. I am not aware of long term plans for addressing this, though it looks like this was discussed before. I'd appreciate if you share any relevant threads. Thanks. > > I think we discussed this issue some time ago, but I'm not sure what > the proposal was at the end (function address stored in BTF?) > > this change is to make sure there are no duplicate functions in BTF > that could cause this confusion.. rather than bad result, let's deny > the attach for such function > > jirka > > >> >> ? >> >>> [1] https://lore.kernel.org/bpf/20250710070835.260831-1-dongml2@chinatelecom.cn/ >>> Reported-by: Menglong Dong <menglong8.dong@gmail.com> >>> Signed-off-by: Jiri Olsa <jolsa@kernel.org> >>> --- >>> >>> Alan, >>> I'd like to test this in the pahole CI, is there a way to manualy trigger it? >>> >> >> Easiest way is to base from pahole's next branch and push to a github >> repo; the tests will run as actions there. I've just merged the function >> comparison work so that will be available if you base/sync a branch on >> next from git.kernel.org/pub/scm/devel/pahole/pahole.git/ . Thanks! >> >> Alan >> >> >>> thanks, >>> jirka >>> >>> >>> --- >>> btf_encoder.c | 37 +++++++++++++++++++++++++++++++++++++ >>> 1 file changed, 37 insertions(+) >>> >>> diff --git a/btf_encoder.c b/btf_encoder.c >>> index 16739066caae..a25fe2f8bfb1 100644 >>> --- a/btf_encoder.c >>> +++ b/btf_encoder.c >>> @@ -99,6 +99,7 @@ struct elf_function { >>> size_t prefixlen; >>> bool kfunc; >>> uint32_t kfunc_flags; >>> + unsigned long addr; >>> }; >>> >>> struct elf_secinfo { >>> @@ -1469,6 +1470,7 @@ static void elf_functions__collect_function(struct elf_functions *functions, GEl >>> >>> func = &functions->entries[functions->cnt]; >>> func->name = name; >>> + func->addr = sym->st_value; >>> if (strchr(name, '.')) { >>> const char *suffix = strchr(name, '.'); >>> >>> @@ -2143,6 +2145,40 @@ int btf_encoder__encode(struct btf_encoder *encoder, struct conf_load *conf) >>> return err; >>> } >>> >>> +/* >>> + * Remove name duplicates from functions->entries that have >>> + * at least 2 different addresses. >>> + */ >>> +static void functions_remove_dups(struct elf_functions *functions) >>> +{ >>> + struct elf_function *n = &functions->entries[0]; >>> + bool matched = false, diff = false; >>> + int i, j; >>> + >>> + for (i = 0, j = 1; i < functions->cnt && j < functions->cnt; i++, j++) { >>> + struct elf_function *a = &functions->entries[i]; >>> + struct elf_function *b = &functions->entries[j]; >>> + >>> + if (!strcmp(a->name, b->name)) { >>> + matched = true; >>> + diff |= a->addr != b->addr; >>> + continue; >>> + } >>> + >>> + /* >>> + * Keep only not-matched entries and last one of the matched/duplicates >>> + * ones if all of the matched entries had the same address. >>> + **/ >>> + if (!matched || !diff) >>> + *n++ = *a; >>> + matched = diff = false; >>> + } >>> + >>> + if (!matched || !diff) >>> + *n++ = functions->entries[functions->cnt - 1]; >>> + functions->cnt = n - &functions->entries[0]; >>> +} >>> + >>> static int elf_functions__collect(struct elf_functions *functions) >>> { >>> uint32_t nr_symbols = elf_symtab__nr_symbols(functions->symtab); >>> @@ -2168,6 +2204,7 @@ static int elf_functions__collect(struct elf_functions *functions) >>> >>> if (functions->cnt) { >>> qsort(functions->entries, functions->cnt, sizeof(*functions->entries), functions_cmp); >>> + functions_remove_dups(functions); >>> } else { >>> err = 0; >>> goto out_free; >> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC dwarves] btf_encoder: Remove duplicates from functions entries 2025-07-21 23:27 ` Ihor Solodrai @ 2025-07-22 10:45 ` Alan Maguire 2025-07-22 22:58 ` Ihor Solodrai 2025-07-22 10:54 ` Jiri Olsa 1 sibling, 1 reply; 12+ messages in thread From: Alan Maguire @ 2025-07-22 10:45 UTC (permalink / raw) To: Ihor Solodrai, Jiri Olsa Cc: Arnaldo Carvalho de Melo, Menglong Dong, dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Song Liu, Eduard Zingerman On 22/07/2025 00:27, Ihor Solodrai wrote: > On 7/21/25 7:27 AM, Jiri Olsa wrote: >> On Mon, Jul 21, 2025 at 12:41:00PM +0100, Alan Maguire wrote: >>> On 17/07/2025 16:25, Jiri Olsa wrote: >>>> Menglong reported issue where we can have function in BTF which has >>>> multiple addresses in kallsysm [1]. >>>> >>>> Rather than filtering this in runtime, let's teach pahole to remove >>>> such functions. >>>> >>>> Removing duplicate records from functions entries that have more >>>> at least one different address. This way btf_encoder__find_function >>>> won't find such functions and they won't be added in BTF. >>>> >>>> In my setup it removed 428 functions out of 77141. >>>> >>> >>> Is such removal necessary? If the presence of an mcount annotation is >>> the requirement, couldn't we just utilize >>> >>> /sys/kernel/tracing/available_filter_functions_addrs >>> >>> to map name to address safely? It identifies mcount-containing functions >>> and some of these appear to be duplicates, for example there I see >>> >>> ffffffff8376e8b4 acpi_attr_is_visible >>> ffffffff8379b7d4 acpi_attr_is_visible >> >> for that we'd need new interface for loading fentry/fexit.. programs, right? >> >> the current interface to get fentry/fexit.. attach address is: >> - user specifies function name, that translates to btf_id >> - in kernel that btf id translates back to function name >> - kernel uses kallsyms_lookup_name or find_kallsyms_symbol_value >> to get the address >> >> so we don't really know which address user wanted in the first place > > Hi Jiri, Alan. > > I stumbled on a bug today which seems to be relevant to this > discussion. > > I tried building the kernel with KASAN and UBSAN, and that resulted in > some kfuncs disappearing from vmlinux.h, triggering selftests/bpf > compilation errors, for example: > > CLNG-BPF [test_progs-no_alu32] cgroup_read_xattr.bpf.o > progs/cgroup_read_xattr.c:127:13: error: call to undeclared function 'bpf_cgroup_ancestor'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] > 127 | ancestor = bpf_cgroup_ancestor(cgrp, 1); > | ^ > > Here is a piece of vmlinux.h diff between CONFIG_UBSAN=y/n: > > --- ./tools/testing/selftests/bpf/tools/include/vmlinux.h 2025-07-21 17:35:14.415733105 +0000 > +++ ubsan_vmlinux.h 2025-07-21 17:33:10.455312623 +0000 > @@ -117203,7 +117292,6 @@ > extern int bpf_arena_reserve_pages(void *p__map, void __attribute__((address_space(1))) *ptr__ign, u32 page_cnt) __weak __ksym; > extern __bpf_fastcall void *bpf_cast_to_kern_ctx(void *obj) __weak __ksym; > extern struct cgroup *bpf_cgroup_acquire(struct cgroup *cgrp) __weak __ksym; > -extern struct cgroup *bpf_cgroup_ancestor(struct cgroup *cgrp, int level) __weak __ksym; > extern struct cgroup *bpf_cgroup_from_id(u64 cgid) __weak __ksym; > extern int bpf_cgroup_read_xattr(struct cgroup *cgroup, const char *name__str, struct bpf_dynptr *value_p) __weak __ksym; > extern void bpf_cgroup_release(struct cgroup *cgrp) __weak __ksym; > @@ -117260,7 +117348,6 @@ > extern bool bpf_dynptr_is_rdonly(const struct bpf_dynptr *p) __weak __ksym; > extern int bpf_dynptr_memset(struct bpf_dynptr *p, u32 offset, u32 size, u8 val) __weak __ksym; > extern __u32 bpf_dynptr_size(const struct bpf_dynptr *p) __weak __ksym; > -extern void *bpf_dynptr_slice(const struct bpf_dynptr *p, u32 offset, void *buffer__opt, u32 buffer__szk) __weak __ksym; > extern void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr *p, u32 offset, void *buffer__opt, u32 buffer__szk) __weak __ksym; > extern int bpf_fentry_test1(int a) __weak __ksym; > extern int bpf_get_dentry_xattr(struct dentry *dentry, const char *name__str, struct bpf_dynptr *value_p) __weak __ksym; > @@ -117287,7 +117374,6 @@ > extern int bpf_iter_num_new(struct bpf_iter_num *it, int start, int end) __weak __ksym; > extern int *bpf_iter_num_next(struct bpf_iter_num *it) __weak __ksym; > extern void bpf_iter_scx_dsq_destroy(struct bpf_iter_scx_dsq *it) __weak __ksym; > -extern int bpf_iter_scx_dsq_new(struct bpf_iter_scx_dsq *it, u64 dsq_id, u64 flags) __weak __ksym; > extern struct task_struct *bpf_iter_scx_dsq_next(struct bpf_iter_scx_dsq *it) __weak __ksym; > extern void bpf_iter_task_destroy(struct bpf_iter_task *it) __weak __ksym; > extern int bpf_iter_task_new(struct bpf_iter_task *it, struct task_struct *task__nullable, unsigned int flags) __weak __ksym; > @@ -117373,11 +117459,8 @@ > extern int bpf_strspn(const char *s__ign, const char *accept__ign) __weak __ksym; > extern int bpf_strstr(const char *s1__ign, const char *s2__ign) __weak __ksym; > extern struct task_struct *bpf_task_acquire(struct task_struct *p) __weak __ksym; > -extern struct task_struct *bpf_task_from_pid(s32 pid) __weak __ksym; > -extern struct task_struct *bpf_task_from_vpid(s32 vpid) __weak __ksym; > extern struct cgroup *bpf_task_get_cgroup1(struct task_struct *task, int hierarchy_id) __weak __ksym; > extern void bpf_task_release(struct task_struct *p) __weak __ksym; > -extern long int bpf_task_under_cgroup(struct task_struct *task, struct cgroup *ancestor) __weak __ksym; > extern void bpf_throw(u64 cookie) __weak __ksym; > extern int bpf_verify_pkcs7_signature(struct bpf_dynptr *data_p, struct bpf_dynptr *sig_p, struct bpf_key *trusted_keyring) __weak __ksym; > extern int bpf_wq_init(struct bpf_wq *wq, void *p__map, unsigned int flags) __weak __ksym; > @@ -117412,15 +117495,10 @@ > extern u32 scx_bpf_cpuperf_cur(s32 cpu) __weak __ksym; > extern void scx_bpf_cpuperf_set(s32 cpu, u32 perf) __weak __ksym; > extern s32 scx_bpf_create_dsq(u64 dsq_id, s32 node) __weak __ksym; > -extern void scx_bpf_destroy_dsq(u64 dsq_id) __weak __ksym; > -extern void scx_bpf_dispatch(struct task_struct *p, u64 dsq_id, u64 slice, u64 enq_flags) __weak __ksym; > extern void scx_bpf_dispatch_cancel(void) __weak __ksym; > extern bool scx_bpf_dispatch_from_dsq(struct bpf_iter_scx_dsq *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak __ksym; > -extern void scx_bpf_dispatch_from_dsq_set_slice(struct bpf_iter_scx_dsq *it__iter, u64 slice) __weak __ksym; > -extern void scx_bpf_dispatch_from_dsq_set_vtime(struct bpf_iter_scx_dsq *it__iter, u64 vtime) __weak __ksym; > extern u32 scx_bpf_dispatch_nr_slots(void) __weak __ksym; > extern void scx_bpf_dispatch_vtime(struct task_struct *p, u64 dsq_id, u64 slice, u64 vtime, u64 enq_flags) __weak __ksym; > -extern bool scx_bpf_dispatch_vtime_from_dsq(struct bpf_iter_scx_dsq *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak __ksym; > extern void scx_bpf_dsq_insert(struct task_struct *p, u64 dsq_id, u64 slice, u64 enq_flags) __weak __ksym; > extern void scx_bpf_dsq_insert_vtime(struct task_struct *p, u64 dsq_id, u64 slice, u64 vtime, u64 enq_flags) __weak __ksym; > extern bool scx_bpf_dsq_move(struct bpf_iter_scx_dsq *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak __ksym; > @@ -117428,10 +117506,8 @@ > extern void scx_bpf_dsq_move_set_vtime(struct bpf_iter_scx_dsq *it__iter, u64 vtime) __weak __ksym; > extern bool scx_bpf_dsq_move_to_local(u64 dsq_id) __weak __ksym; > extern bool scx_bpf_dsq_move_vtime(struct bpf_iter_scx_dsq *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak __ksym; > -extern s32 scx_bpf_dsq_nr_queued(u64 dsq_id) __weak __ksym; > extern void scx_bpf_dump_bstr(char *fmt, long long unsigned int *data, u32 data__sz) __weak __ksym; > extern void scx_bpf_error_bstr(char *fmt, long long unsigned int *data, u32 data__sz) __weak __ksym; > -extern void scx_bpf_events(struct scx_event_stats *events, size_t events__sz) __weak __ksym; > extern void scx_bpf_exit_bstr(s64 exit_code, char *fmt, long long unsigned int *data, u32 data__sz) __weak __ksym; > extern const struct cpumask *scx_bpf_get_idle_cpumask(void) __weak __ksym; > extern const struct cpumask *scx_bpf_get_idle_cpumask_node(int node) __weak __ksym; > > Then I checked the difference between BTFs, and found that there is no > DECL_TAG 'bpf_kfunc' produced for the affected functions: > > $ diff -u vmlinux.btf.out vmlinux_ubsan.btf.out | grep -C5 cgroup_ancestor > +[52749] FUNC 'bpf_cgroup_acquire' type_id=52748 linkage=static > +[52750] DECL_TAG 'bpf_kfunc' type_id=52749 component_idx=-1 > +[52751] FUNC_PROTO '(anon)' ret_type_id=426 vlen=2 > 'cgrp' type_id=426 > 'level' type_id=21 > -[52681] FUNC 'bpf_cgroup_ancestor' type_id=52680 linkage=static > -[52682] DECL_TAG 'bpf_kfunc' type_id=52681 component_idx=-1 > -[52683] FUNC_PROTO '(anon)' ret_type_id=3987 vlen=2 > +[52752] FUNC 'bpf_cgroup_ancestor' type_id=52751 linkage=static > +[52753] FUNC_PROTO '(anon)' ret_type_id=3987 vlen=2 > 'attach_type' type_id=1767 > 'attach_btf_id' type_id=34 > -[52684] FUNC 'bpf_cgroup_atype_find' type_id=52683 linkage=static > -[52685] FUNC_PROTO '(anon)' ret_type_id=0 vlen=2 > > Which is clearly wrong and suggests a bug. > > After some debugging, I found that the problem is in > btf_encoder__find_function(), and more specifically in > the comparator used for bsearch and qsort. > > static int functions_cmp(const void *_a, const void *_b) > { > const struct elf_function *a = _a; > const struct elf_function *b = _b; > > /* if search key allows prefix match, verify target has matching > * prefix len and prefix matches. > */ > if (a->prefixlen && a->prefixlen == b->prefixlen) > return strncmp(a->name, b->name, b->prefixlen); > return strcmp(a->name, b->name); > } > > For this particular vmlinux that I compiled, > btf_encoder__find_function("bpf_cgroup_ancestor", prefixlen=0) returns > NULL, even though there is an elf_function struct for > bpf_cgroup_ancestor in the table. > > The reason for it is that bsearch() happens to hit > "bpf_cgroup_ancestor.cold" in the table first. > strcmp("bpf_cgroup_ancestor", "bpf_cgroup_ancestor.cold)") gives a > negative value, but bpf_cgroup_ancestor entry is stored in the other > direction in the table. > > This is surprising at the first glance, because we use the same > functions_cmp both for search and for qsort. > > But as it turns we are actually using two different comparators within > functions_cmp(): we set key.prefixlen=0 for exact match and when it's > non-zero we search for prefix match. When sorting the table, there are > no entries with prefixlen=0, so the order of elements is not exactly > right for the bsearch(). > > That's a nasty bug, but as far as I understand, all this complexity is > unnecessary in case of '.cold' suffix, because they simply are not > supposed to be in the elf_functions table: it's usually just a piece > of a target function. > > There are a couple of ways this particular bug could be fixed > (filtering out .cold symbols, for example). But I think this bug and > the problem Jiri is trying to solve stems from the fact that one > function name, which is an identifier the users care about the most, > may be associated with many ELF symbols and/or addresses. > > What is clear to me in the context of pahole's BTF encoding is that we > want elf_functions table to only have a single entry per name (where > name is an actual name that might be referred to by users, not an ELF > sym name), and have a deterministic mechanism for selecting one (or > none) func from many at the time of processing ELF data. > > The current implementation is just buggy in this regard. > There are a few separable issues here I think. First as you say, certain suffixes should not be eligible as matches at all - .cold is one, and .part is another (partial inlining). As such they should be filtered and removed as potential matches. Second we need to fix the function sort/search logic. Third we need to decide how to deal with cases where the function does not correspond to an mcount boundary. It'd be interesting to see if the filtering helps here, but part of the problem is also that we don't currently have a mechanism to help guide the match between function name and function site that is done when the fentry attach is carried out. Yonghong and I talked about it in [1]. Addresses seem like the natural means to help guide that, so a DATASEC-like set of addresses would help this matching. I had a WIP version of this but it wasn't working fully yet. I'll revive it and see if I can get it out as an RFC. Needs to take into account the work being done on inlines too [1]. In terms of the tracer's actual intent, multi-site functions are often "static inline" functions in a .h file that don't actually get inlined; the user intent would be often to trace all instances, but it seems to me we need to provide a means to support both this or to trace a specific instance. How the latter is best represented from the tracer side I'm not sure; raw addresses would be one way I suppose. Absent an explicit request from the tracer I'm not sure what heuristics make most sense; currently we just pick the first instance I suspect, and might need to continue to do so for backwards compatibility. > I am not aware of long term plans for addressing this, though it looks > like this was discussed before. I'd appreciate if you share any > relevant threads. > Yonghong and I discussed this a bit in [1], and the inline thread in [2] has some more details. [1] https://lpc.events/event/18/contributions/1945/attachments/1508/3179/Kernel%20func%20tracing%20in%20the%20face%20of%20compiler%20optimization.pdf [2] https://lore.kernel.org/bpf/20250416-btf_inline-v1-0-e4bd2f8adae5@meta.com/ > Thanks. > > >> >> I think we discussed this issue some time ago, but I'm not sure what >> the proposal was at the end (function address stored in BTF?) >> >> this change is to make sure there are no duplicate functions in BTF >> that could cause this confusion.. rather than bad result, let's deny >> the attach for such function >> >> jirka >> >> >>> >>> ? >>> >>>> [1] https://lore.kernel.org/bpf/20250710070835.260831-1-dongml2@chinatelecom.cn/ >>>> Reported-by: Menglong Dong <menglong8.dong@gmail.com> >>>> Signed-off-by: Jiri Olsa <jolsa@kernel.org> >>>> --- >>>> >>>> Alan, >>>> I'd like to test this in the pahole CI, is there a way to manualy trigger it? >>>> >>> >>> Easiest way is to base from pahole's next branch and push to a github >>> repo; the tests will run as actions there. I've just merged the function >>> comparison work so that will be available if you base/sync a branch on >>> next from git.kernel.org/pub/scm/devel/pahole/pahole.git/ . Thanks! >>> >>> Alan >>> >>> >>>> thanks, >>>> jirka >>>> >>>> >>>> --- >>>> btf_encoder.c | 37 +++++++++++++++++++++++++++++++++++++ >>>> 1 file changed, 37 insertions(+) >>>> >>>> diff --git a/btf_encoder.c b/btf_encoder.c >>>> index 16739066caae..a25fe2f8bfb1 100644 >>>> --- a/btf_encoder.c >>>> +++ b/btf_encoder.c >>>> @@ -99,6 +99,7 @@ struct elf_function { >>>> size_t prefixlen; >>>> bool kfunc; >>>> uint32_t kfunc_flags; >>>> + unsigned long addr; >>>> }; >>>> >>>> struct elf_secinfo { >>>> @@ -1469,6 +1470,7 @@ static void elf_functions__collect_function(struct elf_functions *functions, GEl >>>> >>>> func = &functions->entries[functions->cnt]; >>>> func->name = name; >>>> + func->addr = sym->st_value; >>>> if (strchr(name, '.')) { >>>> const char *suffix = strchr(name, '.'); >>>> >>>> @@ -2143,6 +2145,40 @@ int btf_encoder__encode(struct btf_encoder *encoder, struct conf_load *conf) >>>> return err; >>>> } >>>> >>>> +/* >>>> + * Remove name duplicates from functions->entries that have >>>> + * at least 2 different addresses. >>>> + */ >>>> +static void functions_remove_dups(struct elf_functions *functions) >>>> +{ >>>> + struct elf_function *n = &functions->entries[0]; >>>> + bool matched = false, diff = false; >>>> + int i, j; >>>> + >>>> + for (i = 0, j = 1; i < functions->cnt && j < functions->cnt; i++, j++) { >>>> + struct elf_function *a = &functions->entries[i]; >>>> + struct elf_function *b = &functions->entries[j]; >>>> + >>>> + if (!strcmp(a->name, b->name)) { >>>> + matched = true; >>>> + diff |= a->addr != b->addr; >>>> + continue; >>>> + } >>>> + >>>> + /* >>>> + * Keep only not-matched entries and last one of the matched/duplicates >>>> + * ones if all of the matched entries had the same address. >>>> + **/ >>>> + if (!matched || !diff) >>>> + *n++ = *a; >>>> + matched = diff = false; >>>> + } >>>> + >>>> + if (!matched || !diff) >>>> + *n++ = functions->entries[functions->cnt - 1]; >>>> + functions->cnt = n - &functions->entries[0]; >>>> +} >>>> + >>>> static int elf_functions__collect(struct elf_functions *functions) >>>> { >>>> uint32_t nr_symbols = elf_symtab__nr_symbols(functions->symtab); >>>> @@ -2168,6 +2204,7 @@ static int elf_functions__collect(struct elf_functions *functions) >>>> >>>> if (functions->cnt) { >>>> qsort(functions->entries, functions->cnt, sizeof(*functions->entries), functions_cmp); >>>> + functions_remove_dups(functions); >>>> } else { >>>> err = 0; >>>> goto out_free; >>> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC dwarves] btf_encoder: Remove duplicates from functions entries 2025-07-22 10:45 ` Alan Maguire @ 2025-07-22 22:58 ` Ihor Solodrai 2025-07-23 11:22 ` Jiri Olsa 0 siblings, 1 reply; 12+ messages in thread From: Ihor Solodrai @ 2025-07-22 22:58 UTC (permalink / raw) To: Alan Maguire, Jiri Olsa Cc: Arnaldo Carvalho de Melo, Menglong Dong, dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Song Liu, Eduard Zingerman On 7/22/25 3:45 AM, Alan Maguire wrote: > On 22/07/2025 00:27, Ihor Solodrai wrote: >> On 7/21/25 7:27 AM, Jiri Olsa wrote: >>> On Mon, Jul 21, 2025 at 12:41:00PM +0100, Alan Maguire wrote: >>>> On 17/07/2025 16:25, Jiri Olsa wrote: >> >> [...] >> >> The current implementation is just buggy in this regard. >> > > There are a few separable issues here I think. You're right. I think in my mind all the issues we are discussing here boil down to a single fundamental problem: one (function) name maps to many things. But then there are a lot of details about what those things are, how to find them, represent them etc. > > First as you say, certain suffixes should not be eligible as matches at > all - .cold is one, and .part is another (partial inlining). As such > they should be filtered and removed as potential matches. > > Second we need to fix the function sort/search logic. Ok, here is my stab at it. See a patch draft below. Using ELF symbol name as the key in elf_functions table is bug-prone, given that we lookup a name loaded from DWARF (without suffixes). So instead of having a table of ELF sym names directly and attempting to search there, I tried a table of unsuffixed names (assuming that any prefix before first '.' is a proper name, which is an assumption already present in pahole). To not lose the information from the ELF symbols, we simply collect it at the time that table is built, basically constructing a mapping: function name -> [array of elf sym info] Then we can inspect this array when a decision about inclusion into BTF has to made. In the patch I check for .cold and multiple addresses. There is little difference in resulting BTF, but the bug I reported gets fixed with the change, as well as ambigous address problem. Please let me know if the approach makes sense. From 1088367c1facb8b2e6700df17aa5b6e306578334 Mon Sep 17 00:00:00 2001 From: Ihor Solodrai <isolodrai@meta.com> Date: Tue, 22 Jul 2025 15:16:36 -0700 Subject: [PATCH] btf_encoder: group all function ELF syms by function name btf_encoder collects function ELF symbols into a table, which is later used for processing DWARF data and determining whether a function can be added to BTF. So far, the ELF symbol name was used as a key for search in this table, and a search by prefix match was attempted in cases when ELF symbol name has a compiler-generated suffix. This implementation has bugs, causing some functions to be inappropriately excluded from (or included into) BTF. Rework the implementation of the ELF functions table: use a proper name of a function (without any suffix) as a key, and collect each relevant ELF symbol information for later examination. This way we can guarantee that btf_encoder__find_function() always returns correct elf_function object (or NULL). Collect an array of symbol name + address pairs from GElf_Sym for each elf_function. Then, in saved_functions_combine() use this information when deciding whether a function should be included in BTF. In particular, exclude functions with ambiguous address, taking into account that .cold symbols can be ignored. Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> --- btf_encoder.c | 248 +++++++++++++++++++++++++++++++++----------------- 1 file changed, 162 insertions(+), 86 deletions(-) diff --git a/btf_encoder.c b/btf_encoder.c index 0bc2334..fcc30aa 100644 --- a/btf_encoder.c +++ b/btf_encoder.c @@ -87,16 +87,22 @@ struct btf_encoder_func_state { uint8_t optimized_parms:1; uint8_t unexpected_reg:1; uint8_t inconsistent_proto:1; + uint8_t ambiguous_addr:1; int ret_type_id; struct btf_encoder_func_parm *parms; struct btf_encoder_func_annot *annots; }; +struct elf_function_sym { + const char *name; + uint64_t addr; +}; + struct elf_function { - const char *name; - char *alias; - size_t prefixlen; - bool kfunc; + char *name; + struct elf_function_sym *syms; + uint8_t sym_cnt; + uint8_t kfunc:1; uint32_t kfunc_flags; }; @@ -161,10 +167,18 @@ struct btf_kfunc_set_range { uint64_t end; }; +static inline void elf_function__free_content(struct elf_function *func) { + free(func->name); + if (func->sym_cnt) + free(func->syms); + memset(func, 0, sizeof(*func)); +} + static inline void elf_functions__delete(struct elf_functions *funcs) { - for (int i = 0; i < funcs->cnt; i++) - free(funcs->entries[i].alias); + for (int i = 0; i < funcs->cnt; i++) { + elf_function__free_content(&funcs->entries[i]); + } free(funcs->entries); elf_symtab__delete(funcs->symtab); list_del(&funcs->node); @@ -981,8 +995,7 @@ static void btf_encoder__log_func_skip(struct btf_encoder *encoder, struct elf_f if (!encoder->verbose) return; - printf("%s (%s): skipping BTF encoding of function due to ", - func->alias ?: func->name, func->name); + printf("%s: skipping BTF encoding of function due to ", func->name); va_start(ap, fmt); vprintf(fmt, ap); va_end(ap); @@ -1176,6 +1189,33 @@ static struct btf_encoder_func_state *btf_encoder__alloc_func_state(struct btf_e return state; } +static inline bool str_has_suffix(const char *str, const char *suffix) { + int prefixlen = strlen(str) - strlen(suffix); + + if (prefixlen < 0) + return false; + + return !strcmp(str + prefixlen, suffix); +} + +static bool elf_function__has_ambiguous_address(struct elf_function *func) { + if (func->sym_cnt <= 1) + return false; + + uint64_t addr = 0; + for (int i = 0; i < func->sym_cnt; i++) { + struct elf_function_sym *sym = &func->syms[i]; + if (!str_has_suffix(sym->name, ".cold")) { + if (addr && addr != sym->addr) + return true; + else + addr = sym->addr; + } + } + + return false; +} + static int32_t btf_encoder__save_func(struct btf_encoder *encoder, struct function *fn, struct elf_function *func) { struct btf_encoder_func_state *state = btf_encoder__alloc_func_state(encoder); @@ -1191,6 +1231,9 @@ static int32_t btf_encoder__save_func(struct btf_encoder *encoder, struct functi state->encoder = encoder; state->elf = func; + + state->ambiguous_addr = elf_function__has_ambiguous_address(func); + state->nr_parms = ftype->nr_parms + (ftype->unspec_parms ? 1 : 0); state->ret_type_id = ftype->tag.type == 0 ? 0 : encoder->type_id_off + ftype->tag.type; if (state->nr_parms > 0) { @@ -1294,7 +1337,7 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder, int err; btf_fnproto_id = btf_encoder__add_func_proto(encoder, NULL, state); - name = func->alias ?: func->name; + name = func->name; if (btf_fnproto_id >= 0) btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false); @@ -1338,48 +1381,39 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder, return 0; } -static int functions_cmp(const void *_a, const void *_b) +static int elf_function__name_cmp(const void *_a, const void *_b) { const struct elf_function *a = _a; const struct elf_function *b = _b; - /* if search key allows prefix match, verify target has matching - * prefix len and prefix matches. - */ - if (a->prefixlen && a->prefixlen == b->prefixlen) - return strncmp(a->name, b->name, b->prefixlen); return strcmp(a->name, b->name); } -#ifndef max -#define max(x, y) ((x) < (y) ? (y) : (x)) -#endif - static int saved_functions_cmp(const void *_a, const void *_b) { const struct btf_encoder_func_state *a = _a; const struct btf_encoder_func_state *b = _b; - return functions_cmp(a->elf, b->elf); + return elf_function__name_cmp(a->elf, b->elf); } static int saved_functions_combine(struct btf_encoder_func_state *a, struct btf_encoder_func_state *b) { - uint8_t optimized, unexpected, inconsistent; - int ret; + uint8_t optimized, unexpected, inconsistent, ambiguous_addr; + + if (a->elf != b->elf) + return 1; - ret = strncmp(a->elf->name, b->elf->name, - max(a->elf->prefixlen, b->elf->prefixlen)); - if (ret != 0) - return ret; optimized = a->optimized_parms | b->optimized_parms; unexpected = a->unexpected_reg | b->unexpected_reg; inconsistent = a->inconsistent_proto | b->inconsistent_proto; - if (!unexpected && !inconsistent && !funcs__match(a, b)) + ambiguous_addr = a->ambiguous_addr | b->ambiguous_addr; + if (!unexpected && !inconsistent && !ambiguous_addr && !funcs__match(a, b)) inconsistent = 1; a->optimized_parms = b->optimized_parms = optimized; a->unexpected_reg = b->unexpected_reg = unexpected; a->inconsistent_proto = b->inconsistent_proto = inconsistent; + a->ambiguous_addr = b->ambiguous_addr = ambiguous_addr; return 0; } @@ -1447,32 +1481,6 @@ out: return err; } -static void elf_functions__collect_function(struct elf_functions *functions, GElf_Sym *sym) -{ - struct elf_function *func; - const char *name; - - if (elf_sym__type(sym) != STT_FUNC) - return; - - name = elf_sym__name(sym, functions->symtab); - if (!name) - return; - - func = &functions->entries[functions->cnt]; - func->name = name; - if (strchr(name, '.')) { - const char *suffix = strchr(name, '.'); - - functions->suffix_cnt++; - func->prefixlen = suffix - name; - } else { - func->prefixlen = strlen(name); - } - - functions->cnt++; -} - static struct elf_functions *btf_encoder__elf_functions(struct btf_encoder *encoder) { struct elf_functions *funcs = NULL; @@ -1490,13 +1498,12 @@ static struct elf_functions *btf_encoder__elf_functions(struct btf_encoder *enco return funcs; } -static struct elf_function *btf_encoder__find_function(const struct btf_encoder *encoder, - const char *name, size_t prefixlen) +static struct elf_function *btf_encoder__find_function(const struct btf_encoder *encoder, const char *name) { struct elf_functions *funcs = elf_functions__find(encoder->cu->elf, &encoder->elf_functions_list); - struct elf_function key = { .name = name, .prefixlen = prefixlen }; + struct elf_function key = { .name = (char*)name }; - return bsearch(&key, funcs->entries, funcs->cnt, sizeof(key), functions_cmp); + return bsearch(&key, funcs->entries, funcs->cnt, sizeof(key), elf_function__name_cmp); } static bool btf_name_char_ok(char c, bool first) @@ -2060,7 +2067,7 @@ static int btf_encoder__collect_kfuncs(struct btf_encoder *encoder) continue; } - elf_fn = btf_encoder__find_function(encoder, func, 0); + elf_fn = btf_encoder__find_function(encoder, func); if (elf_fn) { elf_fn->kfunc = true; elf_fn->kfunc_flags = pair->flags; @@ -2135,14 +2142,45 @@ int btf_encoder__encode(struct btf_encoder *encoder, struct conf_load *conf) return err; } +static inline int elf_function__push_sym(struct elf_function *func, struct elf_function_sym *sym) { + struct elf_function_sym *tmp; + + if (func->sym_cnt) + tmp = realloc(func->syms, (func->sym_cnt + 1) * sizeof(func->syms[0])); + else + tmp = calloc(sizeof(func->syms[0]), 1); + + if (!tmp) + return -ENOMEM; + + func->syms = tmp; + func->syms[func->sym_cnt] = *sym; + func->sym_cnt++; + + return 0; +} + +static void print_func_table(struct elf_functions *functions) { + for (int i = 0; i < functions->cnt; i++) { + struct elf_function *func = &functions->entries[i]; + printf("%s -> [", func->name); + for (int j = 0; j < func->sym_cnt; j++) { + printf("{ %s %lx } ", func->syms[j].name, func->syms[j].addr); + } + printf("]\n"); + } +} + static int elf_functions__collect(struct elf_functions *functions) { uint32_t nr_symbols = elf_symtab__nr_symbols(functions->symtab); - struct elf_function *tmp; + struct elf_function *func, *tmp; Elf32_Word sym_sec_idx; + const char *sym_name; uint32_t core_id; + struct elf_function_sym func_sym; GElf_Sym sym; - int err = 0; + int err = 0, i, j; /* We know that number of functions is less than number of symbols, * so we can overallocate temporarily. @@ -2153,18 +2191,75 @@ static int elf_functions__collect(struct elf_functions *functions) goto out_free; } + /* First, collect an elf_function for each GElf_Sym + * Where func->name is without a suffix + */ functions->cnt = 0; elf_symtab__for_each_symbol_index(functions->symtab, core_id, sym, sym_sec_idx) { - elf_functions__collect_function(functions, &sym); + + if (elf_sym__type(&sym) != STT_FUNC) + continue; + + sym_name = elf_sym__name(&sym, functions->symtab); + if (!sym_name) + continue; + + func = &functions->entries[functions->cnt]; + + const char *suffix = strchr(sym_name, '.'); + if (suffix) { + functions->suffix_cnt++; + func->name = strndup(sym_name, suffix - sym_name); + } else { + func->name = strdup(sym_name); + } + if (!func->name) { + err = -ENOMEM; + goto out_free; + } + + func_sym.name = sym_name; + func_sym.addr = sym.st_value; + + err = elf_function__push_sym(func, &func_sym); + if (err) + goto out_free; + + functions->cnt++; } + /* At this point functions->entries is an unordered array of elf_function + * each having a name (without a suffix) and a single elf_function_sym (maybe with suffix) + * Now let's sort this table by name. + */ if (functions->cnt) { - qsort(functions->entries, functions->cnt, sizeof(*functions->entries), functions_cmp); + qsort(functions->entries, functions->cnt, sizeof(*functions->entries), elf_function__name_cmp); } else { err = 0; goto out_free; } + /* Finally dedup by name, transforming { name -> syms[1] } entries into { name -> syms[n] } */ + i = 0; + j = 1; + for (j = 1; j < functions->cnt; j++) { + struct elf_function *a = &functions->entries[i]; + struct elf_function *b = &functions->entries[j]; + + if (!strcmp(a->name, b->name)) { + elf_function__push_sym(a, &b->syms[0]); + elf_function__free_content(b); + } else { + i++; + if (i != j) + functions->entries[i] = functions->entries[j]; + } + } + + functions->cnt = i + 1; + + print_func_table(functions); + /* Reallocate to the exact size */ tmp = realloc(functions->entries, functions->cnt * sizeof(struct elf_function)); if (tmp) { @@ -2661,30 +2756,11 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co if (!name) continue; - /* prefer exact function name match... */ - func = btf_encoder__find_function(encoder, name, 0); - if (!func && funcs->suffix_cnt && - conf_load->btf_gen_optimized) { - /* falling back to name.isra.0 match if no exact - * match is found; only bother if we found any - * .suffix function names. The function - * will be saved and added once we ensure - * it does not have optimized-out parameters - * in any cu. - */ - func = btf_encoder__find_function(encoder, name, - strlen(name)); - if (func) { - if (encoder->verbose) - printf("matched function '%s' with '%s'%s\n", - name, func->name, - fn->proto.optimized_parms ? - ", has optimized-out parameters" : - fn->proto.unexpected_reg ? ", has unexpected register use by params" : - ""); - if (!func->alias) - func->alias = strdup(name); - } + func = btf_encoder__find_function(encoder, name); + if (!func) { + if (encoder->verbose) + printf("could not find function '%s' in the ELF functions table\n", name); + continue; } } else { if (!fn->external) -- 2.50.1 > > Third we need to decide how to deal with cases where the function does > not correspond to an mcount boundary. It'd be interesting to see if the > filtering helps here, but part of the problem is also that we don't > currently have a mechanism to help guide the match between function name > and function site that is done when the fentry attach is carried out. > Yonghong and I talked about it in [1]. > > Addresses seem like the natural means to help guide that, so a > DATASEC-like set of addresses would help this matching. I had a WIP > version of this but it wasn't working fully yet. I'll revive it and see > if I can get it out as an RFC. Needs to take into account the work being > done on inlines too [1]. > > In terms of the tracer's actual intent, multi-site functions are often > "static inline" functions in a .h file that don't actually get inlined; > the user intent would be often to trace all instances, but it seems to > me we need to provide a means to support both this or to trace a > specific instance. How the latter is best represented from the tracer > side I'm not sure; raw addresses would be one way I suppose. Absent an > explicit request from the tracer I'm not sure what heuristics make most > sense; currently we just pick the first instance I suspect, and might > need to continue to do so for backwards compatibility. > > >> I am not aware of long term plans for addressing this, though it looks >> like this was discussed before. I'd appreciate if you share any >> relevant threads. >> > > Yonghong and I discussed this a bit in [1], and the inline thread in [2] > has some more details. Thank you for sharing. I guess I was wrong when I said I'm not aware of the plans, because I am a little bit familiar with the work on BTF representation of inlined funcs. As far as I understand, partly because of the current limitations of BTF, the best pahole/btf_encoder can do with optimized functions right now is determine whether a function was not modified too much across instances (optimized_parms, unexpected_reg etc.), and if yes then exclude it from BTF. Anything smarter requires extending BTF. > > [1] > https://lpc.events/event/18/contributions/1945/attachments/1508/3179/Kernel%20func%20tracing%20in%20the%20face%20of%20compiler%20optimization.pdf > [2] > https://lore.kernel.org/bpf/20250416-btf_inline-v1-0-e4bd2f8adae5@meta.com/ > >> Thanks. >> >> [...] ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [RFC dwarves] btf_encoder: Remove duplicates from functions entries 2025-07-22 22:58 ` Ihor Solodrai @ 2025-07-23 11:22 ` Jiri Olsa 2025-07-24 17:54 ` Alan Maguire 0 siblings, 1 reply; 12+ messages in thread From: Jiri Olsa @ 2025-07-23 11:22 UTC (permalink / raw) To: Ihor Solodrai Cc: Alan Maguire, Jiri Olsa, Arnaldo Carvalho de Melo, Menglong Dong, dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Song Liu, Eduard Zingerman On Tue, Jul 22, 2025 at 10:58:52PM +0000, Ihor Solodrai wrote: SNIP > @@ -1338,48 +1381,39 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder, > return 0; > } > > -static int functions_cmp(const void *_a, const void *_b) > +static int elf_function__name_cmp(const void *_a, const void *_b) > { > const struct elf_function *a = _a; > const struct elf_function *b = _b; > > - /* if search key allows prefix match, verify target has matching > - * prefix len and prefix matches. > - */ > - if (a->prefixlen && a->prefixlen == b->prefixlen) > - return strncmp(a->name, b->name, b->prefixlen); nice to see this one removed ;-) > return strcmp(a->name, b->name); > } > > -#ifndef max > -#define max(x, y) ((x) < (y) ? (y) : (x)) > -#endif > - > static int saved_functions_cmp(const void *_a, const void *_b) > { > const struct btf_encoder_func_state *a = _a; > const struct btf_encoder_func_state *b = _b; > > - return functions_cmp(a->elf, b->elf); > + return elf_function__name_cmp(a->elf, b->elf); > } > > static int saved_functions_combine(struct btf_encoder_func_state *a, struct btf_encoder_func_state *b) > { > - uint8_t optimized, unexpected, inconsistent; > - int ret; > + uint8_t optimized, unexpected, inconsistent, ambiguous_addr; > + > + if (a->elf != b->elf) > + return 1; > > - ret = strncmp(a->elf->name, b->elf->name, > - max(a->elf->prefixlen, b->elf->prefixlen)); > - if (ret != 0) > - return ret; > optimized = a->optimized_parms | b->optimized_parms; > unexpected = a->unexpected_reg | b->unexpected_reg; > inconsistent = a->inconsistent_proto | b->inconsistent_proto; > - if (!unexpected && !inconsistent && !funcs__match(a, b)) > + ambiguous_addr = a->ambiguous_addr | b->ambiguous_addr; > + if (!unexpected && !inconsistent && !ambiguous_addr && !funcs__match(a, b)) > inconsistent = 1; > a->optimized_parms = b->optimized_parms = optimized; > a->unexpected_reg = b->unexpected_reg = unexpected; > a->inconsistent_proto = b->inconsistent_proto = inconsistent; > + a->ambiguous_addr = b->ambiguous_addr = ambiguous_addr; I had to add change below to get the functions with multiple addresses out diff --git a/btf_encoder.c b/btf_encoder.c index fcc30aa9d97f..7b9679794790 100644 --- a/btf_encoder.c +++ b/btf_encoder.c @@ -1466,7 +1466,7 @@ static int btf_encoder__add_saved_funcs(struct btf_encoder *encoder, bool skip_e * just do not _use_ them. Only exclude functions with * unexpected register use or multiple inconsistent prototypes. */ - add_to_btf |= !state->unexpected_reg && !state->inconsistent_proto; + add_to_btf |= !state->unexpected_reg && !state->inconsistent_proto && !state->ambiguous_addr; if (add_to_btf) { err = btf_encoder__add_func(state->encoder, state); other than that I like the approach SNIP > @@ -2153,18 +2191,75 @@ static int elf_functions__collect(struct elf_functions *functions) > goto out_free; > } > > + /* First, collect an elf_function for each GElf_Sym > + * Where func->name is without a suffix > + */ > functions->cnt = 0; > elf_symtab__for_each_symbol_index(functions->symtab, core_id, sym, sym_sec_idx) { > - elf_functions__collect_function(functions, &sym); > + > + if (elf_sym__type(&sym) != STT_FUNC) > + continue; > + > + sym_name = elf_sym__name(&sym, functions->symtab); > + if (!sym_name) > + continue; > + > + func = &functions->entries[functions->cnt]; > + > + const char *suffix = strchr(sym_name, '.'); > + if (suffix) { > + functions->suffix_cnt++; do we need suffix_cnt now? thanks, jirka > + func->name = strndup(sym_name, suffix - sym_name); > + } else { > + func->name = strdup(sym_name); > + } > + if (!func->name) { > + err = -ENOMEM; > + goto out_free; > + } > + > + func_sym.name = sym_name; > + func_sym.addr = sym.st_value; > + > + err = elf_function__push_sym(func, &func_sym); > + if (err) > + goto out_free; > + > + functions->cnt++; > } SNIP ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [RFC dwarves] btf_encoder: Remove duplicates from functions entries 2025-07-23 11:22 ` Jiri Olsa @ 2025-07-24 17:54 ` Alan Maguire 2025-07-24 21:26 ` Ihor Solodrai 0 siblings, 1 reply; 12+ messages in thread From: Alan Maguire @ 2025-07-24 17:54 UTC (permalink / raw) To: Jiri Olsa, Ihor Solodrai Cc: Arnaldo Carvalho de Melo, Menglong Dong, dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Song Liu, Eduard Zingerman On 23/07/2025 12:22, Jiri Olsa wrote: > On Tue, Jul 22, 2025 at 10:58:52PM +0000, Ihor Solodrai wrote: > > SNIP > >> @@ -1338,48 +1381,39 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder, >> return 0; >> } >> >> -static int functions_cmp(const void *_a, const void *_b) >> +static int elf_function__name_cmp(const void *_a, const void *_b) >> { >> const struct elf_function *a = _a; >> const struct elf_function *b = _b; >> >> - /* if search key allows prefix match, verify target has matching >> - * prefix len and prefix matches. >> - */ >> - if (a->prefixlen && a->prefixlen == b->prefixlen) >> - return strncmp(a->name, b->name, b->prefixlen); > > nice to see this one removed ;-) > >> return strcmp(a->name, b->name); >> } >> >> -#ifndef max >> -#define max(x, y) ((x) < (y) ? (y) : (x)) >> -#endif >> - >> static int saved_functions_cmp(const void *_a, const void *_b) >> { >> const struct btf_encoder_func_state *a = _a; >> const struct btf_encoder_func_state *b = _b; >> >> - return functions_cmp(a->elf, b->elf); >> + return elf_function__name_cmp(a->elf, b->elf); >> } >> >> static int saved_functions_combine(struct btf_encoder_func_state *a, struct btf_encoder_func_state *b) >> { >> - uint8_t optimized, unexpected, inconsistent; >> - int ret; >> + uint8_t optimized, unexpected, inconsistent, ambiguous_addr; >> + >> + if (a->elf != b->elf) >> + return 1; >> >> - ret = strncmp(a->elf->name, b->elf->name, >> - max(a->elf->prefixlen, b->elf->prefixlen)); >> - if (ret != 0) >> - return ret; >> optimized = a->optimized_parms | b->optimized_parms; >> unexpected = a->unexpected_reg | b->unexpected_reg; >> inconsistent = a->inconsistent_proto | b->inconsistent_proto; >> - if (!unexpected && !inconsistent && !funcs__match(a, b)) >> + ambiguous_addr = a->ambiguous_addr | b->ambiguous_addr; >> + if (!unexpected && !inconsistent && !ambiguous_addr && !funcs__match(a, b)) >> inconsistent = 1; >> a->optimized_parms = b->optimized_parms = optimized; >> a->unexpected_reg = b->unexpected_reg = unexpected; >> a->inconsistent_proto = b->inconsistent_proto = inconsistent; >> + a->ambiguous_addr = b->ambiguous_addr = ambiguous_addr; > > > I had to add change below to get the functions with multiple addresses out > > diff --git a/btf_encoder.c b/btf_encoder.c > index fcc30aa9d97f..7b9679794790 100644 > --- a/btf_encoder.c > +++ b/btf_encoder.c > @@ -1466,7 +1466,7 @@ static int btf_encoder__add_saved_funcs(struct btf_encoder *encoder, bool skip_e > * just do not _use_ them. Only exclude functions with > * unexpected register use or multiple inconsistent prototypes. > */ > - add_to_btf |= !state->unexpected_reg && !state->inconsistent_proto; > + add_to_btf |= !state->unexpected_reg && !state->inconsistent_proto && !state->ambiguous_addr; > > if (add_to_btf) { > err = btf_encoder__add_func(state->encoder, state); > > > other than that I like the approach > Thanks for the patch! I ran it through CI [1] with the above change plus an added whitespace after the function name in the printf() in btf_encoder__log_func_skip(). The btf_functions.sh test expects whitespace after function names when examining skipped functions, so either the test should be updated to handle no whitespace or we should ensure the space is there after the function name like this: printf("%s : skipping BTF encoding of function due to ", func->name); Otherwise we get a CI failure that is nothing to do with the changes. With this in place we do however lose a lot of functions it seems, some I suspect unnecessarily. For example: Looking at < void __tcp_send_ack(struct sock * sk, u32 rcv_nxt, u16 flags); ffffffff83c83170 t __tcp_send_ack.part.0 ffffffff83c83310 T __tcp_send_ack So __tcp_send_ack is partially inlined, but partial inlines should not count as ambiguous addresses I think. We should probably ensure we skip .part suffixes as well as .cold in calculating ambiguous addresses. I modified the patch somewhat and we wind up losing ~400 functions instead of over 700, see [2]. Modified patch is at [3]. If the mods look okay to you Ihor would you mind sending it officially? Would be great to get wider testing to ensure it doesn't break anything or leave any functions out unexpectedly. > SNIP > >> @@ -2153,18 +2191,75 @@ static int elf_functions__collect(struct elf_functions *functions) >> goto out_free; >> } >> >> + /* First, collect an elf_function for each GElf_Sym >> + * Where func->name is without a suffix >> + */ >> functions->cnt = 0; >> elf_symtab__for_each_symbol_index(functions->symtab, core_id, sym, sym_sec_idx) { >> - elf_functions__collect_function(functions, &sym); >> + >> + if (elf_sym__type(&sym) != STT_FUNC) >> + continue; >> + >> + sym_name = elf_sym__name(&sym, functions->symtab); >> + if (!sym_name) >> + continue; >> + >> + func = &functions->entries[functions->cnt]; >> + >> + const char *suffix = strchr(sym_name, '.'); >> + if (suffix) { >> + functions->suffix_cnt++; > > do we need suffix_cnt now? > think it's been unused for a while now, so can be removed I think. Thanks again for working on this! Alan [1] https://github.com/alan-maguire/dwarves/actions/runs/16500065295 [2] https://github.com/alan-maguire/dwarves/actions/runs/16501897430/job/46662503155 [3] https://github.com/acmel/dwarves/commit/30dffd7fc34e7753b3d21b4b3f1a5e17814c224f > thanks, > jirka > > >> + func->name = strndup(sym_name, suffix - sym_name); >> + } else { >> + func->name = strdup(sym_name); >> + } >> + if (!func->name) { >> + err = -ENOMEM; >> + goto out_free; >> + } >> + >> + func_sym.name = sym_name; >> + func_sym.addr = sym.st_value; >> + >> + err = elf_function__push_sym(func, &func_sym); >> + if (err) >> + goto out_free; >> + >> + functions->cnt++; >> } > > SNIP ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC dwarves] btf_encoder: Remove duplicates from functions entries 2025-07-24 17:54 ` Alan Maguire @ 2025-07-24 21:26 ` Ihor Solodrai 0 siblings, 0 replies; 12+ messages in thread From: Ihor Solodrai @ 2025-07-24 21:26 UTC (permalink / raw) To: Alan Maguire, Jiri Olsa Cc: Arnaldo Carvalho de Melo, Menglong Dong, dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Song Liu, Eduard Zingerman On 7/24/25 10:54 AM, Alan Maguire wrote: > On 23/07/2025 12:22, Jiri Olsa wrote: >> On Tue, Jul 22, 2025 at 10:58:52PM +0000, Ihor Solodrai wrote: >> >> SNIP >> >>> @@ -1338,48 +1381,39 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder, >>> return 0; >>> } >>> >>> -static int functions_cmp(const void *_a, const void *_b) >>> +static int elf_function__name_cmp(const void *_a, const void *_b) >>> { >>> const struct elf_function *a = _a; >>> const struct elf_function *b = _b; >>> >>> - /* if search key allows prefix match, verify target has matching >>> - * prefix len and prefix matches. >>> - */ >>> - if (a->prefixlen && a->prefixlen == b->prefixlen) >>> - return strncmp(a->name, b->name, b->prefixlen); >> >> nice to see this one removed ;-) >> >>> return strcmp(a->name, b->name); >>> } >>> >>> -#ifndef max >>> -#define max(x, y) ((x) < (y) ? (y) : (x)) >>> -#endif >>> - >>> static int saved_functions_cmp(const void *_a, const void *_b) >>> { >>> const struct btf_encoder_func_state *a = _a; >>> const struct btf_encoder_func_state *b = _b; >>> >>> - return functions_cmp(a->elf, b->elf); >>> + return elf_function__name_cmp(a->elf, b->elf); >>> } >>> >>> static int saved_functions_combine(struct btf_encoder_func_state *a, struct btf_encoder_func_state *b) >>> { >>> - uint8_t optimized, unexpected, inconsistent; >>> - int ret; >>> + uint8_t optimized, unexpected, inconsistent, ambiguous_addr; >>> + >>> + if (a->elf != b->elf) >>> + return 1; >>> >>> - ret = strncmp(a->elf->name, b->elf->name, >>> - max(a->elf->prefixlen, b->elf->prefixlen)); >>> - if (ret != 0) >>> - return ret; >>> optimized = a->optimized_parms | b->optimized_parms; >>> unexpected = a->unexpected_reg | b->unexpected_reg; >>> inconsistent = a->inconsistent_proto | b->inconsistent_proto; >>> - if (!unexpected && !inconsistent && !funcs__match(a, b)) >>> + ambiguous_addr = a->ambiguous_addr | b->ambiguous_addr; >>> + if (!unexpected && !inconsistent && !ambiguous_addr && !funcs__match(a, b)) >>> inconsistent = 1; >>> a->optimized_parms = b->optimized_parms = optimized; >>> a->unexpected_reg = b->unexpected_reg = unexpected; >>> a->inconsistent_proto = b->inconsistent_proto = inconsistent; >>> + a->ambiguous_addr = b->ambiguous_addr = ambiguous_addr; >> >> >> I had to add change below to get the functions with multiple addresses out >> >> diff --git a/btf_encoder.c b/btf_encoder.c >> index fcc30aa9d97f..7b9679794790 100644 >> --- a/btf_encoder.c >> +++ b/btf_encoder.c >> @@ -1466,7 +1466,7 @@ static int btf_encoder__add_saved_funcs(struct btf_encoder *encoder, bool skip_e >> * just do not _use_ them. Only exclude functions with >> * unexpected register use or multiple inconsistent prototypes. >> */ >> - add_to_btf |= !state->unexpected_reg && !state->inconsistent_proto; >> + add_to_btf |= !state->unexpected_reg && !state->inconsistent_proto && !state->ambiguous_addr; >> >> if (add_to_btf) { >> err = btf_encoder__add_func(state->encoder, state); >> >> >> other than that I like the approach >> > > Thanks for the patch! I ran it through CI [1] with the above change plus > an added whitespace after the function name in the printf() in > btf_encoder__log_func_skip(). The btf_functions.sh test expects > whitespace after function names when examining skipped functions, so > either the test should be updated to handle no whitespace or we should > ensure the space is there after the function name like this: > > printf("%s : skipping BTF encoding of function due to ", > func->name); > > Otherwise we get a CI failure that is nothing to do with the changes. > > With this in place we do however lose a lot of functions it seems, some > I suspect unnecessarily. For example: > > > Looking at > > < void __tcp_send_ack(struct sock * sk, u32 rcv_nxt, u16 flags); > > ffffffff83c83170 t __tcp_send_ack.part.0 > ffffffff83c83310 T __tcp_send_ack > > So __tcp_send_ack is partially inlined, but partial inlines should not > count as ambiguous addresses I think. We should probably ensure we skip > .part suffixes as well as .cold in calculating ambiguous addresses. > > I modified the patch somewhat and we wind up losing ~400 functions > instead of over 700, see [2]. > > Modified patch is at [3]. If the mods look okay to you Ihor would you > mind sending it officially? Would be great to get wider testing to > ensure it doesn't break anything or leave any functions out unexpectedly. Alan, Jiri, thank you for review and testing. I sent this draft in a bit of a rush, sorry. I'll incorporate your suggestions, test the patch a bit more and then will send a clean version. I am curious what functions are lost and why, will report if notice anything interesting. > >> SNIP >> >>> @@ -2153,18 +2191,75 @@ static int elf_functions__collect(struct elf_functions *functions) >>> goto out_free; >>> } >>> >>> + /* First, collect an elf_function for each GElf_Sym >>> + * Where func->name is without a suffix >>> + */ >>> functions->cnt = 0; >>> elf_symtab__for_each_symbol_index(functions->symtab, core_id, sym, sym_sec_idx) { >>> - elf_functions__collect_function(functions, &sym); >>> + >>> + if (elf_sym__type(&sym) != STT_FUNC) >>> + continue; >>> + >>> + sym_name = elf_sym__name(&sym, functions->symtab); >>> + if (!sym_name) >>> + continue; >>> + >>> + func = &functions->entries[functions->cnt]; >>> + >>> + const char *suffix = strchr(sym_name, '.'); >>> + if (suffix) { >>> + functions->suffix_cnt++; >> >> do we need suffix_cnt now? >> > > think it's been unused for a while now, so can be removed I think. > > Thanks again for working on this! > > Alan > > [1] https://github.com/alan-maguire/dwarves/actions/runs/16500065295 > [2] > https://github.com/alan-maguire/dwarves/actions/runs/16501897430/job/46662503155 > [3] > https://github.com/acmel/dwarves/commit/30dffd7fc34e7753b3d21b4b3f1a5e17814c224f > >> thanks, >> jirka >> >> >>> + func->name = strndup(sym_name, suffix - sym_name); >>> + } else { >>> + func->name = strdup(sym_name); >>> + } >>> + if (!func->name) { >>> + err = -ENOMEM; >>> + goto out_free; >>> + } >>> + >>> + func_sym.name = sym_name; >>> + func_sym.addr = sym.st_value; >>> + >>> + err = elf_function__push_sym(func, &func_sym); >>> + if (err) >>> + goto out_free; >>> + >>> + functions->cnt++; >>> } >> >> SNIP > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC dwarves] btf_encoder: Remove duplicates from functions entries 2025-07-21 23:27 ` Ihor Solodrai 2025-07-22 10:45 ` Alan Maguire @ 2025-07-22 10:54 ` Jiri Olsa 2025-07-22 16:07 ` Ihor Solodrai 1 sibling, 1 reply; 12+ messages in thread From: Jiri Olsa @ 2025-07-22 10:54 UTC (permalink / raw) To: Ihor Solodrai Cc: Jiri Olsa, Alan Maguire, Arnaldo Carvalho de Melo, Menglong Dong, dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Song Liu, Eduard Zingerman On Mon, Jul 21, 2025 at 11:27:31PM +0000, Ihor Solodrai wrote: > On 7/21/25 7:27 AM, Jiri Olsa wrote: > > On Mon, Jul 21, 2025 at 12:41:00PM +0100, Alan Maguire wrote: > >> On 17/07/2025 16:25, Jiri Olsa wrote: > >>> Menglong reported issue where we can have function in BTF which has > >>> multiple addresses in kallsysm [1]. > >>> > >>> Rather than filtering this in runtime, let's teach pahole to remove > >>> such functions. > >>> > >>> Removing duplicate records from functions entries that have more > >>> at least one different address. This way btf_encoder__find_function > >>> won't find such functions and they won't be added in BTF. > >>> > >>> In my setup it removed 428 functions out of 77141. > >>> > >> > >> Is such removal necessary? If the presence of an mcount annotation is > >> the requirement, couldn't we just utilize > >> > >> /sys/kernel/tracing/available_filter_functions_addrs > >> > >> to map name to address safely? It identifies mcount-containing functions > >> and some of these appear to be duplicates, for example there I see > >> > >> ffffffff8376e8b4 acpi_attr_is_visible > >> ffffffff8379b7d4 acpi_attr_is_visible > > > > for that we'd need new interface for loading fentry/fexit.. programs, right? > > > > the current interface to get fentry/fexit.. attach address is: > > - user specifies function name, that translates to btf_id > > - in kernel that btf id translates back to function name > > - kernel uses kallsyms_lookup_name or find_kallsyms_symbol_value > > to get the address > > > > so we don't really know which address user wanted in the first place > > Hi Jiri, Alan. > > I stumbled on a bug today which seems to be relevant to this > discussion. > > I tried building the kernel with KASAN and UBSAN, and that resulted in > some kfuncs disappearing from vmlinux.h, triggering selftests/bpf > compilation errors, for example: > > CLNG-BPF [test_progs-no_alu32] cgroup_read_xattr.bpf.o > progs/cgroup_read_xattr.c:127:13: error: call to undeclared function 'bpf_cgroup_ancestor'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] > 127 | ancestor = bpf_cgroup_ancestor(cgrp, 1); > | ^ hi, I tried that and tests build for me with KASAN and UBSAN, both with gcc (15.1.1) and clang (22.0.0) could you share your config? > > Here is a piece of vmlinux.h diff between CONFIG_UBSAN=y/n: > > --- ./tools/testing/selftests/bpf/tools/include/vmlinux.h 2025-07-21 17:35:14.415733105 +0000 > +++ ubsan_vmlinux.h 2025-07-21 17:33:10.455312623 +0000 > @@ -117203,7 +117292,6 @@ > extern int bpf_arena_reserve_pages(void *p__map, void __attribute__((address_space(1))) *ptr__ign, u32 page_cnt) __weak __ksym; > extern __bpf_fastcall void *bpf_cast_to_kern_ctx(void *obj) __weak __ksym; > extern struct cgroup *bpf_cgroup_acquire(struct cgroup *cgrp) __weak __ksym; > -extern struct cgroup *bpf_cgroup_ancestor(struct cgroup *cgrp, int level) __weak __ksym; > extern struct cgroup *bpf_cgroup_from_id(u64 cgid) __weak __ksym; > extern int bpf_cgroup_read_xattr(struct cgroup *cgroup, const char *name__str, struct bpf_dynptr *value_p) __weak __ksym; > extern void bpf_cgroup_release(struct cgroup *cgrp) __weak __ksym; > @@ -117260,7 +117348,6 @@ > extern bool bpf_dynptr_is_rdonly(const struct bpf_dynptr *p) __weak __ksym; > extern int bpf_dynptr_memset(struct bpf_dynptr *p, u32 offset, u32 size, u8 val) __weak __ksym; > extern __u32 bpf_dynptr_size(const struct bpf_dynptr *p) __weak __ksym; > -extern void *bpf_dynptr_slice(const struct bpf_dynptr *p, u32 offset, void *buffer__opt, u32 buffer__szk) __weak __ksym; > extern void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr *p, u32 offset, void *buffer__opt, u32 buffer__szk) __weak __ksym; > extern int bpf_fentry_test1(int a) __weak __ksym; > extern int bpf_get_dentry_xattr(struct dentry *dentry, const char *name__str, struct bpf_dynptr *value_p) __weak __ksym; > @@ -117287,7 +117374,6 @@ > extern int bpf_iter_num_new(struct bpf_iter_num *it, int start, int end) __weak __ksym; > extern int *bpf_iter_num_next(struct bpf_iter_num *it) __weak __ksym; > extern void bpf_iter_scx_dsq_destroy(struct bpf_iter_scx_dsq *it) __weak __ksym; > -extern int bpf_iter_scx_dsq_new(struct bpf_iter_scx_dsq *it, u64 dsq_id, u64 flags) __weak __ksym; > extern struct task_struct *bpf_iter_scx_dsq_next(struct bpf_iter_scx_dsq *it) __weak __ksym; > extern void bpf_iter_task_destroy(struct bpf_iter_task *it) __weak __ksym; > extern int bpf_iter_task_new(struct bpf_iter_task *it, struct task_struct *task__nullable, unsigned int flags) __weak __ksym; > @@ -117373,11 +117459,8 @@ > extern int bpf_strspn(const char *s__ign, const char *accept__ign) __weak __ksym; > extern int bpf_strstr(const char *s1__ign, const char *s2__ign) __weak __ksym; > extern struct task_struct *bpf_task_acquire(struct task_struct *p) __weak __ksym; > -extern struct task_struct *bpf_task_from_pid(s32 pid) __weak __ksym; > -extern struct task_struct *bpf_task_from_vpid(s32 vpid) __weak __ksym; > extern struct cgroup *bpf_task_get_cgroup1(struct task_struct *task, int hierarchy_id) __weak __ksym; > extern void bpf_task_release(struct task_struct *p) __weak __ksym; > -extern long int bpf_task_under_cgroup(struct task_struct *task, struct cgroup *ancestor) __weak __ksym; > extern void bpf_throw(u64 cookie) __weak __ksym; > extern int bpf_verify_pkcs7_signature(struct bpf_dynptr *data_p, struct bpf_dynptr *sig_p, struct bpf_key *trusted_keyring) __weak __ksym; > extern int bpf_wq_init(struct bpf_wq *wq, void *p__map, unsigned int flags) __weak __ksym; > @@ -117412,15 +117495,10 @@ > extern u32 scx_bpf_cpuperf_cur(s32 cpu) __weak __ksym; > extern void scx_bpf_cpuperf_set(s32 cpu, u32 perf) __weak __ksym; > extern s32 scx_bpf_create_dsq(u64 dsq_id, s32 node) __weak __ksym; > -extern void scx_bpf_destroy_dsq(u64 dsq_id) __weak __ksym; > -extern void scx_bpf_dispatch(struct task_struct *p, u64 dsq_id, u64 slice, u64 enq_flags) __weak __ksym; > extern void scx_bpf_dispatch_cancel(void) __weak __ksym; > extern bool scx_bpf_dispatch_from_dsq(struct bpf_iter_scx_dsq *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak __ksym; > -extern void scx_bpf_dispatch_from_dsq_set_slice(struct bpf_iter_scx_dsq *it__iter, u64 slice) __weak __ksym; > -extern void scx_bpf_dispatch_from_dsq_set_vtime(struct bpf_iter_scx_dsq *it__iter, u64 vtime) __weak __ksym; > extern u32 scx_bpf_dispatch_nr_slots(void) __weak __ksym; > extern void scx_bpf_dispatch_vtime(struct task_struct *p, u64 dsq_id, u64 slice, u64 vtime, u64 enq_flags) __weak __ksym; > -extern bool scx_bpf_dispatch_vtime_from_dsq(struct bpf_iter_scx_dsq *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak __ksym; > extern void scx_bpf_dsq_insert(struct task_struct *p, u64 dsq_id, u64 slice, u64 enq_flags) __weak __ksym; > extern void scx_bpf_dsq_insert_vtime(struct task_struct *p, u64 dsq_id, u64 slice, u64 vtime, u64 enq_flags) __weak __ksym; > extern bool scx_bpf_dsq_move(struct bpf_iter_scx_dsq *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak __ksym; > @@ -117428,10 +117506,8 @@ > extern void scx_bpf_dsq_move_set_vtime(struct bpf_iter_scx_dsq *it__iter, u64 vtime) __weak __ksym; > extern bool scx_bpf_dsq_move_to_local(u64 dsq_id) __weak __ksym; > extern bool scx_bpf_dsq_move_vtime(struct bpf_iter_scx_dsq *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak __ksym; > -extern s32 scx_bpf_dsq_nr_queued(u64 dsq_id) __weak __ksym; > extern void scx_bpf_dump_bstr(char *fmt, long long unsigned int *data, u32 data__sz) __weak __ksym; > extern void scx_bpf_error_bstr(char *fmt, long long unsigned int *data, u32 data__sz) __weak __ksym; > -extern void scx_bpf_events(struct scx_event_stats *events, size_t events__sz) __weak __ksym; > extern void scx_bpf_exit_bstr(s64 exit_code, char *fmt, long long unsigned int *data, u32 data__sz) __weak __ksym; > extern const struct cpumask *scx_bpf_get_idle_cpumask(void) __weak __ksym; > extern const struct cpumask *scx_bpf_get_idle_cpumask_node(int node) __weak __ksym; > > Then I checked the difference between BTFs, and found that there is no > DECL_TAG 'bpf_kfunc' produced for the affected functions: > > $ diff -u vmlinux.btf.out vmlinux_ubsan.btf.out | grep -C5 cgroup_ancestor > +[52749] FUNC 'bpf_cgroup_acquire' type_id=52748 linkage=static > +[52750] DECL_TAG 'bpf_kfunc' type_id=52749 component_idx=-1 > +[52751] FUNC_PROTO '(anon)' ret_type_id=426 vlen=2 > 'cgrp' type_id=426 > 'level' type_id=21 > -[52681] FUNC 'bpf_cgroup_ancestor' type_id=52680 linkage=static > -[52682] DECL_TAG 'bpf_kfunc' type_id=52681 component_idx=-1 > -[52683] FUNC_PROTO '(anon)' ret_type_id=3987 vlen=2 > +[52752] FUNC 'bpf_cgroup_ancestor' type_id=52751 linkage=static > +[52753] FUNC_PROTO '(anon)' ret_type_id=3987 vlen=2 > 'attach_type' type_id=1767 > 'attach_btf_id' type_id=34 > -[52684] FUNC 'bpf_cgroup_atype_find' type_id=52683 linkage=static > -[52685] FUNC_PROTO '(anon)' ret_type_id=0 vlen=2 > > Which is clearly wrong and suggests a bug. > > After some debugging, I found that the problem is in > btf_encoder__find_function(), and more specifically in > the comparator used for bsearch and qsort. > > static int functions_cmp(const void *_a, const void *_b) > { > const struct elf_function *a = _a; > const struct elf_function *b = _b; > > /* if search key allows prefix match, verify target has matching > * prefix len and prefix matches. > */ > if (a->prefixlen && a->prefixlen == b->prefixlen) > return strncmp(a->name, b->name, b->prefixlen); > return strcmp(a->name, b->name); > } > > For this particular vmlinux that I compiled, > btf_encoder__find_function("bpf_cgroup_ancestor", prefixlen=0) returns > NULL, even though there is an elf_function struct for > bpf_cgroup_ancestor in the table. > > The reason for it is that bsearch() happens to hit > "bpf_cgroup_ancestor.cold" in the table first. > strcmp("bpf_cgroup_ancestor", "bpf_cgroup_ancestor.cold)") gives a > negative value, but bpf_cgroup_ancestor entry is stored in the other > direction in the table. > > This is surprising at the first glance, because we use the same > functions_cmp both for search and for qsort. > > But as it turns we are actually using two different comparators within > functions_cmp(): we set key.prefixlen=0 for exact match and when it's > non-zero we search for prefix match. When sorting the table, there are > no entries with prefixlen=0, so the order of elements is not exactly > right for the bsearch(). > > That's a nasty bug, but as far as I understand, all this complexity is > unnecessary in case of '.cold' suffix, because they simply are not > supposed to be in the elf_functions table: it's usually just a piece > of a target function. > > There are a couple of ways this particular bug could be fixed > (filtering out .cold symbols, for example). But I think this bug and > the problem Jiri is trying to solve stems from the fact that one > function name, which is an identifier the users care about the most, > may be associated with many ELF symbols and/or addresses. > > What is clear to me in the context of pahole's BTF encoding is that we > want elf_functions table to only have a single entry per name (where > name is an actual name that might be referred to by users, not an ELF > sym name), and have a deterministic mechanism for selecting one (or > none) func from many at the time of processing ELF data. > > The current implementation is just buggy in this regard. > > I am not aware of long term plans for addressing this, though it looks > like this was discussed before. I'd appreciate if you share any > relevant threads. I did some ill attempt to have function addresses in BTF but that never took place.. I thought perhaps Alan took over that, but can't find any evidence ;-) jirka ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC dwarves] btf_encoder: Remove duplicates from functions entries 2025-07-22 10:54 ` Jiri Olsa @ 2025-07-22 16:07 ` Ihor Solodrai 0 siblings, 0 replies; 12+ messages in thread From: Ihor Solodrai @ 2025-07-22 16:07 UTC (permalink / raw) To: Jiri Olsa Cc: Alan Maguire, Arnaldo Carvalho de Melo, Menglong Dong, dwarves, bpf, Alexei Starovoitov, Andrii Nakryiko, Yonghong Song, Song Liu, Eduard Zingerman On 7/22/25 3:54 AM, Jiri Olsa wrote: > On Mon, Jul 21, 2025 at 11:27:31PM +0000, Ihor Solodrai wrote: >> On 7/21/25 7:27 AM, Jiri Olsa wrote: >>> On Mon, Jul 21, 2025 at 12:41:00PM +0100, Alan Maguire wrote: >>>> On 17/07/2025 16:25, Jiri Olsa wrote: >>>>> Menglong reported issue where we can have function in BTF which has >>>>> multiple addresses in kallsysm [1]. >>>>> >>>>> Rather than filtering this in runtime, let's teach pahole to remove >>>>> such functions. >>>>> >>>>> Removing duplicate records from functions entries that have more >>>>> at least one different address. This way btf_encoder__find_function >>>>> won't find such functions and they won't be added in BTF. >>>>> >>>>> In my setup it removed 428 functions out of 77141. >>>>> >>>> >>>> Is such removal necessary? If the presence of an mcount annotation is >>>> the requirement, couldn't we just utilize >>>> >>>> /sys/kernel/tracing/available_filter_functions_addrs >>>> >>>> to map name to address safely? It identifies mcount-containing functions >>>> and some of these appear to be duplicates, for example there I see >>>> >>>> ffffffff8376e8b4 acpi_attr_is_visible >>>> ffffffff8379b7d4 acpi_attr_is_visible >>> >>> for that we'd need new interface for loading fentry/fexit.. programs, right? >>> >>> the current interface to get fentry/fexit.. attach address is: >>> - user specifies function name, that translates to btf_id >>> - in kernel that btf id translates back to function name >>> - kernel uses kallsyms_lookup_name or find_kallsyms_symbol_value >>> to get the address >>> >>> so we don't really know which address user wanted in the first place >> >> Hi Jiri, Alan. >> >> I stumbled on a bug today which seems to be relevant to this >> discussion. >> >> I tried building the kernel with KASAN and UBSAN, and that resulted in >> some kfuncs disappearing from vmlinux.h, triggering selftests/bpf >> compilation errors, for example: >> >> CLNG-BPF [test_progs-no_alu32] cgroup_read_xattr.bpf.o >> progs/cgroup_read_xattr.c:127:13: error: call to undeclared function 'bpf_cgroup_ancestor'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] >> 127 | ancestor = bpf_cgroup_ancestor(cgrp, 1); >> | ^ > > hi, > I tried that and tests build for me with KASAN and UBSAN, > both with gcc (15.1.1) and clang (22.0.0) > > could you share your config? Sure. This is a slightly modified BPF CI config, pasting: CONFIG_BLK_DEV_LOOP=y CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y CONFIG_BPF=y CONFIG_BPF_EVENTS=y CONFIG_BPF_JIT=y CONFIG_BPF_KPROBE_OVERRIDE=y CONFIG_BPF_LIRC_MODE2=y CONFIG_BPF_LSM=y CONFIG_BPF_STREAM_PARSER=y CONFIG_BPF_SYSCALL=y # CONFIG_BPF_UNPRIV_DEFAULT_OFF is not set CONFIG_CGROUP_BPF=y CONFIG_CRYPTO_HMAC=y CONFIG_CRYPTO_SHA256=y CONFIG_CRYPTO_USER_API=y CONFIG_CRYPTO_USER_API_HASH=y CONFIG_CRYPTO_USER_API_SKCIPHER=y CONFIG_CRYPTO_SKCIPHER=y CONFIG_CRYPTO_ECB=y CONFIG_CRYPTO_AES=y CONFIG_DEBUG_INFO=y CONFIG_DEBUG_INFO_BTF=y CONFIG_DEBUG_INFO_DWARF4=y CONFIG_DMABUF_HEAPS=y CONFIG_DMABUF_HEAPS_SYSTEM=y CONFIG_DUMMY=y CONFIG_DYNAMIC_FTRACE=y CONFIG_FPROBE=y CONFIG_FTRACE_SYSCALLS=y CONFIG_FUNCTION_ERROR_INJECTION=y CONFIG_FUNCTION_TRACER=y CONFIG_FS_VERITY=y CONFIG_GENEVE=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_IMA=y CONFIG_IMA_READ_POLICY=y CONFIG_IMA_WRITE_POLICY=y CONFIG_INET_ESP=y CONFIG_IP_NF_FILTER=y CONFIG_IP_NF_RAW=y CONFIG_IP_NF_TARGET_SYNPROXY=y CONFIG_IPV6=y CONFIG_IPV6_FOU=y CONFIG_IPV6_FOU_TUNNEL=y CONFIG_IPV6_GRE=y CONFIG_IPV6_SEG6_BPF=y CONFIG_IPV6_SIT=y CONFIG_IPV6_TUNNEL=y CONFIG_KEYS=y CONFIG_LIRC=y CONFIG_LWTUNNEL=y CONFIG_MODULE_SIG=y CONFIG_MODULE_SRCVERSION_ALL=y CONFIG_MODULE_UNLOAD=y CONFIG_MODULES=y CONFIG_MODVERSIONS=y CONFIG_MPLS=y CONFIG_MPLS_IPTUNNEL=y CONFIG_MPLS_ROUTING=y CONFIG_MPTCP=y CONFIG_NET_ACT_GACT=y CONFIG_NET_ACT_SKBMOD=y CONFIG_NET_CLS=y CONFIG_NET_CLS_ACT=y CONFIG_NET_CLS_BPF=y CONFIG_NET_CLS_FLOWER=y CONFIG_NET_CLS_MATCHALL=y CONFIG_NET_FOU=y CONFIG_NET_FOU_IP_TUNNELS=y CONFIG_NET_IPGRE=y CONFIG_NET_IPGRE_DEMUX=y CONFIG_NET_IPIP=y CONFIG_NET_MPLS_GSO=y CONFIG_NET_SCH_BPF=y CONFIG_NET_SCH_FQ=y CONFIG_NET_SCH_INGRESS=y CONFIG_NET_SCH_HTB=y CONFIG_NET_SCHED=y CONFIG_NETDEVSIM=y CONFIG_NETFILTER=y CONFIG_NETFILTER_ADVANCED=y CONFIG_NETFILTER_SYNPROXY=y CONFIG_NETFILTER_XT_CONNMARK=y CONFIG_NETFILTER_XT_MATCH_STATE=y CONFIG_NETFILTER_XT_TARGET_CT=y CONFIG_NETKIT=y CONFIG_NF_CONNTRACK=y CONFIG_NF_CONNTRACK_MARK=y CONFIG_NF_CONNTRACK_ZONES=y CONFIG_NF_DEFRAG_IPV4=y CONFIG_NF_DEFRAG_IPV6=y CONFIG_NF_TABLES=y CONFIG_NF_TABLES_INET=y CONFIG_NF_TABLES_NETDEV=y CONFIG_NF_TABLES_IPV4=y CONFIG_NF_TABLES_IPV6=y CONFIG_NETFILTER_INGRESS=y CONFIG_NF_FLOW_TABLE=y CONFIG_NF_FLOW_TABLE_INET=y CONFIG_NETFILTER_NETLINK=y CONFIG_NFT_FLOW_OFFLOAD=y CONFIG_IP_NF_IPTABLES=y CONFIG_IP6_NF_IPTABLES=y CONFIG_IP6_NF_FILTER=y CONFIG_NF_NAT=y CONFIG_PACKET=y CONFIG_RC_CORE=y CONFIG_SECURITY=y CONFIG_SECURITYFS=y CONFIG_SYN_COOKIES=y CONFIG_TEST_BPF=m CONFIG_UDMABUF=y CONFIG_USERFAULTFD=y CONFIG_VSOCKETS=y CONFIG_VXLAN=y CONFIG_XDP_SOCKETS=y CONFIG_XFRM_INTERFACE=y CONFIG_TCP_CONG_DCTCP=y CONFIG_TCP_CONG_BBR=y CONFIG_9P_FS_POSIX_ACL=y CONFIG_9P_FS_SECURITY=y CONFIG_9P_FS=y CONFIG_CRYPTO_DEV_VIRTIO=y CONFIG_FUSE_FS=y CONFIG_FUSE_PASSTHROUGH=y CONFIG_NET_9P_VIRTIO=y CONFIG_NET_9P=y CONFIG_VIRTIO_BALLOON=y CONFIG_VIRTIO_BLK=y CONFIG_VIRTIO_CONSOLE=y CONFIG_VIRTIO_FS=y CONFIG_VIRTIO_NET=y CONFIG_VIRTIO_PCI=y CONFIG_VIRTIO_VSOCKETS_COMMON=y CONFIG_AGP=y CONFIG_AGP_AMD64=y CONFIG_AGP_INTEL=y CONFIG_AGP_SIS=y CONFIG_AGP_VIA=y CONFIG_AMIGA_PARTITION=y CONFIG_AUDIT=y CONFIG_BACKLIGHT_CLASS_DEVICE=y CONFIG_BINFMT_MISC=y CONFIG_BLK_CGROUP=y CONFIG_BLK_CGROUP_IOLATENCY=y CONFIG_BLK_DEV_BSGLIB=y CONFIG_BLK_DEV_IO_TRACE=y CONFIG_BLK_DEV_RAM=y CONFIG_BLK_DEV_RAM_SIZE=16384 CONFIG_BLK_DEV_THROTTLING=y CONFIG_BONDING=y CONFIG_BOOTTIME_TRACING=y CONFIG_BPF_JIT_ALWAYS_ON=y CONFIG_BPF_PRELOAD=y CONFIG_BPF_PRELOAD_UMD=y CONFIG_BSD_DISKLABEL=y CONFIG_BSD_PROCESS_ACCT=y CONFIG_CFS_BANDWIDTH=y CONFIG_CGROUP_CPUACCT=y CONFIG_CGROUP_DEVICE=y CONFIG_CGROUP_FREEZER=y CONFIG_CGROUP_HUGETLB=y CONFIG_CGROUP_PERF=y CONFIG_CGROUP_SCHED=y CONFIG_CGROUPS=y CONFIG_CMA=y CONFIG_CMA_AREAS=7 CONFIG_COMPAT_32BIT_TIME=y CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y CONFIG_CPU_FREQ_GOV_ONDEMAND=y CONFIG_CPU_FREQ_GOV_USERSPACE=y CONFIG_CPU_FREQ_STAT=y CONFIG_CPU_IDLE_GOV_LADDER=y CONFIG_CPUSETS=y CONFIG_CRYPTO_BLAKE2B=y CONFIG_CRYPTO_SEQIV=y CONFIG_CRYPTO_XXHASH=y CONFIG_DCB=y CONFIG_DEBUG_ATOMIC_SLEEP=y CONFIG_DEBUG_INFO_BTF=y CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y CONFIG_DEBUG_MEMORY_INIT=y CONFIG_DEFAULT_FQ_CODEL=y CONFIG_DEFAULT_RENO=y CONFIG_DEFAULT_SECURITY_DAC=y CONFIG_DEVTMPFS=y CONFIG_DEVTMPFS_MOUNT=y CONFIG_DMA_CMA=y CONFIG_DNS_RESOLVER=y CONFIG_EFI=y CONFIG_EFI_STUB=y CONFIG_EXPERT=y CONFIG_EXT4_FS=y CONFIG_EXT4_FS_POSIX_ACL=y CONFIG_EXT4_FS_SECURITY=y CONFIG_FAIL_FUNCTION=y CONFIG_FAULT_INJECTION=y CONFIG_FAULT_INJECTION_DEBUG_FS=y CONFIG_FB=y CONFIG_FB_MODE_HELPERS=y CONFIG_FB_TILEBLITTING=y CONFIG_FB_VESA=y CONFIG_FONT_8x16=y CONFIG_FONT_MINI_4x6=y CONFIG_FONTS=y CONFIG_FRAMEBUFFER_CONSOLE=y CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y CONFIG_FW_LOADER_USER_HELPER=y CONFIG_GART_IOMMU=y CONFIG_GENERIC_PHY=y CONFIG_HARDLOCKUP_DETECTOR=y CONFIG_HID_A4TECH=y CONFIG_HID_BELKIN=y CONFIG_HID_CHERRY=y CONFIG_HID_CYPRESS=y CONFIG_HID_DRAGONRISE=y CONFIG_HID_EZKEY=y CONFIG_HID_GREENASIA=y CONFIG_HID_GYRATION=y CONFIG_HID_KENSINGTON=y CONFIG_HID_KYE=y CONFIG_HID_MICROSOFT=y CONFIG_HID_MONTEREY=y CONFIG_HID_PANTHERLORD=y CONFIG_HID_PETALYNX=y CONFIG_HID_SMARTJOYPLUS=y CONFIG_HID_SUNPLUS=y CONFIG_HID_TOPSEED=y CONFIG_HID_TWINHAN=y CONFIG_HID_ZEROPLUS=y CONFIG_HIGH_RES_TIMERS=y CONFIG_HPET=y CONFIG_HUGETLBFS=y CONFIG_HWPOISON_INJECT=y CONFIG_HZ_1000=y CONFIG_INET=y CONFIG_INPUT_EVDEV=y CONFIG_INTEL_POWERCLAMP=y CONFIG_IP6_NF_IPTABLES=y CONFIG_IP_ADVANCED_ROUTER=y CONFIG_IP_MROUTE=y CONFIG_IP_MULTICAST=y CONFIG_IP_MULTIPLE_TABLES=y CONFIG_IP_NF_IPTABLES=y CONFIG_IP_PIMSM_V1=y CONFIG_IP_PIMSM_V2=y CONFIG_IP_ROUTE_MULTIPATH=y CONFIG_IP_ROUTE_VERBOSE=y CONFIG_IPV6_MIP6=y CONFIG_IPV6_ROUTE_INFO=y CONFIG_IPV6_ROUTER_PREF=y CONFIG_IPV6_SEG6_LWTUNNEL=y CONFIG_IPV6_SUBTREES=y CONFIG_IRQ_POLL=y CONFIG_JUMP_LABEL=y CONFIG_KARMA_PARTITION=y CONFIG_KEXEC=y CONFIG_KPROBES=y CONFIG_KSM=y CONFIG_LEGACY_VSYSCALL_NONE=y CONFIG_LOG_BUF_SHIFT=21 CONFIG_LOG_CPU_MAX_BUF_SHIFT=0 CONFIG_LOGO=y CONFIG_LSM="selinux,bpf,integrity" CONFIG_MAC_PARTITION=y CONFIG_MAGIC_SYSRQ=y CONFIG_MCORE2=y CONFIG_MEMCG=y CONFIG_MEMORY_FAILURE=y CONFIG_MINIX_SUBPARTITION=y CONFIG_NAMESPACES=y CONFIG_NET=y CONFIG_NET_ACT_BPF=y CONFIG_NET_CLS_CGROUP=y CONFIG_NET_EMATCH=y CONFIG_NET_IPGRE_BROADCAST=y CONFIG_NET_L3_MASTER_DEV=y CONFIG_NET_SCH_DEFAULT=y CONFIG_NET_SCH_FQ_CODEL=y CONFIG_NET_TC_SKB_EXT=y CONFIG_NET_VRF=y CONFIG_NETDEVICES=y CONFIG_NETFILTER_NETLINK_LOG=y CONFIG_NETFILTER_NETLINK_QUEUE=y CONFIG_NETFILTER_XT_MATCH_BPF=y CONFIG_NETFILTER_XT_MATCH_STATISTIC=y CONFIG_NETLABEL=y CONFIG_NLS_ASCII=y CONFIG_NLS_CODEPAGE_437=y CONFIG_NLS_DEFAULT="utf8" CONFIG_NO_HZ=y CONFIG_NR_CPUS=128 CONFIG_NUMA=y CONFIG_NUMA_BALANCING=y CONFIG_NVMEM=y CONFIG_OSF_PARTITION=y CONFIG_PACKET=y CONFIG_PANIC_ON_OOPS=y CONFIG_PARTITION_ADVANCED=y CONFIG_PCI=y CONFIG_PCI_IOV=y CONFIG_PCI_MSI=y CONFIG_PCIEPORTBUS=y CONFIG_PHYSICAL_ALIGN=0x1000000 CONFIG_POSIX_MQUEUE=y CONFIG_POWER_SUPPLY=y CONFIG_PREEMPT=y CONFIG_PRINTK_TIME=y CONFIG_PROC_KCORE=y CONFIG_PROFILING=y CONFIG_PROVE_LOCKING=y CONFIG_PTP_1588_CLOCK=y CONFIG_RC_DEVICES=y CONFIG_RC_LOOPBACK=y CONFIG_RCU_CPU_STALL_TIMEOUT=60 CONFIG_SCHED_STACK_END_CHECK=y CONFIG_SCHEDSTATS=y CONFIG_SECURITY_NETWORK=y CONFIG_SECURITY_SELINUX=y CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_CONSOLE=y CONFIG_SERIAL_8250_DETECT_IRQ=y CONFIG_SERIAL_8250_EXTENDED=y CONFIG_SERIAL_8250_MANY_PORTS=y CONFIG_SERIAL_8250_NR_UARTS=32 CONFIG_SERIAL_8250_RSA=y CONFIG_SERIAL_8250_SHARE_IRQ=y CONFIG_SERIAL_NONSTANDARD=y CONFIG_SERIO_LIBPS2=y CONFIG_SGI_PARTITION=y CONFIG_SMP=y CONFIG_SOLARIS_X86_PARTITION=y CONFIG_SUN_PARTITION=y CONFIG_SYNC_FILE=y CONFIG_SYSVIPC=y CONFIG_TASK_DELAY_ACCT=y CONFIG_TASK_IO_ACCOUNTING=y CONFIG_TASK_XACCT=y CONFIG_TASKSTATS=y CONFIG_TCP_CONG_ADVANCED=y CONFIG_TCP_MD5SIG=y CONFIG_TLS=y CONFIG_TMPFS=y CONFIG_TMPFS_POSIX_ACL=y CONFIG_TRANSPARENT_HUGEPAGE=y CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y CONFIG_TUN=y CONFIG_UNIX=y CONFIG_UNIXWARE_DISKLABEL=y CONFIG_USER_NS=y CONFIG_VALIDATE_FS_PARSER=y CONFIG_VETH=y CONFIG_VIRT_DRIVERS=y CONFIG_VLAN_8021Q=y CONFIG_VSOCKETS=y CONFIG_VSOCKETS_LOOPBACK=y CONFIG_X86_ACPI_CPUFREQ=y CONFIG_X86_CPUID=y CONFIG_X86_MSR=y CONFIG_X86_POWERNOW_K8=y CONFIG_XDP_SOCKETS_DIAG=y CONFIG_XFRM_SUB_POLICY=y CONFIG_XFRM_USER=y CONFIG_ZEROPLUS_FF=y CONFIG_SCHED_CLASS_EXT=y CONFIG_CGROUPS=y CONFIG_CGROUP_SCHED=y CONFIG_EXT_GROUP_SCHED=y CONFIG_BPF=y CONFIG_BPF_SYSCALL=y CONFIG_DEBUG_INFO=y CONFIG_DEBUG_INFO_BTF=y CONFIG_KASAN=y CONFIG_KASAN_GENERIC=y CONFIG_KASAN_VMALLOC=y CONFIG_UBSAN=y And then: $ make olddefconfig && make -j$(nproc) all I was at 42be23e8f2dc of bpf-next, using GCC 14: $ gcc --version gcc (Ubuntu 14.2.0-4ubuntu2~24.04) 14.2.0 > >> >> Here is a piece of vmlinux.h diff between CONFIG_UBSAN=y/n: >> >> --- ./tools/testing/selftests/bpf/tools/include/vmlinux.h 2025-07-21 17:35:14.415733105 +0000 >> +++ ubsan_vmlinux.h 2025-07-21 17:33:10.455312623 +0000 >> @@ -117203,7 +117292,6 @@ >> extern int bpf_arena_reserve_pages(void *p__map, void __attribute__((address_space(1))) *ptr__ign, u32 page_cnt) __weak __ksym; >> extern __bpf_fastcall void *bpf_cast_to_kern_ctx(void *obj) __weak __ksym; >> extern struct cgroup *bpf_cgroup_acquire(struct cgroup *cgrp) __weak __ksym; >> -extern struct cgroup *bpf_cgroup_ancestor(struct cgroup *cgrp, int level) __weak __ksym; >> extern struct cgroup *bpf_cgroup_from_id(u64 cgid) __weak __ksym; >> extern int bpf_cgroup_read_xattr(struct cgroup *cgroup, const char *name__str, struct bpf_dynptr *value_p) __weak __ksym; >> extern void bpf_cgroup_release(struct cgroup *cgrp) __weak __ksym; >> @@ -117260,7 +117348,6 @@ >> extern bool bpf_dynptr_is_rdonly(const struct bpf_dynptr *p) __weak __ksym; >> extern int bpf_dynptr_memset(struct bpf_dynptr *p, u32 offset, u32 size, u8 val) __weak __ksym; >> extern __u32 bpf_dynptr_size(const struct bpf_dynptr *p) __weak __ksym; >> -extern void *bpf_dynptr_slice(const struct bpf_dynptr *p, u32 offset, void *buffer__opt, u32 buffer__szk) __weak __ksym; >> extern void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr *p, u32 offset, void *buffer__opt, u32 buffer__szk) __weak __ksym; >> extern int bpf_fentry_test1(int a) __weak __ksym; >> extern int bpf_get_dentry_xattr(struct dentry *dentry, const char *name__str, struct bpf_dynptr *value_p) __weak __ksym; >> @@ -117287,7 +117374,6 @@ >> extern int bpf_iter_num_new(struct bpf_iter_num *it, int start, int end) __weak __ksym; >> extern int *bpf_iter_num_next(struct bpf_iter_num *it) __weak __ksym; >> extern void bpf_iter_scx_dsq_destroy(struct bpf_iter_scx_dsq *it) __weak __ksym; >> -extern int bpf_iter_scx_dsq_new(struct bpf_iter_scx_dsq *it, u64 dsq_id, u64 flags) __weak __ksym; >> extern struct task_struct *bpf_iter_scx_dsq_next(struct bpf_iter_scx_dsq *it) __weak __ksym; >> extern void bpf_iter_task_destroy(struct bpf_iter_task *it) __weak __ksym; >> extern int bpf_iter_task_new(struct bpf_iter_task *it, struct task_struct *task__nullable, unsigned int flags) __weak __ksym; >> @@ -117373,11 +117459,8 @@ >> extern int bpf_strspn(const char *s__ign, const char *accept__ign) __weak __ksym; >> extern int bpf_strstr(const char *s1__ign, const char *s2__ign) __weak __ksym; >> extern struct task_struct *bpf_task_acquire(struct task_struct *p) __weak __ksym; >> -extern struct task_struct *bpf_task_from_pid(s32 pid) __weak __ksym; >> -extern struct task_struct *bpf_task_from_vpid(s32 vpid) __weak __ksym; >> extern struct cgroup *bpf_task_get_cgroup1(struct task_struct *task, int hierarchy_id) __weak __ksym; >> extern void bpf_task_release(struct task_struct *p) __weak __ksym; >> -extern long int bpf_task_under_cgroup(struct task_struct *task, struct cgroup *ancestor) __weak __ksym; >> extern void bpf_throw(u64 cookie) __weak __ksym; >> extern int bpf_verify_pkcs7_signature(struct bpf_dynptr *data_p, struct bpf_dynptr *sig_p, struct bpf_key *trusted_keyring) __weak __ksym; >> extern int bpf_wq_init(struct bpf_wq *wq, void *p__map, unsigned int flags) __weak __ksym; >> @@ -117412,15 +117495,10 @@ >> extern u32 scx_bpf_cpuperf_cur(s32 cpu) __weak __ksym; >> extern void scx_bpf_cpuperf_set(s32 cpu, u32 perf) __weak __ksym; >> extern s32 scx_bpf_create_dsq(u64 dsq_id, s32 node) __weak __ksym; >> -extern void scx_bpf_destroy_dsq(u64 dsq_id) __weak __ksym; >> -extern void scx_bpf_dispatch(struct task_struct *p, u64 dsq_id, u64 slice, u64 enq_flags) __weak __ksym; >> extern void scx_bpf_dispatch_cancel(void) __weak __ksym; >> extern bool scx_bpf_dispatch_from_dsq(struct bpf_iter_scx_dsq *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak __ksym; >> -extern void scx_bpf_dispatch_from_dsq_set_slice(struct bpf_iter_scx_dsq *it__iter, u64 slice) __weak __ksym; >> -extern void scx_bpf_dispatch_from_dsq_set_vtime(struct bpf_iter_scx_dsq *it__iter, u64 vtime) __weak __ksym; >> extern u32 scx_bpf_dispatch_nr_slots(void) __weak __ksym; >> extern void scx_bpf_dispatch_vtime(struct task_struct *p, u64 dsq_id, u64 slice, u64 vtime, u64 enq_flags) __weak __ksym; >> -extern bool scx_bpf_dispatch_vtime_from_dsq(struct bpf_iter_scx_dsq *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak __ksym; >> extern void scx_bpf_dsq_insert(struct task_struct *p, u64 dsq_id, u64 slice, u64 enq_flags) __weak __ksym; >> extern void scx_bpf_dsq_insert_vtime(struct task_struct *p, u64 dsq_id, u64 slice, u64 vtime, u64 enq_flags) __weak __ksym; >> extern bool scx_bpf_dsq_move(struct bpf_iter_scx_dsq *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak __ksym; >> @@ -117428,10 +117506,8 @@ >> extern void scx_bpf_dsq_move_set_vtime(struct bpf_iter_scx_dsq *it__iter, u64 vtime) __weak __ksym; >> extern bool scx_bpf_dsq_move_to_local(u64 dsq_id) __weak __ksym; >> extern bool scx_bpf_dsq_move_vtime(struct bpf_iter_scx_dsq *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak __ksym; >> -extern s32 scx_bpf_dsq_nr_queued(u64 dsq_id) __weak __ksym; >> extern void scx_bpf_dump_bstr(char *fmt, long long unsigned int *data, u32 data__sz) __weak __ksym; >> extern void scx_bpf_error_bstr(char *fmt, long long unsigned int *data, u32 data__sz) __weak __ksym; >> -extern void scx_bpf_events(struct scx_event_stats *events, size_t events__sz) __weak __ksym; >> extern void scx_bpf_exit_bstr(s64 exit_code, char *fmt, long long unsigned int *data, u32 data__sz) __weak __ksym; >> extern const struct cpumask *scx_bpf_get_idle_cpumask(void) __weak __ksym; >> extern const struct cpumask *scx_bpf_get_idle_cpumask_node(int node) __weak __ksym; >> >> Then I checked the difference between BTFs, and found that there is no >> DECL_TAG 'bpf_kfunc' produced for the affected functions: >> >> $ diff -u vmlinux.btf.out vmlinux_ubsan.btf.out | grep -C5 cgroup_ancestor >> +[52749] FUNC 'bpf_cgroup_acquire' type_id=52748 linkage=static >> +[52750] DECL_TAG 'bpf_kfunc' type_id=52749 component_idx=-1 >> +[52751] FUNC_PROTO '(anon)' ret_type_id=426 vlen=2 >> 'cgrp' type_id=426 >> 'level' type_id=21 >> -[52681] FUNC 'bpf_cgroup_ancestor' type_id=52680 linkage=static >> -[52682] DECL_TAG 'bpf_kfunc' type_id=52681 component_idx=-1 >> -[52683] FUNC_PROTO '(anon)' ret_type_id=3987 vlen=2 >> +[52752] FUNC 'bpf_cgroup_ancestor' type_id=52751 linkage=static >> +[52753] FUNC_PROTO '(anon)' ret_type_id=3987 vlen=2 >> 'attach_type' type_id=1767 >> 'attach_btf_id' type_id=34 >> -[52684] FUNC 'bpf_cgroup_atype_find' type_id=52683 linkage=static >> -[52685] FUNC_PROTO '(anon)' ret_type_id=0 vlen=2 >> >> Which is clearly wrong and suggests a bug. >> >> After some debugging, I found that the problem is in >> btf_encoder__find_function(), and more specifically in >> the comparator used for bsearch and qsort. >> >> static int functions_cmp(const void *_a, const void *_b) >> { >> const struct elf_function *a = _a; >> const struct elf_function *b = _b; >> >> /* if search key allows prefix match, verify target has matching >> * prefix len and prefix matches. >> */ >> if (a->prefixlen && a->prefixlen == b->prefixlen) >> return strncmp(a->name, b->name, b->prefixlen); >> return strcmp(a->name, b->name); >> } >> >> For this particular vmlinux that I compiled, >> btf_encoder__find_function("bpf_cgroup_ancestor", prefixlen=0) returns >> NULL, even though there is an elf_function struct for >> bpf_cgroup_ancestor in the table. >> >> The reason for it is that bsearch() happens to hit >> "bpf_cgroup_ancestor.cold" in the table first. >> strcmp("bpf_cgroup_ancestor", "bpf_cgroup_ancestor.cold)") gives a >> negative value, but bpf_cgroup_ancestor entry is stored in the other >> direction in the table. >> >> This is surprising at the first glance, because we use the same >> functions_cmp both for search and for qsort. >> >> But as it turns we are actually using two different comparators within >> functions_cmp(): we set key.prefixlen=0 for exact match and when it's >> non-zero we search for prefix match. When sorting the table, there are >> no entries with prefixlen=0, so the order of elements is not exactly >> right for the bsearch(). >> >> That's a nasty bug, but as far as I understand, all this complexity is >> unnecessary in case of '.cold' suffix, because they simply are not >> supposed to be in the elf_functions table: it's usually just a piece >> of a target function. >> >> There are a couple of ways this particular bug could be fixed >> (filtering out .cold symbols, for example). But I think this bug and >> the problem Jiri is trying to solve stems from the fact that one >> function name, which is an identifier the users care about the most, >> may be associated with many ELF symbols and/or addresses. >> >> What is clear to me in the context of pahole's BTF encoding is that we >> want elf_functions table to only have a single entry per name (where >> name is an actual name that might be referred to by users, not an ELF >> sym name), and have a deterministic mechanism for selecting one (or >> none) func from many at the time of processing ELF data. >> >> The current implementation is just buggy in this regard. >> >> I am not aware of long term plans for addressing this, though it looks >> like this was discussed before. I'd appreciate if you share any >> relevant threads. > > I did some ill attempt to have function addresses in BTF but that never > took place.. I thought perhaps Alan took over that, but can't find any > evidence ;-) > > jirka ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-07-24 21:27 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-07-17 15:25 [RFC dwarves] btf_encoder: Remove duplicates from functions entries Jiri Olsa 2025-07-21 11:41 ` Alan Maguire 2025-07-21 14:27 ` Jiri Olsa 2025-07-21 14:32 ` Nick Alcock 2025-07-21 23:27 ` Ihor Solodrai 2025-07-22 10:45 ` Alan Maguire 2025-07-22 22:58 ` Ihor Solodrai 2025-07-23 11:22 ` Jiri Olsa 2025-07-24 17:54 ` Alan Maguire 2025-07-24 21:26 ` Ihor Solodrai 2025-07-22 10:54 ` Jiri Olsa 2025-07-22 16:07 ` Ihor Solodrai
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).