* [PATCH dwarves v4 0/2] pahole: Inject kfunc decl tags into BTF
@ 2024-02-04 18:40 Daniel Xu
2024-02-04 18:40 ` [PATCH dwarves v4 1/2] pahole: Add --btf_feature=decl_tag_kfuncs feature Daniel Xu
2024-02-04 18:40 ` [PATCH dwarves v4 2/2] pahole: Inject kfunc decl tags into BTF Daniel Xu
0 siblings, 2 replies; 14+ messages in thread
From: Daniel Xu @ 2024-02-04 18:40 UTC (permalink / raw)
To: acme, jolsa, quentin, alan.maguire; +Cc: andrii.nakryiko, ast, daniel, bpf
This patchset teaches pahole to parse symbols in .BTF_ids section in
vmlinux and discover exported kfuncs. Pahole then takes the list of
kfuncs and injects a BTF_KIND_DECL_TAG for each kfunc.
Example of encoding:
$ bpftool btf dump file .tmp_vmlinux.btf | rg "DECL_TAG 'bpf_kfunc'" | wc -l
121
$ bpftool btf dump file .tmp_vmlinux.btf | rg 56337
[56337] FUNC 'bpf_ct_change_timeout' type_id=56336 linkage=static
[127861] DECL_TAG 'bpf_kfunc' type_id=56337 component_idx=-1
This enables downstream users and tools to dynamically discover which
kfuncs are available on a system by parsing vmlinux or module BTF, both
available in /sys/kernel/btf.
=== Changelog ===
Changes from v3:
* Guard kfunc tagging behind feature flag
* Use struct btf_id_set8 definition
* Remove unnecessary member from btf_encoder
* Fix code styling
Changes from v2:
* More reliably detect kfunc membership in set8 by tracking set addr ranges
* Rename some variables/functions to be more clear about kfunc vs func
Changes from v1:
* Fix resource leaks
* Fix callee -> caller typo
* Rename btf_decl_tag from kfunc -> bpf_kfunc
* Only grab btf_id_set funcs tagged kfunc
* Presort btf func list
Daniel Xu (2):
pahole: Add --btf_feature=decl_tag_kfuncs feature
pahole: Inject kfunc decl tags into BTF
btf_encoder.c | 361 ++++++++++++++++++++++++++++++++++++++++++++++++++
dwarves.h | 1 +
pahole.c | 1 +
3 files changed, 363 insertions(+)
--
2.42.1
^ permalink raw reply [flat|nested] 14+ messages in thread* [PATCH dwarves v4 1/2] pahole: Add --btf_feature=decl_tag_kfuncs feature 2024-02-04 18:40 [PATCH dwarves v4 0/2] pahole: Inject kfunc decl tags into BTF Daniel Xu @ 2024-02-04 18:40 ` Daniel Xu 2024-02-06 12:47 ` Jiri Olsa 2024-02-04 18:40 ` [PATCH dwarves v4 2/2] pahole: Inject kfunc decl tags into BTF Daniel Xu 1 sibling, 1 reply; 14+ messages in thread From: Daniel Xu @ 2024-02-04 18:40 UTC (permalink / raw) To: acme, jolsa, quentin, alan.maguire; +Cc: andrii.nakryiko, ast, daniel, bpf Add a feature flag to guard tagging of kfuncs. The next commit will implement the actual tagging. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> --- btf_encoder.c | 2 ++ dwarves.h | 1 + pahole.c | 1 + 3 files changed, 4 insertions(+) diff --git a/btf_encoder.c b/btf_encoder.c index fd04008..e325f66 100644 --- a/btf_encoder.c +++ b/btf_encoder.c @@ -77,6 +77,7 @@ struct btf_encoder { verbose, force, gen_floats, + tag_kfuncs, is_rel; uint32_t array_index_id; struct { @@ -1642,6 +1643,7 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam encoder->force = conf_load->btf_encode_force; encoder->gen_floats = conf_load->btf_gen_floats; encoder->skip_encoding_vars = conf_load->skip_encoding_btf_vars; + encoder->tag_kfuncs = conf_load->btf_decl_tag_kfuncs; encoder->verbose = verbose; encoder->has_index_type = false; encoder->need_index_type = false; diff --git a/dwarves.h b/dwarves.h index 857b37c..996eb70 100644 --- a/dwarves.h +++ b/dwarves.h @@ -87,6 +87,7 @@ struct conf_load { bool skip_encoding_btf_vars; bool btf_gen_floats; bool btf_encode_force; + bool btf_decl_tag_kfuncs; uint8_t hashtable_bits; uint8_t max_hashtable_bits; uint16_t kabi_prefix_len; diff --git a/pahole.c b/pahole.c index 768a2fe..48c19b7 100644 --- a/pahole.c +++ b/pahole.c @@ -1278,6 +1278,7 @@ struct btf_feature { BTF_FEATURE(enum64, skip_encoding_btf_enum64, true), BTF_FEATURE(optimized_func, btf_gen_optimized, false), BTF_FEATURE(consistent_func, skip_encoding_btf_inconsistent_proto, false), + BTF_FEATURE(decl_tag_kfuncs, btf_decl_tag_kfuncs, false), }; #define BTF_MAX_FEATURE_STR 1024 -- 2.42.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves v4 1/2] pahole: Add --btf_feature=decl_tag_kfuncs feature 2024-02-04 18:40 ` [PATCH dwarves v4 1/2] pahole: Add --btf_feature=decl_tag_kfuncs feature Daniel Xu @ 2024-02-06 12:47 ` Jiri Olsa 2024-02-06 14:14 ` Alan Maguire 0 siblings, 1 reply; 14+ messages in thread From: Jiri Olsa @ 2024-02-06 12:47 UTC (permalink / raw) To: Daniel Xu; +Cc: acme, quentin, alan.maguire, andrii.nakryiko, ast, daniel, bpf On Sun, Feb 04, 2024 at 11:40:37AM -0700, Daniel Xu wrote: > Add a feature flag to guard tagging of kfuncs. The next commit will > implement the actual tagging. > > Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> > --- > btf_encoder.c | 2 ++ > dwarves.h | 1 + > pahole.c | 1 + > 3 files changed, 4 insertions(+) we should update man page as well also we need to update the kernel's scripts/Makefile.btf with the new option for the next pahole version (1.26 I guess) jirka > > diff --git a/btf_encoder.c b/btf_encoder.c > index fd04008..e325f66 100644 > --- a/btf_encoder.c > +++ b/btf_encoder.c > @@ -77,6 +77,7 @@ struct btf_encoder { > verbose, > force, > gen_floats, > + tag_kfuncs, > is_rel; > uint32_t array_index_id; > struct { > @@ -1642,6 +1643,7 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam > encoder->force = conf_load->btf_encode_force; > encoder->gen_floats = conf_load->btf_gen_floats; > encoder->skip_encoding_vars = conf_load->skip_encoding_btf_vars; > + encoder->tag_kfuncs = conf_load->btf_decl_tag_kfuncs; > encoder->verbose = verbose; > encoder->has_index_type = false; > encoder->need_index_type = false; > diff --git a/dwarves.h b/dwarves.h > index 857b37c..996eb70 100644 > --- a/dwarves.h > +++ b/dwarves.h > @@ -87,6 +87,7 @@ struct conf_load { > bool skip_encoding_btf_vars; > bool btf_gen_floats; > bool btf_encode_force; > + bool btf_decl_tag_kfuncs; > uint8_t hashtable_bits; > uint8_t max_hashtable_bits; > uint16_t kabi_prefix_len; > diff --git a/pahole.c b/pahole.c > index 768a2fe..48c19b7 100644 > --- a/pahole.c > +++ b/pahole.c > @@ -1278,6 +1278,7 @@ struct btf_feature { > BTF_FEATURE(enum64, skip_encoding_btf_enum64, true), > BTF_FEATURE(optimized_func, btf_gen_optimized, false), > BTF_FEATURE(consistent_func, skip_encoding_btf_inconsistent_proto, false), > + BTF_FEATURE(decl_tag_kfuncs, btf_decl_tag_kfuncs, false), > }; > > #define BTF_MAX_FEATURE_STR 1024 > -- > 2.42.1 > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves v4 1/2] pahole: Add --btf_feature=decl_tag_kfuncs feature 2024-02-06 12:47 ` Jiri Olsa @ 2024-02-06 14:14 ` Alan Maguire 0 siblings, 0 replies; 14+ messages in thread From: Alan Maguire @ 2024-02-06 14:14 UTC (permalink / raw) To: Jiri Olsa, Daniel Xu; +Cc: acme, quentin, andrii.nakryiko, ast, daniel, bpf On 06/02/2024 12:47, Jiri Olsa wrote: > On Sun, Feb 04, 2024 at 11:40:37AM -0700, Daniel Xu wrote: >> Add a feature flag to guard tagging of kfuncs. The next commit will >> implement the actual tagging. >> >> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Thanks for adding this! >> --- >> btf_encoder.c | 2 ++ >> dwarves.h | 1 + >> pahole.c | 1 + >> 3 files changed, 4 insertions(+) > > we should update man page as well > > also we need to update the kernel's scripts/Makefile.btf with > the new option for the next pahole version (1.26 I guess) > Yep, something like this added to scripts/Makefile.btf should do it: pahole-flags-$(call test-ge, $(pahole-ver), 126) = -j --lang_exclude=rust --btf_features=encode_force,var,float,decl_tag,type_tag,enum64,optimized_func,consistent_func,tag_kfuncs This should hopefully be the last version check we need - we can just add features to the list as required, and pahole will apply the ones it knows about. Thanks! Alan > jirka > >> >> diff --git a/btf_encoder.c b/btf_encoder.c >> index fd04008..e325f66 100644 >> --- a/btf_encoder.c >> +++ b/btf_encoder.c >> @@ -77,6 +77,7 @@ struct btf_encoder { >> verbose, >> force, >> gen_floats, >> + tag_kfuncs, >> is_rel; >> uint32_t array_index_id; >> struct { >> @@ -1642,6 +1643,7 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam >> encoder->force = conf_load->btf_encode_force; >> encoder->gen_floats = conf_load->btf_gen_floats; >> encoder->skip_encoding_vars = conf_load->skip_encoding_btf_vars; >> + encoder->tag_kfuncs = conf_load->btf_decl_tag_kfuncs; >> encoder->verbose = verbose; >> encoder->has_index_type = false; >> encoder->need_index_type = false; >> diff --git a/dwarves.h b/dwarves.h >> index 857b37c..996eb70 100644 >> --- a/dwarves.h >> +++ b/dwarves.h >> @@ -87,6 +87,7 @@ struct conf_load { >> bool skip_encoding_btf_vars; >> bool btf_gen_floats; >> bool btf_encode_force; >> + bool btf_decl_tag_kfuncs; >> uint8_t hashtable_bits; >> uint8_t max_hashtable_bits; >> uint16_t kabi_prefix_len; >> diff --git a/pahole.c b/pahole.c >> index 768a2fe..48c19b7 100644 >> --- a/pahole.c >> +++ b/pahole.c >> @@ -1278,6 +1278,7 @@ struct btf_feature { >> BTF_FEATURE(enum64, skip_encoding_btf_enum64, true), >> BTF_FEATURE(optimized_func, btf_gen_optimized, false), >> BTF_FEATURE(consistent_func, skip_encoding_btf_inconsistent_proto, false), >> + BTF_FEATURE(decl_tag_kfuncs, btf_decl_tag_kfuncs, false), >> }; >> >> #define BTF_MAX_FEATURE_STR 1024 >> -- >> 2.42.1 >> ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH dwarves v4 2/2] pahole: Inject kfunc decl tags into BTF 2024-02-04 18:40 [PATCH dwarves v4 0/2] pahole: Inject kfunc decl tags into BTF Daniel Xu 2024-02-04 18:40 ` [PATCH dwarves v4 1/2] pahole: Add --btf_feature=decl_tag_kfuncs feature Daniel Xu @ 2024-02-04 18:40 ` Daniel Xu 2024-02-05 16:54 ` Eduard Zingerman ` (2 more replies) 1 sibling, 3 replies; 14+ messages in thread From: Daniel Xu @ 2024-02-04 18:40 UTC (permalink / raw) To: acme, jolsa, quentin, alan.maguire; +Cc: andrii.nakryiko, ast, daniel, bpf This commit teaches pahole to parse symbols in .BTF_ids section in vmlinux and discover exported kfuncs. Pahole then takes the list of kfuncs and injects a BTF_KIND_DECL_TAG for each kfunc. Example of encoding: $ bpftool btf dump file .tmp_vmlinux.btf | rg "DECL_TAG 'bpf_kfunc'" | wc -l 121 $ bpftool btf dump file .tmp_vmlinux.btf | rg 56337 [56337] FUNC 'bpf_ct_change_timeout' type_id=56336 linkage=static [127861] DECL_TAG 'bpf_kfunc' type_id=56337 component_idx=-1 This enables downstream users and tools to dynamically discover which kfuncs are available on a system by parsing vmlinux or module BTF, both available in /sys/kernel/btf. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> --- btf_encoder.c | 359 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 359 insertions(+) diff --git a/btf_encoder.c b/btf_encoder.c index e325f66..d6a561c 100644 --- a/btf_encoder.c +++ b/btf_encoder.c @@ -34,6 +34,21 @@ #include <pthread.h> #define BTF_ENCODER_MAX_PROTO 512 +#define BTF_IDS_SECTION ".BTF_ids" +#define BTF_ID_FUNC_PFX "__BTF_ID__func__" +#define BTF_ID_SET8_PFX "__BTF_ID__set8__" +#define BTF_SET8_KFUNCS (1 << 0) +#define BTF_KFUNC_TYPE_TAG "bpf_kfunc" + +/* Adapted from include/linux/btf_ids.h */ +struct btf_id_set8 { + uint32_t cnt; + uint32_t flags; + struct { + uint32_t id; + uint32_t flags; + } pairs[]; +}; /* state used to do later encoding of saved functions */ struct btf_encoder_state { @@ -95,6 +110,17 @@ struct btf_encoder { } functions; }; +struct btf_func { + const char *name; + int type_id; +}; + +/* Half open interval representing range of addresses containing kfuncs */ +struct btf_kfunc_set_range { + size_t start; + size_t end; +}; + static LIST_HEAD(encoders); static pthread_mutex_t encoders__lock = PTHREAD_MUTEX_INITIALIZER; @@ -1353,6 +1379,331 @@ out: return err; } +/* Returns if `sym` points to a kfunc set */ +static int is_sym_kfunc_set(GElf_Sym *sym, const char *name, Elf_Data *idlist, size_t idlist_addr) +{ + void *ptr = idlist->d_buf; + struct btf_id_set8 *set; + bool is_set8; + int off; + + /* kfuncs are only found in BTF_SET8's */ + is_set8 = !strncmp(name, BTF_ID_SET8_PFX, sizeof(BTF_ID_SET8_PFX) - 1); + if (!is_set8) + return false; + + off = sym->st_value - idlist_addr; + if (off >= idlist->d_size) { + fprintf(stderr, "%s: symbol '%s' out of bounds\n", __func__, name); + return false; + } + + /* Check the set8 flags to see if it was marked as kfunc */ + set = ptr + off; + return set->flags & BTF_SET8_KFUNCS; +} + +/* + * Parse BTF_ID symbol and return the func name. + * + * Returns: + * Caller-owned string containing func name if successful. + * NULL if !func or on error. + */ +static char *get_func_name(const char *sym) +{ + char *func, *end; + + if (strncmp(sym, BTF_ID_FUNC_PFX, sizeof(BTF_ID_FUNC_PFX) - 1)) + return NULL; + + /* Strip prefix */ + func = strdup(sym + sizeof(BTF_ID_FUNC_PFX) - 1); + + /* Strip suffix */ + end = strrchr(func, '_'); + if (!end || *(end - 1) != '_') { + free(func); + return NULL; + } + *(end - 1) = '\0'; + + return func; +} + +static int btf_func_cmp(const void *_a, const void *_b) +{ + const struct btf_func *a = _a; + const struct btf_func *b = _b; + + return strcmp(a->name, b->name); +} + +/* + * Collects all functions described in BTF. + * Returns non-zero on error. + */ +static int btf_encoder__collect_btf_funcs(struct btf_encoder *encoder, struct gobuffer *funcs) +{ + struct btf *btf = encoder->btf; + int nr_types, type_id; + int err = -1; + + /* First collect all the func entries into an array */ + nr_types = btf__type_cnt(btf); + for (type_id = 1; type_id < nr_types; type_id++) { + const struct btf_type *type; + struct btf_func func = {}; + const char *name; + + type = btf__type_by_id(btf, type_id); + if (!type) { + fprintf(stderr, "%s: malformed BTF, can't resolve type for ID %d\n", + __func__, type_id); + err = -EINVAL; + goto out; + } + + if (!btf_is_func(type)) + continue; + + name = btf__name_by_offset(btf, type->name_off); + if (!name) { + fprintf(stderr, "%s: malformed BTF, can't resolve name for ID %d\n", + __func__, type_id); + err = -EINVAL; + goto out; + } + + func.name = name; + func.type_id = type_id; + err = gobuffer__add(funcs, &func, sizeof(func)); + if (err < 0) + goto out; + } + + /* Now that we've collected funcs, sort them by name */ + qsort((void *)gobuffer__entries(funcs), gobuffer__nr_entries(funcs), + sizeof(struct btf_func), btf_func_cmp); + + err = 0; +out: + return err; +} + +static int btf_encoder__tag_kfunc(struct btf_encoder *encoder, struct gobuffer *funcs, const char *kfunc) +{ + struct btf_func key = { .name = kfunc }; + struct btf *btf = encoder->btf; + struct btf_func *target; + const void *base; + unsigned int cnt; + int err = -1; + + base = gobuffer__entries(funcs); + cnt = gobuffer__nr_entries(funcs); + target = bsearch(&key, base, cnt, sizeof(key), btf_func_cmp); + if (!target) { + fprintf(stderr, "%s: failed to find kfunc '%s' in BTF\n", __func__, kfunc); + goto out; + } + + /* Note we are unconditionally adding the btf_decl_tag even + * though vmlinux may already contain btf_decl_tags for kfuncs. + * We are ok to do this b/c we will later btf__dedup() to remove + * any duplicates. + */ + err = btf__add_decl_tag(btf, BTF_KFUNC_TYPE_TAG, target->type_id, -1); + if (err < 0) { + fprintf(stderr, "%s: failed to insert kfunc decl tag for '%s': %d\n", + __func__, kfunc, err); + goto out; + } + + err = 0; +out: + return err; +} + +static int btf_encoder__tag_kfuncs(struct btf_encoder *encoder) +{ + const char *filename = encoder->filename; + struct gobuffer btf_kfunc_ranges = {}; + struct gobuffer btf_funcs = {}; + Elf_Scn *symscn = NULL; + int symbols_shndx = -1; + int fd = -1, err = -1; + int idlist_shndx = -1; + Elf_Scn *scn = NULL; + size_t idlist_addr; + Elf_Data *symbols; + Elf_Data *idlist; + size_t strtabidx; + Elf *elf = NULL; + GElf_Shdr shdr; + size_t strndx; + char *secname; + int nr_syms; + int i = 0; + + fd = open(filename, O_RDONLY); + if (fd < 0) { + fprintf(stderr, "Cannot open %s\n", filename); + goto out; + } + + if (elf_version(EV_CURRENT) == EV_NONE) { + elf_error("Cannot set libelf version"); + goto out; + } + + elf = elf_begin(fd, ELF_C_READ, NULL); + if (elf == NULL) { + elf_error("Cannot update ELF file"); + goto out; + } + + /* Location symbol table and .BTF_ids sections */ + elf_getshdrstrndx(elf, &strndx); + while ((scn = elf_nextscn(elf, scn)) != NULL) { + Elf_Data *data; + + i++; + if (!gelf_getshdr(scn, &shdr)) { + elf_error("Failed to get ELF section(%d) hdr", i); + goto out; + } + + secname = elf_strptr(elf, strndx, shdr.sh_name); + if (!secname) { + elf_error("Failed to get ELF section(%d) hdr name", i); + goto out; + } + + data = elf_getdata(scn, 0); + if (!data) { + elf_error("Failed to get ELF section(%d) data", i); + goto out; + } + + if (shdr.sh_type == SHT_SYMTAB) { + symbols_shndx = i; + symscn = scn; + symbols = data; + strtabidx = shdr.sh_link; + } else if (!strcmp(secname, BTF_IDS_SECTION)) { + idlist_shndx = i; + idlist_addr = shdr.sh_addr; + idlist = data; + } + } + + /* Cannot resolve symbol or .BTF_ids sections. Nothing to do. */ + if (symbols_shndx == -1 || idlist_shndx == -1) { + err = 0; + goto out; + } + + if (!gelf_getshdr(symscn, &shdr)) { + elf_error("Failed to get ELF symbol table header"); + goto out; + } + nr_syms = shdr.sh_size / shdr.sh_entsize; + + err = btf_encoder__collect_btf_funcs(encoder, &btf_funcs); + if (err) { + fprintf(stderr, "%s: failed to collect BTF funcs\n", __func__); + goto out; + } + + /* First collect all kfunc set ranges. + * + * Note we choose not to sort these ranges and accept a linear + * search when doing lookups. Reasoning is that the number of + * sets is ~O(100) and not worth the additional code to optimize. + */ + for (i = 0; i < nr_syms; i++) { + struct btf_kfunc_set_range range = {}; + const char *name; + GElf_Sym sym; + + if (!gelf_getsym(symbols, i, &sym)) { + elf_error("Failed to get ELF symbol(%d)", i); + goto out; + } + + if (sym.st_shndx != idlist_shndx) + continue; + + name = elf_strptr(elf, strtabidx, sym.st_name); + if (!is_sym_kfunc_set(&sym, name, idlist, idlist_addr)) + continue; + + range.start = sym.st_value; + range.end = sym.st_value + sym.st_size; + gobuffer__add(&btf_kfunc_ranges, &range, sizeof(range)); + } + + /* Now inject BTF with kfunc decl tag for detected kfuncs */ + for (i = 0; i < nr_syms; i++) { + const struct btf_kfunc_set_range *ranges; + unsigned int ranges_cnt; + char *func, *name; + GElf_Sym sym; + bool found; + int err; + int j; + + if (!gelf_getsym(symbols, i, &sym)) { + elf_error("Failed to get ELF symbol(%d)", i); + goto out; + } + + if (sym.st_shndx != idlist_shndx) + continue; + + name = elf_strptr(elf, strtabidx, sym.st_name); + func = get_func_name(name); + if (!func) + continue; + + /* Check if function belongs to a kfunc set */ + ranges = gobuffer__entries(&btf_kfunc_ranges); + ranges_cnt = gobuffer__nr_entries(&btf_kfunc_ranges); + found = false; + for (j = 0; j < ranges_cnt; j++) { + size_t addr = sym.st_value; + + if (ranges[j].start <= addr && addr < ranges[j].end) { + found = true; + break; + } + } + if (!found) { + free(func); + continue; + } + + err = btf_encoder__tag_kfunc(encoder, &btf_funcs, func); + if (err) { + fprintf(stderr, "%s: failed to tag kfunc '%s'\n", __func__, func); + free(func); + goto out; + } + free(func); + } + + err = 0; +out: + __gobuffer__delete(&btf_funcs); + __gobuffer__delete(&btf_kfunc_ranges); + if (elf) + elf_end(elf); + if (fd != -1) + close(fd); + return err; +} + int btf_encoder__encode(struct btf_encoder *encoder) { int err; @@ -1367,6 +1718,14 @@ int btf_encoder__encode(struct btf_encoder *encoder) if (btf__type_cnt(encoder->btf) == 1) return 0; + /* Note vmlinux may already contain btf_decl_tag's for kfuncs. So + * take care to call this before btf_dedup(). + */ + if (encoder->tag_kfuncs && btf_encoder__tag_kfuncs(encoder)) { + fprintf(stderr, "%s: failed to tag kfuncs!\n", __func__); + return -1; + } + if (btf__dedup(encoder->btf, NULL)) { fprintf(stderr, "%s: btf__dedup failed!\n", __func__); return -1; -- 2.42.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves v4 2/2] pahole: Inject kfunc decl tags into BTF 2024-02-04 18:40 ` [PATCH dwarves v4 2/2] pahole: Inject kfunc decl tags into BTF Daniel Xu @ 2024-02-05 16:54 ` Eduard Zingerman 2024-02-05 23:31 ` Eduard Zingerman 2024-02-08 10:00 ` Alan Maguire 2 siblings, 0 replies; 14+ messages in thread From: Eduard Zingerman @ 2024-02-05 16:54 UTC (permalink / raw) To: Daniel Xu, acme, jolsa, quentin, alan.maguire Cc: andrii.nakryiko, ast, daniel, bpf On Sun, 2024-02-04 at 11:40 -0700, Daniel Xu wrote: [...] > +static int btf_encoder__tag_kfuncs(struct btf_encoder *encoder) > +{ > + const char *filename = encoder->filename; > + struct gobuffer btf_kfunc_ranges = {}; > + struct gobuffer btf_funcs = {}; > + Elf_Scn *symscn = NULL; > + int symbols_shndx = -1; > + int fd = -1, err = -1; > + int idlist_shndx = -1; > + Elf_Scn *scn = NULL; > + size_t idlist_addr; > + Elf_Data *symbols; > + Elf_Data *idlist; > + size_t strtabidx; > + Elf *elf = NULL; > + GElf_Shdr shdr; > + size_t strndx; > + char *secname; > + int nr_syms; > + int i = 0; Note: when compiled in Release mode (e.g. using buildcmd.sh from the repo) there is a number of false-positive warnings reported by GCC 13.2.1: $ ./buildcmd.sh ... In function ‘is_sym_kfunc_set’, inlined from ‘btf_encoder__tag_kfuncs’ at /home/eddy/work/dwarves-fork/btf_encoder.c:1639:8, inlined from ‘btf_encoder__encode’ at /home/eddy/work/dwarves-fork/btf_encoder.c:1724:29: /home/eddy/work/dwarves-fork/btf_encoder.c:1395:29: warning: ‘idlist_addr’ may be used uninitialized [-Wmaybe-uninitialized] 1395 | off = sym->st_value - idlist_addr; | ~~~~~~~~~~~~~~^~~~~~~~~~~~~ /home/eddy/work/dwarves-fork/btf_encoder.c: In function ‘btf_encoder__encode’: /home/eddy/work/dwarves-fork/btf_encoder.c:1538:16: note: ‘idlist_addr’ was declared here 1538 | size_t idlist_addr; | ^~~~~~~~~~~ In function ‘btf_encoder__tag_kfuncs’, inlined from ‘btf_encoder__encode’ at /home/eddy/work/dwarves-fork/btf_encoder.c:1724:29: Same thing is reported for: - btf_encoder.c:1630:22: warning: ‘symbols’ may be used uninitialized - btf_encoder.c:1385:15: warning: ‘idlist’ may be used uninitialized - btf_encoder.c:1638:24: warning: ‘strtabidx’ may be used uninitialized GCC does not figure out that the variables above are guarded by -1 checks below. [...] > + while ((scn = elf_nextscn(elf, scn)) != NULL) { [...] > + if (shdr.sh_type == SHT_SYMTAB) { > + symbols_shndx = i; > + symscn = scn; > + symbols = data; > + strtabidx = shdr.sh_link; > + } else if (!strcmp(secname, BTF_IDS_SECTION)) { > + idlist_shndx = i; > + idlist_addr = shdr.sh_addr; > + idlist = data; > + } > + } > + > + /* Cannot resolve symbol or .BTF_ids sections. Nothing to do. */ > + if (symbols_shndx == -1 || idlist_shndx == -1) { > + err = 0; > + goto out; > + } [...] ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves v4 2/2] pahole: Inject kfunc decl tags into BTF 2024-02-04 18:40 ` [PATCH dwarves v4 2/2] pahole: Inject kfunc decl tags into BTF Daniel Xu 2024-02-05 16:54 ` Eduard Zingerman @ 2024-02-05 23:31 ` Eduard Zingerman 2024-02-28 16:07 ` Daniel Xu 2024-02-08 10:00 ` Alan Maguire 2 siblings, 1 reply; 14+ messages in thread From: Eduard Zingerman @ 2024-02-05 23:31 UTC (permalink / raw) To: Daniel Xu, acme, jolsa, quentin, alan.maguire Cc: andrii.nakryiko, ast, daniel, bpf On Sun, 2024-02-04 at 11:40 -0700, Daniel Xu wrote: > This commit teaches pahole to parse symbols in .BTF_ids section in > vmlinux and discover exported kfuncs. Pahole then takes the list of > kfuncs and injects a BTF_KIND_DECL_TAG for each kfunc. > > Example of encoding: > > $ bpftool btf dump file .tmp_vmlinux.btf | rg "DECL_TAG 'bpf_kfunc'" | wc -l > 121 > > $ bpftool btf dump file .tmp_vmlinux.btf | rg 56337 > [56337] FUNC 'bpf_ct_change_timeout' type_id=56336 linkage=static > [127861] DECL_TAG 'bpf_kfunc' type_id=56337 component_idx=-1 > > This enables downstream users and tools to dynamically discover which > kfuncs are available on a system by parsing vmlinux or module BTF, both > available in /sys/kernel/btf. > > Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> > --- I've tested this patch-set using kernel built both with clang and gcc, on current bpf-next master (2d9a925d0fbf), both times get 124 kfunc definitions. Tested-by: Eduard Zingerman <eddyz87@gmail.com> Two nitpicks below. [...] > +static char *get_func_name(const char *sym) > +{ > + char *func, *end; > + > + if (strncmp(sym, BTF_ID_FUNC_PFX, sizeof(BTF_ID_FUNC_PFX) - 1)) > + return NULL; > + > + /* Strip prefix */ > + func = strdup(sym + sizeof(BTF_ID_FUNC_PFX) - 1); > + > + /* Strip suffix */ > + end = strrchr(func, '_'); > + if (!end || *(end - 1) != '_') { Nit: this would do out of bounds access on malformed input "__BTF_ID__func___" > + free(func); > + return NULL; > + } > + *(end - 1) = '\0'; > + > + return func; > +} [...] > +static int btf_encoder__tag_kfuncs(struct btf_encoder *encoder) > +{ [...] > + elf = elf_begin(fd, ELF_C_READ, NULL); > + if (elf == NULL) { > + elf_error("Cannot update ELF file"); > + goto out; > + } > + > + /* Location symbol table and .BTF_ids sections */ > + elf_getshdrstrndx(elf, &strndx); Nit: in theory elf_getshdrstrndx() could fail and strndx would remain uninitialized. [...] ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves v4 2/2] pahole: Inject kfunc decl tags into BTF 2024-02-05 23:31 ` Eduard Zingerman @ 2024-02-28 16:07 ` Daniel Xu 2024-02-28 21:33 ` Eduard Zingerman 0 siblings, 1 reply; 14+ messages in thread From: Daniel Xu @ 2024-02-28 16:07 UTC (permalink / raw) To: Eduard Zingerman Cc: acme, jolsa, quentin, alan.maguire, andrii.nakryiko, ast, daniel, bpf Hi Eduard, Apologies for long delay - life has been busy. On Tue, Feb 06, 2024 at 01:31:20AM +0200, Eduard Zingerman wrote: > On Sun, 2024-02-04 at 11:40 -0700, Daniel Xu wrote: > > This commit teaches pahole to parse symbols in .BTF_ids section in > > vmlinux and discover exported kfuncs. Pahole then takes the list of > > kfuncs and injects a BTF_KIND_DECL_TAG for each kfunc. > > > > Example of encoding: > > > > $ bpftool btf dump file .tmp_vmlinux.btf | rg "DECL_TAG 'bpf_kfunc'" | wc -l > > 121 > > > > $ bpftool btf dump file .tmp_vmlinux.btf | rg 56337 > > [56337] FUNC 'bpf_ct_change_timeout' type_id=56336 linkage=static > > [127861] DECL_TAG 'bpf_kfunc' type_id=56337 component_idx=-1 > > > > This enables downstream users and tools to dynamically discover which > > kfuncs are available on a system by parsing vmlinux or module BTF, both > > available in /sys/kernel/btf. > > > > Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> > > --- > > I've tested this patch-set using kernel built both with clang and gcc, > on current bpf-next master (2d9a925d0fbf), both times get 124 kfunc definitions. > > Tested-by: Eduard Zingerman <eddyz87@gmail.com> > > Two nitpicks below. > > [...] > > > +static char *get_func_name(const char *sym) > > +{ > > + char *func, *end; > > + > > + if (strncmp(sym, BTF_ID_FUNC_PFX, sizeof(BTF_ID_FUNC_PFX) - 1)) > > + return NULL; > > + > > + /* Strip prefix */ > > + func = strdup(sym + sizeof(BTF_ID_FUNC_PFX) - 1); > > + > > + /* Strip suffix */ > > + end = strrchr(func, '_'); > > + if (!end || *(end - 1) != '_') { > > Nit: this would do out of bounds access on malformed input > "__BTF_ID__func___" I think this is actually ok. Reason is we have the strncmp() above so we know the prefix is there. Then the strdup() in the malformed cased returns empty string. And strrchr() will then return NULL, so we don't enter the branch. I tested it with: https://pastes.dxuuu.xyz/c3j4kk $ gcc test.c dxu@kashmir~/scratch $ ./a.out name=(null) > > > + free(func); > > + return NULL; > > + } > > + *(end - 1) = '\0'; > > + > > + return func; > > +} > > [...] > > > +static int btf_encoder__tag_kfuncs(struct btf_encoder *encoder) > > +{ > > [...] > > > + elf = elf_begin(fd, ELF_C_READ, NULL); > > + if (elf == NULL) { > > + elf_error("Cannot update ELF file"); > > + goto out; > > + } > > + > > + /* Location symbol table and .BTF_ids sections */ > > + elf_getshdrstrndx(elf, &strndx); > > Nit: in theory elf_getshdrstrndx() could fail and strndx would remain > uninitialized. Sure, will fix. Also fixing typo (Location -> Locate). Thanks, Daniel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves v4 2/2] pahole: Inject kfunc decl tags into BTF 2024-02-28 16:07 ` Daniel Xu @ 2024-02-28 21:33 ` Eduard Zingerman 2024-03-15 19:43 ` Daniel Xu 0 siblings, 1 reply; 14+ messages in thread From: Eduard Zingerman @ 2024-02-28 21:33 UTC (permalink / raw) To: Daniel Xu Cc: acme, jolsa, quentin, alan.maguire, andrii.nakryiko, ast, daniel, bpf On Wed, 2024-02-28 at 09:07 -0700, Daniel Xu wrote: > Hi Eduard, > > Apologies for long delay - life has been busy. Hi Daniel, No problem, thank you for reaching back. [...] > > > +static char *get_func_name(const char *sym) > > > +{ > > > + char *func, *end; > > > + > > > + if (strncmp(sym, BTF_ID_FUNC_PFX, sizeof(BTF_ID_FUNC_PFX) - 1)) > > > + return NULL; > > > + > > > + /* Strip prefix */ > > > + func = strdup(sym + sizeof(BTF_ID_FUNC_PFX) - 1); > > > + > > > + /* Strip suffix */ > > > + end = strrchr(func, '_'); > > > + if (!end || *(end - 1) != '_') { > > > > Nit: this would do out of bounds access on malformed input > > "__BTF_ID__func___" > > I think this is actually ok. Reason is we have the strncmp() above > so we know the prefix is there. Then the strdup() in the malformed cased > returns empty string. And strrchr() will then return NULL, so we don't > enter the branch. > > I tested it with: https://pastes.dxuuu.xyz/c3j4kk > > $ gcc test.c > dxu@kashmir~/scratch $ ./a.out > name=(null) > The test is for "__BTF_ID__func__", but nitpick is for "__BTF_ID__func___" (three underscores in the end). E.g. here is a repro: --- 8< --------------------------------------------------------------- $ cat test.c #include <stdlib.h> #include <string.h> #include <stdio.h> #define BTF_ID_FUNC_PFX "__BTF_ID__func__" static char *get_func_name(const char *sym) { char *func, *end; if (strncmp(sym, BTF_ID_FUNC_PFX, sizeof(BTF_ID_FUNC_PFX) - 1)) return NULL; /* Strip prefix */ func = strdup(sym + sizeof(BTF_ID_FUNC_PFX) - 1); /* Strip suffix */ end = strrchr(func, '_'); if (!end || *(end - 1) != '_') { free(func); return NULL; } *(end - 1) = '\0'; return func; } int main(int argc, char *argv[]) { if (argc < 2) return -1; printf("name='%s;\n", get_func_name(argv[1])); return 0; } $ gcc -g test.c $ valgrind ./a.out __BTF_ID__func___ ==16856== Memcheck, a memory error detector ==16856== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al. ==16856== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info ==16856== Command: ./a.out __BTF_ID__func___ ==16856== ==16856== Invalid read of size 1 ==16856== at 0x4011EB: get_func_name (test.c:19) ==16856== by 0x401244: main (test.c:32) ==16856== Address 0x4a7e03f is 1 bytes before a block of size 2 alloc'd ==16856== at 0x4845784: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==16856== by 0x492176D: strdup (in /usr/lib64/libc.so.6) ==16856== by 0x4011C2: get_func_name (test.c:15) ==16856== by 0x401244: main (test.c:32) ==16856== name='(null); ==16856== ==16856== HEAP SUMMARY: ==16856== in use at exit: 0 bytes in 0 blocks ==16856== total heap usage: 2 allocs, 2 frees, 1,026 bytes allocated ==16856== ==16856== All heap blocks were freed -- no leaks are possible ==16856== ==16856== For lists of detected and suppressed errors, rerun with: -s ==16856== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0) --------------------------------------------------------------- >8 --- Thanks, Eduard ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves v4 2/2] pahole: Inject kfunc decl tags into BTF 2024-02-28 21:33 ` Eduard Zingerman @ 2024-03-15 19:43 ` Daniel Xu 0 siblings, 0 replies; 14+ messages in thread From: Daniel Xu @ 2024-03-15 19:43 UTC (permalink / raw) To: Eduard Zingerman Cc: acme, jolsa, quentin, alan.maguire, andrii.nakryiko, ast, daniel, bpf Hi Eduard, On Wed, Feb 28, 2024 at 11:33:28PM +0200, Eduard Zingerman wrote: > On Wed, 2024-02-28 at 09:07 -0700, Daniel Xu wrote: > > Hi Eduard, > > > > Apologies for long delay - life has been busy. > > Hi Daniel, > > No problem, thank you for reaching back. > > [...] > > > > > +static char *get_func_name(const char *sym) > > > > +{ > > > > + char *func, *end; > > > > + > > > > + if (strncmp(sym, BTF_ID_FUNC_PFX, sizeof(BTF_ID_FUNC_PFX) - 1)) > > > > + return NULL; > > > > + > > > > + /* Strip prefix */ > > > > + func = strdup(sym + sizeof(BTF_ID_FUNC_PFX) - 1); > > > > + > > > > + /* Strip suffix */ > > > > + end = strrchr(func, '_'); > > > > + if (!end || *(end - 1) != '_') { > > > > > > Nit: this would do out of bounds access on malformed input > > > "__BTF_ID__func___" > > > > I think this is actually ok. Reason is we have the strncmp() above > > so we know the prefix is there. Then the strdup() in the malformed cased > > returns empty string. And strrchr() will then return NULL, so we don't > > enter the branch. > > > > I tested it with: https://pastes.dxuuu.xyz/c3j4kk > > > > $ gcc test.c > > dxu@kashmir~/scratch $ ./a.out > > name=(null) > > > > The test is for "__BTF_ID__func__", but nitpick is for "__BTF_ID__func___" > (three underscores in the end). > Ha, got it. Didn't see the 3rd one. Fixed in v5. [...] Thanks, Daniel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves v4 2/2] pahole: Inject kfunc decl tags into BTF 2024-02-04 18:40 ` [PATCH dwarves v4 2/2] pahole: Inject kfunc decl tags into BTF Daniel Xu 2024-02-05 16:54 ` Eduard Zingerman 2024-02-05 23:31 ` Eduard Zingerman @ 2024-02-08 10:00 ` Alan Maguire 2024-02-29 0:57 ` Daniel Xu 2 siblings, 1 reply; 14+ messages in thread From: Alan Maguire @ 2024-02-08 10:00 UTC (permalink / raw) To: Daniel Xu, acme, jolsa, quentin; +Cc: andrii.nakryiko, ast, daniel, bpf On 04/02/2024 18:40, Daniel Xu wrote: > This commit teaches pahole to parse symbols in .BTF_ids section in > vmlinux and discover exported kfuncs. Pahole then takes the list of > kfuncs and injects a BTF_KIND_DECL_TAG for each kfunc. > > Example of encoding: > > $ bpftool btf dump file .tmp_vmlinux.btf | rg "DECL_TAG 'bpf_kfunc'" | wc -l > 121 > > $ bpftool btf dump file .tmp_vmlinux.btf | rg 56337 > [56337] FUNC 'bpf_ct_change_timeout' type_id=56336 linkage=static > [127861] DECL_TAG 'bpf_kfunc' type_id=56337 component_idx=-1 > > This enables downstream users and tools to dynamically discover which > kfuncs are available on a system by parsing vmlinux or module BTF, both > available in /sys/kernel/btf. > > Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Looks good! A few suggestions below.. > --- > btf_encoder.c | 359 ++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 359 insertions(+) > > diff --git a/btf_encoder.c b/btf_encoder.c > index e325f66..d6a561c 100644 > --- a/btf_encoder.c > +++ b/btf_encoder.c > @@ -34,6 +34,21 @@ > #include <pthread.h> > > #define BTF_ENCODER_MAX_PROTO 512 > +#define BTF_IDS_SECTION ".BTF_ids" > +#define BTF_ID_FUNC_PFX "__BTF_ID__func__" > +#define BTF_ID_SET8_PFX "__BTF_ID__set8__" > +#define BTF_SET8_KFUNCS (1 << 0) > +#define BTF_KFUNC_TYPE_TAG "bpf_kfunc" > + > +/* Adapted from include/linux/btf_ids.h */ > +struct btf_id_set8 { > + uint32_t cnt; > + uint32_t flags; > + struct { > + uint32_t id; > + uint32_t flags; > + } pairs[]; > +}; > > /* state used to do later encoding of saved functions */ > struct btf_encoder_state { > @@ -95,6 +110,17 @@ struct btf_encoder { > } functions; > }; > > +struct btf_func { > + const char *name; > + int type_id; > +}; > + > +/* Half open interval representing range of addresses containing kfuncs */ > +struct btf_kfunc_set_range { > + size_t start; > + size_t end; > +}; > + > static LIST_HEAD(encoders); > static pthread_mutex_t encoders__lock = PTHREAD_MUTEX_INITIALIZER; > > @@ -1353,6 +1379,331 @@ out: > return err; > } > > +/* Returns if `sym` points to a kfunc set */ > +static int is_sym_kfunc_set(GElf_Sym *sym, const char *name, Elf_Data *idlist, size_t idlist_addr) > +{ > + void *ptr = idlist->d_buf; > + struct btf_id_set8 *set; > + bool is_set8; > + int off; > + > + /* kfuncs are only found in BTF_SET8's */ > + is_set8 = !strncmp(name, BTF_ID_SET8_PFX, sizeof(BTF_ID_SET8_PFX) - 1); > + if (!is_set8) > + return false; > + > + off = sym->st_value - idlist_addr; > + if (off >= idlist->d_size) { > + fprintf(stderr, "%s: symbol '%s' out of bounds\n", __func__, name); > + return false; > + } > + > + /* Check the set8 flags to see if it was marked as kfunc */ > + set = ptr + off; > + return set->flags & BTF_SET8_KFUNCS; > +} > + > +/* > + * Parse BTF_ID symbol and return the func name. > + * > + * Returns: > + * Caller-owned string containing func name if successful. > + * NULL if !func or on error. > + */ > +static char *get_func_name(const char *sym) > +{ > + char *func, *end; > + > + if (strncmp(sym, BTF_ID_FUNC_PFX, sizeof(BTF_ID_FUNC_PFX) - 1)) > + return NULL; > + > + /* Strip prefix */ > + func = strdup(sym + sizeof(BTF_ID_FUNC_PFX) - 1); > + > + /* Strip suffix */ > + end = strrchr(func, '_'); > + if (!end || *(end - 1) != '_') { > + free(func); > + return NULL; > + } > + *(end - 1) = '\0'; > + > + return func; > +} > + > +static int btf_func_cmp(const void *_a, const void *_b) > +{ > + const struct btf_func *a = _a; > + const struct btf_func *b = _b; > + > + return strcmp(a->name, b->name); > +} > + > +/* > + * Collects all functions described in BTF. > + * Returns non-zero on error. > + */ > +static int btf_encoder__collect_btf_funcs(struct btf_encoder *encoder, struct gobuffer *funcs) > +{ > + struct btf *btf = encoder->btf; > + int nr_types, type_id; > + int err = -1; > + > + /* First collect all the func entries into an array */ > + nr_types = btf__type_cnt(btf); > + for (type_id = 1; type_id < nr_types; type_id++) { > + const struct btf_type *type; > + struct btf_func func = {}; > + const char *name; > + > + type = btf__type_by_id(btf, type_id); > + if (!type) { > + fprintf(stderr, "%s: malformed BTF, can't resolve type for ID %d\n", > + __func__, type_id); > + err = -EINVAL; > + goto out; > + } > + > + if (!btf_is_func(type)) > + continue; > + > + name = btf__name_by_offset(btf, type->name_off); > + if (!name) { > + fprintf(stderr, "%s: malformed BTF, can't resolve name for ID %d\n", > + __func__, type_id); > + err = -EINVAL; > + goto out; > + } > + > + func.name = name; > + func.type_id = type_id; > + err = gobuffer__add(funcs, &func, sizeof(func)); > + if (err < 0) > + goto out; > + } > + > + /* Now that we've collected funcs, sort them by name */ > + qsort((void *)gobuffer__entries(funcs), gobuffer__nr_entries(funcs), > + sizeof(struct btf_func), btf_func_cmp); > + > + err = 0; > +out: > + return err; > +} > + > +static int btf_encoder__tag_kfunc(struct btf_encoder *encoder, struct gobuffer *funcs, const char *kfunc) > +{ > + struct btf_func key = { .name = kfunc }; > + struct btf *btf = encoder->btf; > + struct btf_func *target; > + const void *base; > + unsigned int cnt; > + int err = -1; > + > + base = gobuffer__entries(funcs); > + cnt = gobuffer__nr_entries(funcs); > + target = bsearch(&key, base, cnt, sizeof(key), btf_func_cmp); > + if (!target) { > + fprintf(stderr, "%s: failed to find kfunc '%s' in BTF\n", __func__, kfunc); > + goto out; > + } > + > + /* Note we are unconditionally adding the btf_decl_tag even > + * though vmlinux may already contain btf_decl_tags for kfuncs. > + * We are ok to do this b/c we will later btf__dedup() to remove > + * any duplicates. > + */ > + err = btf__add_decl_tag(btf, BTF_KFUNC_TYPE_TAG, target->type_id, -1); > + if (err < 0) { > + fprintf(stderr, "%s: failed to insert kfunc decl tag for '%s': %d\n", > + __func__, kfunc, err); > + goto out; > + } > + > + err = 0; > +out: > + return err; > +} > + > +static int btf_encoder__tag_kfuncs(struct btf_encoder *encoder) > +{ > + const char *filename = encoder->filename; > + struct gobuffer btf_kfunc_ranges = {}; > + struct gobuffer btf_funcs = {}; > + Elf_Scn *symscn = NULL; > + int symbols_shndx = -1; > + int fd = -1, err = -1; > + int idlist_shndx = -1; > + Elf_Scn *scn = NULL; > + size_t idlist_addr; > + Elf_Data *symbols; > + Elf_Data *idlist; > + size_t strtabidx; > + Elf *elf = NULL; > + GElf_Shdr shdr; > + size_t strndx; > + char *secname; > + int nr_syms; > + int i = 0; > + > + fd = open(filename, O_RDONLY); > + if (fd < 0) { > + fprintf(stderr, "Cannot open %s\n", filename); > + goto out; > + } > + > + if (elf_version(EV_CURRENT) == EV_NONE) { > + elf_error("Cannot set libelf version"); > + goto out; > + } > + > + elf = elf_begin(fd, ELF_C_READ, NULL); > + if (elf == NULL) { > + elf_error("Cannot update ELF file"); > + goto out; > + } > + > + /* Location symbol table and .BTF_ids sections */ > + elf_getshdrstrndx(elf, &strndx); > + while ((scn = elf_nextscn(elf, scn)) != NULL) { > + Elf_Data *data; > + > + i++; > + if (!gelf_getshdr(scn, &shdr)) { > + elf_error("Failed to get ELF section(%d) hdr", i); > + goto out; > + } > + > + secname = elf_strptr(elf, strndx, shdr.sh_name); > + if (!secname) { > + elf_error("Failed to get ELF section(%d) hdr name", i); > + goto out; > + } > + > + data = elf_getdata(scn, 0); > + if (!data) { > + elf_error("Failed to get ELF section(%d) data", i); > + goto out; > + } > + > + if (shdr.sh_type == SHT_SYMTAB) { > + symbols_shndx = i; > + symscn = scn; > + symbols = data; > + strtabidx = shdr.sh_link; > + } else if (!strcmp(secname, BTF_IDS_SECTION)) { > + idlist_shndx = i; > + idlist_addr = shdr.sh_addr; > + idlist = data; > + } > + } > + Can we use the existing list of ELF functions collected via btf_encoder__collect_symbols() for the above? We merge info across CUs about ELF functions. It _seems_ like there might be a way to re-use this info but I might be missing something; more below.. > + /* Cannot resolve symbol or .BTF_ids sections. Nothing to do. */ > + if (symbols_shndx == -1 || idlist_shndx == -1) { > + err = 0; > + goto out; > + } > + > + if (!gelf_getshdr(symscn, &shdr)) { > + elf_error("Failed to get ELF symbol table header"); > + goto out; > + } > + nr_syms = shdr.sh_size / shdr.sh_entsize; > + > + err = btf_encoder__collect_btf_funcs(encoder, &btf_funcs); > + if (err) { > + fprintf(stderr, "%s: failed to collect BTF funcs\n", __func__); > + goto out; > + } > + > + /* First collect all kfunc set ranges. > + * > + * Note we choose not to sort these ranges and accept a linear > + * search when doing lookups. Reasoning is that the number of > + * sets is ~O(100) and not worth the additional code to optimize. > + */ > + for (i = 0; i < nr_syms; i++) { > + struct btf_kfunc_set_range range = {}; > + const char *name; > + GElf_Sym sym; > + > + if (!gelf_getsym(symbols, i, &sym)) { > + elf_error("Failed to get ELF symbol(%d)", i); > + goto out; > + } > + > + if (sym.st_shndx != idlist_shndx) > + continue; > + > + name = elf_strptr(elf, strtabidx, sym.st_name); > + if (!is_sym_kfunc_set(&sym, name, idlist, idlist_addr)) > + continue; > + > + range.start = sym.st_value; > + range.end = sym.st_value + sym.st_size; we could potentially record this info when we collect symbols in btf_encoder__collect_function(). The reason I suggest this is that it is likely that to fully clarify which symbol a name refers to we will end up needing the address. So struct elf_function could record start and size, and that could be used by you later without having to parse ELF for symbols (you'd still need to for the BTF ids section). Then all you'd need to do is iterate over BTF functions, using btf_encoder__find_function() to get a function and associated ELF info by name. > + gobuffer__add(&btf_kfunc_ranges, &range, sizeof(range)); > + } > + > + /* Now inject BTF with kfunc decl tag for detected kfuncs */ > + for (i = 0; i < nr_syms; i++) { > + const struct btf_kfunc_set_range *ranges; > + unsigned int ranges_cnt; > + char *func, *name; > + GElf_Sym sym; > + bool found; > + int err; > + int j; > + > + if (!gelf_getsym(symbols, i, &sym)) { > + elf_error("Failed to get ELF symbol(%d)", i); > + goto out; > + } > + > + if (sym.st_shndx != idlist_shndx) > + continue; > + > + name = elf_strptr(elf, strtabidx, sym.st_name); > + func = get_func_name(name); > + if (!func) > + continue; > + > + /* Check if function belongs to a kfunc set */ > + ranges = gobuffer__entries(&btf_kfunc_ranges); > + ranges_cnt = gobuffer__nr_entries(&btf_kfunc_ranges); > + found = false; > + for (j = 0; j < ranges_cnt; j++) { > + size_t addr = sym.st_value; > + > + if (ranges[j].start <= addr && addr < ranges[j].end) { > + found = true; > + break; > + } > + } > + if (!found) { > + free(func); > + continue; > + } > + > + err = btf_encoder__tag_kfunc(encoder, &btf_funcs, func); > + if (err) { > + fprintf(stderr, "%s: failed to tag kfunc '%s'\n", __func__, func); > + free(func); > + goto out; > + } > + free(func); > + } > + > + err = 0; > +out: > + __gobuffer__delete(&btf_funcs); > + __gobuffer__delete(&btf_kfunc_ranges); > + if (elf) > + elf_end(elf); > + if (fd != -1) > + close(fd); > + return err; > +} > + > int btf_encoder__encode(struct btf_encoder *encoder) > { > int err; > @@ -1367,6 +1718,14 @@ int btf_encoder__encode(struct btf_encoder *encoder) > if (btf__type_cnt(encoder->btf) == 1) > return 0; > > + /* Note vmlinux may already contain btf_decl_tag's for kfuncs. So > + * take care to call this before btf_dedup(). > + */ sorry another thing I should have thought of here; if the user has asked to skip encoding declaration tags, we should respect that for this case also. So you'll probably need to set encoder->skip_encoding_btf_decl_tag = conf_load->skip_encoding_btf_decl_tag; ..earlier on, and bail here if encoder->skip_encoding_btf_decl_tag is true. We'd need to be consistent in that if the user asks not to encode declaration tags we don't do it for this case either. > + if (encoder->tag_kfuncs && btf_encoder__tag_kfuncs(encoder)) { > + fprintf(stderr, "%s: failed to tag kfuncs!\n", __func__); > + return -1; > + } > + > if (btf__dedup(encoder->btf, NULL)) { > fprintf(stderr, "%s: btf__dedup failed!\n", __func__); > return -1; ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves v4 2/2] pahole: Inject kfunc decl tags into BTF 2024-02-08 10:00 ` Alan Maguire @ 2024-02-29 0:57 ` Daniel Xu 2024-03-13 3:17 ` Daniel Xu 0 siblings, 1 reply; 14+ messages in thread From: Daniel Xu @ 2024-02-29 0:57 UTC (permalink / raw) To: Alan Maguire; +Cc: acme, jolsa, quentin, andrii.nakryiko, ast, daniel, bpf Hi Alan, On Thu, Feb 08, 2024 at 10:00:15AM +0000, Alan Maguire wrote: > On 04/02/2024 18:40, Daniel Xu wrote: [...] > > + > > + if (shdr.sh_type == SHT_SYMTAB) { > > + symbols_shndx = i; > > + symscn = scn; > > + symbols = data; > > + strtabidx = shdr.sh_link; > > + } else if (!strcmp(secname, BTF_IDS_SECTION)) { > > + idlist_shndx = i; > > + idlist_addr = shdr.sh_addr; > > + idlist = data; > > + } > > + } > > + > > Can we use the existing list of ELF functions collected via > btf_encoder__collect_symbols() for the above? We merge info across CUs > about ELF functions. It _seems_ like there might be a way to re-use this > info but I might be missing something; more below.. So the above two sections are only used to acquire information on the set8's. For collecting functions, this patch uses BTF (see btf_encoder__collect_btf_funcs()). > > > > + /* Cannot resolve symbol or .BTF_ids sections. Nothing to do. */ > > + if (symbols_shndx == -1 || idlist_shndx == -1) { > > + err = 0; > > + goto out; > > + } > > + > > + if (!gelf_getshdr(symscn, &shdr)) { > > + elf_error("Failed to get ELF symbol table header"); > > + goto out; > > + } > > + nr_syms = shdr.sh_size / shdr.sh_entsize; > > + > > + err = btf_encoder__collect_btf_funcs(encoder, &btf_funcs); > > + if (err) { > > + fprintf(stderr, "%s: failed to collect BTF funcs\n", __func__); > > + goto out; > > + } > > + > > + /* First collect all kfunc set ranges. > > + * > > + * Note we choose not to sort these ranges and accept a linear > > + * search when doing lookups. Reasoning is that the number of > > + * sets is ~O(100) and not worth the additional code to optimize. > > + */ > > + for (i = 0; i < nr_syms; i++) { > > + struct btf_kfunc_set_range range = {}; > > + const char *name; > > + GElf_Sym sym; > > + > > + if (!gelf_getsym(symbols, i, &sym)) { > > + elf_error("Failed to get ELF symbol(%d)", i); > > + goto out; > > + } > > + > > + if (sym.st_shndx != idlist_shndx) > > + continue; > > + > > + name = elf_strptr(elf, strtabidx, sym.st_name); > > + if (!is_sym_kfunc_set(&sym, name, idlist, idlist_addr)) > > + continue; > > + > > + range.start = sym.st_value; > > + range.end = sym.st_value + sym.st_size; > > we could potentially record this info when we collect symbols in > btf_encoder__collect_function(). The reason I suggest this is that it is > likely that to fully clarify which symbol a name refers to we will end > up needing the address. So struct elf_function could record start and > size, and that could be used by you later without having to parse ELF > for symbols (you'd still need to for the BTF ids section). This start/end pair is just for the set8's in .BTF_ids section. Which btf_encoder__collect_function() rightfully does not look at. So I think new code needs to be added for set8 collection anyways. I don't have any strong feelings about whether we should hook into btf_encoder__collect_symbols() or not -- IMHO it's a bit cleaner to have all the set8 stuff on its own codepath. What cases do you see needing the location + size for? Since kfuncs are exported, I would think it's not possible for a single object file to have multiple copies of the same kfunc. Nor that the compiler inline the kfunc. > > Then all you'd need to do is iterate over BTF functions, using > btf_encoder__find_function() to get a function and associated ELF info > by name. Didn't know about this, thanks. I'll take a look at if the patch can use the existing function metadata. That should get rid of btf_encoder__collect_btf_funcs() if it works. > > > > + gobuffer__add(&btf_kfunc_ranges, &range, sizeof(range)); > > + } > > + > > + /* Now inject BTF with kfunc decl tag for detected kfuncs */ > > + for (i = 0; i < nr_syms; i++) { > > + const struct btf_kfunc_set_range *ranges; > > + unsigned int ranges_cnt; > > + char *func, *name; > > + GElf_Sym sym; > > + bool found; > > + int err; > > + int j; > > + > > + if (!gelf_getsym(symbols, i, &sym)) { > > + elf_error("Failed to get ELF symbol(%d)", i); > > + goto out; > > + } > > + > > + if (sym.st_shndx != idlist_shndx) > > + continue; > > + > > + name = elf_strptr(elf, strtabidx, sym.st_name); > > + func = get_func_name(name); > > + if (!func) > > + continue; > > + > > + /* Check if function belongs to a kfunc set */ > > + ranges = gobuffer__entries(&btf_kfunc_ranges); > > + ranges_cnt = gobuffer__nr_entries(&btf_kfunc_ranges); > > + found = false; > > + for (j = 0; j < ranges_cnt; j++) { > > + size_t addr = sym.st_value; > > + > > + if (ranges[j].start <= addr && addr < ranges[j].end) { > > + found = true; > > + break; > > + } > > + } > > + if (!found) { > > + free(func); > > + continue; > > + } > > + > > + err = btf_encoder__tag_kfunc(encoder, &btf_funcs, func); > > + if (err) { > > + fprintf(stderr, "%s: failed to tag kfunc '%s'\n", __func__, func); > > + free(func); > > + goto out; > > + } > > + free(func); > > + } > > + > > + err = 0; > > +out: > > + __gobuffer__delete(&btf_funcs); > > + __gobuffer__delete(&btf_kfunc_ranges); > > + if (elf) > > + elf_end(elf); > > + if (fd != -1) > > + close(fd); > > + return err; > > +} > > + > > int btf_encoder__encode(struct btf_encoder *encoder) > > { > > int err; > > @@ -1367,6 +1718,14 @@ int btf_encoder__encode(struct btf_encoder *encoder) > > if (btf__type_cnt(encoder->btf) == 1) > > return 0; > > > > + /* Note vmlinux may already contain btf_decl_tag's for kfuncs. So > > + * take care to call this before btf_dedup(). > > + */ > > sorry another thing I should have thought of here; if the user has asked > to skip encoding declaration tags, we should respect that for this case > also. So you'll probably need to set > > encoder->skip_encoding_btf_decl_tag = > conf_load->skip_encoding_btf_decl_tag; > > ..earlier on, and bail here if encoder->skip_encoding_btf_decl_tag is > true. We'd need to be consistent in that if the user asks not to encode > declaration tags we don't do it for this case either. Yeah, good catch. Will do. [...] Thanks, Daniel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves v4 2/2] pahole: Inject kfunc decl tags into BTF 2024-02-29 0:57 ` Daniel Xu @ 2024-03-13 3:17 ` Daniel Xu 2024-03-13 9:36 ` Alan Maguire 0 siblings, 1 reply; 14+ messages in thread From: Daniel Xu @ 2024-03-13 3:17 UTC (permalink / raw) To: Alan Maguire; +Cc: acme, jolsa, quentin, andrii.nakryiko, ast, daniel, bpf Hi Alan, On Wed, Feb 28, 2024 at 05:57:01PM -0700, Daniel Xu wrote: > Hi Alan, > > On Thu, Feb 08, 2024 at 10:00:15AM +0000, Alan Maguire wrote: > > On 04/02/2024 18:40, Daniel Xu wrote: [...] > > > > Then all you'd need to do is iterate over BTF functions, using > > btf_encoder__find_function() to get a function and associated ELF info > > by name. > > Didn't know about this, thanks. I'll take a look at if the patch can use > the existing function metadata. That should get rid of > btf_encoder__collect_btf_funcs() if it works. I experimented with this a bit and I'm not sure if it's a good approach. Here are the two commits: https://pastes.dxuuu.xyz/xo9jwk https://pastes.dxuuu.xyz/xmzew5 Basically my approach was to fan-in all the function info collected by the different threads. I don't think it'll work cuz some of the data (the name in particular) in struct elf_function belongs to the thread's struct btf_encoder instance. Which is all freed by btf_encoder__delete() before btf_encoder__encode(). It could probably be fixed, but doesn't seem very clean to me. So I think it'll be better to keep the code as-is. Unless you were thinking something different. Thanks, Daniel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves v4 2/2] pahole: Inject kfunc decl tags into BTF 2024-03-13 3:17 ` Daniel Xu @ 2024-03-13 9:36 ` Alan Maguire 0 siblings, 0 replies; 14+ messages in thread From: Alan Maguire @ 2024-03-13 9:36 UTC (permalink / raw) To: Daniel Xu; +Cc: acme, jolsa, quentin, andrii.nakryiko, ast, daniel, bpf On 13/03/2024 03:17, Daniel Xu wrote: > Hi Alan, > > On Wed, Feb 28, 2024 at 05:57:01PM -0700, Daniel Xu wrote: >> Hi Alan, >> >> On Thu, Feb 08, 2024 at 10:00:15AM +0000, Alan Maguire wrote: >>> On 04/02/2024 18:40, Daniel Xu wrote: > > [...] > >>> >>> Then all you'd need to do is iterate over BTF functions, using >>> btf_encoder__find_function() to get a function and associated ELF info >>> by name. >> >> Didn't know about this, thanks. I'll take a look at if the patch can use >> the existing function metadata. That should get rid of >> btf_encoder__collect_btf_funcs() if it works. > > I experimented with this a bit and I'm not sure if it's a good approach. > > Here are the two commits: > > https://pastes.dxuuu.xyz/xo9jwk > https://pastes.dxuuu.xyz/xmzew5 > > Basically my approach was to fan-in all the function info collected by > the different threads. I don't think it'll work cuz some of the data > (the name in particular) in struct elf_function belongs to the thread's > struct btf_encoder instance. Which is all freed by btf_encoder__delete() > before btf_encoder__encode(). > > It could probably be fixed, but doesn't seem very clean to me. So I > think it'll be better to keep the code as-is. Unless you were thinking > something different. > Thanks for trying! We may end up revisiting the freeing of ELF function/variable info in the future if we add address information from symbols into BTF, but since that's not there yet, it makes sense to do your collection separately. Alan ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2024-03-15 19:43 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-02-04 18:40 [PATCH dwarves v4 0/2] pahole: Inject kfunc decl tags into BTF Daniel Xu 2024-02-04 18:40 ` [PATCH dwarves v4 1/2] pahole: Add --btf_feature=decl_tag_kfuncs feature Daniel Xu 2024-02-06 12:47 ` Jiri Olsa 2024-02-06 14:14 ` Alan Maguire 2024-02-04 18:40 ` [PATCH dwarves v4 2/2] pahole: Inject kfunc decl tags into BTF Daniel Xu 2024-02-05 16:54 ` Eduard Zingerman 2024-02-05 23:31 ` Eduard Zingerman 2024-02-28 16:07 ` Daniel Xu 2024-02-28 21:33 ` Eduard Zingerman 2024-03-15 19:43 ` Daniel Xu 2024-02-08 10:00 ` Alan Maguire 2024-02-29 0:57 ` Daniel Xu 2024-03-13 3:17 ` Daniel Xu 2024-03-13 9:36 ` Alan Maguire
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox