* [PATCH dwarves 0/4] Emit global variables in BTF
@ 2024-09-12 19:08 Stephen Brennan
2024-09-12 19:08 ` [PATCH dwarves 1/4] dutil: return ELF section name when looked up by index Stephen Brennan
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Stephen Brennan @ 2024-09-12 19:08 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: dwarves, linux-debuggers, Stephen Brennan, Alan Maguire
Hello all,
Currently, pahole has the "var" feature which emits a single DATASEC for
the ".data..percpu" section, and emits only the variables from that
section. I work on the drgn debugger [1] and have been working specifically
on features that allow users to debug their kernel without necessarily
having DWARF debuginfo [2]. To that end, I'd like to see *all* global
variables included in the BTF, not just percpu variables. But this woudln't
just be useful for debuggers: libbpf already supports resolution of global
variables with __ksym annotations, but these have to be declared as void.
For example, in tools/bpf/bpftool/skeleton/pid_iter.bpf.c we have:
extern const void bpf_map_fops __ksym;
With global variable information, declarations like these would be able to
use the actial variable types, for example:
extern const struct file_operations bpf_map_fops __ksym;
To that end, this patch series adds a pahole feature "global_var", which
includes global variable type information in the BTF. Most of the important
code comes in patch 4. The major difficulty is in deciding which DWARF
variables should be emitted. The current pahole implementation considers
any variable with a static memory address a global, but that includes
static local variables, so I've added a flag to indicate when variables are
declared at the top level of a CU. There are some other useless kernel
variables which are generated by macros, etc. I filter these by name, which
isn't ideal, but works well.
To test and measure the size impact, I've built v6.11-rc7 on x86_64 with a
defconfig, and enabled the necessary BTF support. For one build, I used
pahole built from master, and for the other, I used these patches on top.
Here is a comparison of total vmlinux and module BTF size for old and new.
Before After
vmlinux BTF size 0x5f4a03 (5.95 MiB) 0x77e948 (7.49 MiB)
increase: 1.54 MiB (25.8%)
#modules 10 10
module BTF size 0x20c5 (8.19 KiB) 0x3235 (12.55 KiB)
increase: 4.36 KiB (53.2%)
The module size increases more compared to the vmlinux, because each
module's BTF doesn't define many types, so the variables are larger as a
percentage of the total. These are also rather small modules (whichever few
were enabled for the build of an x86_64 defconfig), I'd expect larger
modules to have a smaller relative increase, because they would define more
of their own types.
To sanity check the output, I looked at the "DATASEC" generated for the
percpu section, and verified that it was nearly identical. The only
difference (besides the type IDs) was the addition of the fixed_percpu_data
variable, which had been omitted by pahole's master branch:
--- percpu_datasec_master.txt 2024-09-12 00:04:37.255240578 +0000
+++ percpu_datasec_global_var.txt 2024-09-12 00:04:24.214980821 +0000
@@ -1,4 +1,5 @@
-DATASEC '.data..percpu' size=199384 vlen=302
+DATASEC '.data..percpu' size=199384 vlen=303
+offset=0 size=48 (VAR 'fixed_percpu_data')
offset=4096 size=4096 (VAR 'cpu_debug_store')
offset=8192 size=16384 (VAR 'irq_stack_backing_store')
offset=24576 size=20480 (VAR 'cpu_tss_rw')
Finally, you can find below a summary of the vmlinux DATASEC outputs which
were added by this patch in this build scenario.
DATASEC '.rodata' size=2910928 vlen=6514
DATASEC '__init_rodata' size=48 vlen=1
DATASEC '__param' size=21640 vlen=537
DATASEC '__modver' size=1008 vlen=14
DATASEC '.data' size=2953985 vlen=16451
DATASEC '.orc_header' size=20 vlen=1
DATASEC '.vvar' size=656 vlen=2
DATASEC '.data..percpu' size=199384 vlen=303
DATASEC '.init.data' size=690040 vlen=4830
DATASEC '.x86_cpu_dev.init' size=40 vlen=5
DATASEC '.apicdrivers' size=16 vlen=1
DATASEC '.data_nosave' size=4 vlen=1
DATASEC '.bss' size=693060 vlen=2532
DATASEC '.brk' size=196608 vlen=2
I know the overheads are non-negligible, but I think there will be
considerable value in having the global variables present. Thanks for
reading this and I look forward to discussing!
[1]: https://github.com/osandov/drgn
[2]: https://github.com/osandov/drgn/issues/176
PS: Previous iterations were sent in 2022:
v1 https://lore.kernel.org/dwarves/20220826184911.168442-1-stephen.s.brennan@oracle.com/
v2 https://lore.kernel.org/dwarves/20221104231103.752040-1-stephen.s.brennan@oracle.com/
Difference from these: I've gotten a much better understanding of DWARF,
BTF, and the pahole code since then. In particular, the older patches did
not include DATASEC for anything other than percpu variables. Also, since
those older patches, I've dramatically improved the variable filtering,
including using DW_AT_artificial, name filtering, and getting rid of static
local variables.
Stephen Brennan (4):
dutil: return ELF section name when looked up by index
dwarf_loader: add "artificial" and "top_level" variable flags
btf_encoder: cache all ELF section info
btf_encoder: add global_var feature to encode globals
btf_encoder.c | 371 ++++++++++++++++++++++-----------------------
btf_encoder.h | 8 +
dutil.c | 9 +-
dutil.h | 2 +-
dwarf_loader.c | 12 +-
dwarves.h | 3 +
man-pages/pahole.1 | 8 +-
pahole.c | 11 +-
8 files changed, 223 insertions(+), 201 deletions(-)
--
2.43.5
^ permalink raw reply [flat|nested] 14+ messages in thread* [PATCH dwarves 1/4] dutil: return ELF section name when looked up by index 2024-09-12 19:08 [PATCH dwarves 0/4] Emit global variables in BTF Stephen Brennan @ 2024-09-12 19:08 ` Stephen Brennan 2024-09-13 13:26 ` Alan Maguire 2024-09-12 19:08 ` [PATCH dwarves 2/4] dwarf_loader: add "artificial" and "top_level" variable flags Stephen Brennan ` (2 subsequent siblings) 3 siblings, 1 reply; 14+ messages in thread From: Stephen Brennan @ 2024-09-12 19:08 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: dwarves, linux-debuggers, Stephen Brennan, Alan Maguire Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> --- dutil.c | 9 ++++++++- dutil.h | 2 +- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/dutil.c b/dutil.c index 97c4474..4e83d59 100644 --- a/dutil.c +++ b/dutil.c @@ -207,13 +207,20 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Shdr *shp, const char *name, size_t return sec; } -Elf_Scn *elf_section_by_idx(Elf *elf, GElf_Shdr *shp, int idx) +Elf_Scn *elf_section_by_idx(Elf *elf, GElf_Shdr *shp, int idx, const char **name_out) { Elf_Scn *sec; + size_t str_idx; sec = elf_getscn(elf, idx); if (sec) gelf_getshdr(sec, shp); + + if (name_out) { + if (elf_getshdrstrndx(elf, &str_idx)) + return NULL; + *name_out = elf_strptr(elf, str_idx, shp->sh_name); + } return sec; } diff --git a/dutil.h b/dutil.h index 335a17c..ff78aa6 100644 --- a/dutil.h +++ b/dutil.h @@ -328,7 +328,7 @@ void *zalloc(const size_t size); Elf_Scn *elf_section_by_name(Elf *elf, GElf_Shdr *shp, const char *name, size_t *index); -Elf_Scn *elf_section_by_idx(Elf *elf, GElf_Shdr *shp, int idx); +Elf_Scn *elf_section_by_idx(Elf *elf, GElf_Shdr *shp, int idx, const char **name_out); #ifndef SHT_GNU_ATTRIBUTES /* Just a way to check if we're using an old elfutils version */ -- 2.43.5 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves 1/4] dutil: return ELF section name when looked up by index 2024-09-12 19:08 ` [PATCH dwarves 1/4] dutil: return ELF section name when looked up by index Stephen Brennan @ 2024-09-13 13:26 ` Alan Maguire 2024-09-13 17:06 ` Stephen Brennan 0 siblings, 1 reply; 14+ messages in thread From: Alan Maguire @ 2024-09-13 13:26 UTC (permalink / raw) To: Stephen Brennan, Arnaldo Carvalho de Melo; +Cc: dwarves, linux-debuggers On 12/09/2024 20:08, Stephen Brennan wrote: nit; needs a commit message, e.g. ELF section names will be used in DATASEC encoding; as well as getting the section, optionally retrieve the name. > Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> I think one small thing needs fixing below, but Reviewed-by: Alan Maguire <alan.maguire@oracle.com> > --- > dutil.c | 9 ++++++++- > dutil.h | 2 +- > 2 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/dutil.c b/dutil.c > index 97c4474..4e83d59 100644 > --- a/dutil.c > +++ b/dutil.c > @@ -207,13 +207,20 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Shdr *shp, const char *name, size_t > return sec; > } > > -Elf_Scn *elf_section_by_idx(Elf *elf, GElf_Shdr *shp, int idx) > +Elf_Scn *elf_section_by_idx(Elf *elf, GElf_Shdr *shp, int idx, const char **name_out) > { > Elf_Scn *sec; > + size_t str_idx; > > sec = elf_getscn(elf, idx); > if (sec) > gelf_getshdr(sec, shp); > + > + if (name_out) { nit; before we were directly returning sec, so if it was NULL that was for the caller to deal with; now though we're driving on assuming it was non-NULL here. So I'd suggest changing the above to be something like sec = elf_getscn(elf, idx); if (!sec) return NULL; if (!gelf_getshhdr(sec, shp)) return NULL; if (name_out) { ... > + if (elf_getshdrstrndx(elf, &str_idx)) > + return NULL; > + *name_out = elf_strptr(elf, str_idx, shp->sh_name); > + } > return sec; > } > > diff --git a/dutil.h b/dutil.h > index 335a17c..ff78aa6 100644 > --- a/dutil.h > +++ b/dutil.h > @@ -328,7 +328,7 @@ void *zalloc(const size_t size); > > Elf_Scn *elf_section_by_name(Elf *elf, GElf_Shdr *shp, const char *name, size_t *index); > > -Elf_Scn *elf_section_by_idx(Elf *elf, GElf_Shdr *shp, int idx); > +Elf_Scn *elf_section_by_idx(Elf *elf, GElf_Shdr *shp, int idx, const char **name_out); > > #ifndef SHT_GNU_ATTRIBUTES > /* Just a way to check if we're using an old elfutils version */ ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves 1/4] dutil: return ELF section name when looked up by index 2024-09-13 13:26 ` Alan Maguire @ 2024-09-13 17:06 ` Stephen Brennan 0 siblings, 0 replies; 14+ messages in thread From: Stephen Brennan @ 2024-09-13 17:06 UTC (permalink / raw) To: Alan Maguire, Arnaldo Carvalho de Melo; +Cc: dwarves, linux-debuggers Alan Maguire <alan.maguire@oracle.com> writes: > On 12/09/2024 20:08, Stephen Brennan wrote: > > nit; needs a commit message, e.g. ELF section names will be used in > DATASEC encoding; as well as getting the section, optionally retrieve > the name. > >> Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> > > I think one small thing needs fixing below, but > > Reviewed-by: Alan Maguire <alan.maguire@oracle.com> >> --- >> dutil.c | 9 ++++++++- >> dutil.h | 2 +- >> 2 files changed, 9 insertions(+), 2 deletions(-) >> >> diff --git a/dutil.c b/dutil.c >> index 97c4474..4e83d59 100644 >> --- a/dutil.c >> +++ b/dutil.c >> @@ -207,13 +207,20 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Shdr *shp, const char *name, size_t >> return sec; >> } >> >> -Elf_Scn *elf_section_by_idx(Elf *elf, GElf_Shdr *shp, int idx) >> +Elf_Scn *elf_section_by_idx(Elf *elf, GElf_Shdr *shp, int idx, const char **name_out) >> { >> Elf_Scn *sec; >> + size_t str_idx; >> >> sec = elf_getscn(elf, idx); >> if (sec) >> gelf_getshdr(sec, shp); >> + >> + if (name_out) { > > nit; before we were directly returning sec, so if it was NULL that was > for the caller to deal with; now though we're driving on assuming it was > non-NULL here. So I'd suggest changing the above to be something like > > sec = elf_getscn(elf, idx); > if (!sec) > return NULL; > if (!gelf_getshhdr(sec, shp)) > return NULL; > if (name_out) { > D'oh! You're exactly right. Thanks! >> + if (elf_getshdrstrndx(elf, &str_idx)) >> + return NULL; >> + *name_out = elf_strptr(elf, str_idx, shp->sh_name); >> + } >> return sec; >> } >> >> diff --git a/dutil.h b/dutil.h >> index 335a17c..ff78aa6 100644 >> --- a/dutil.h >> +++ b/dutil.h >> @@ -328,7 +328,7 @@ void *zalloc(const size_t size); >> >> Elf_Scn *elf_section_by_name(Elf *elf, GElf_Shdr *shp, const char *name, size_t *index); >> >> -Elf_Scn *elf_section_by_idx(Elf *elf, GElf_Shdr *shp, int idx); >> +Elf_Scn *elf_section_by_idx(Elf *elf, GElf_Shdr *shp, int idx, const char **name_out); >> >> #ifndef SHT_GNU_ATTRIBUTES >> /* Just a way to check if we're using an old elfutils version */ ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH dwarves 2/4] dwarf_loader: add "artificial" and "top_level" variable flags 2024-09-12 19:08 [PATCH dwarves 0/4] Emit global variables in BTF Stephen Brennan 2024-09-12 19:08 ` [PATCH dwarves 1/4] dutil: return ELF section name when looked up by index Stephen Brennan @ 2024-09-12 19:08 ` Stephen Brennan 2024-09-13 13:35 ` Alan Maguire 2024-09-12 19:08 ` [PATCH dwarves 3/4] btf_encoder: cache all ELF section info Stephen Brennan 2024-09-12 19:08 ` [PATCH dwarves 4/4] btf_encoder: add global_var feature to encode globals Stephen Brennan 3 siblings, 1 reply; 14+ messages in thread From: Stephen Brennan @ 2024-09-12 19:08 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: dwarves, linux-debuggers, Stephen Brennan, Alan Maguire The "artificial" flag corresponds directly to DW_AT_artificial, which indicates a compiler-generated variable (e.g. __func__) which shouldn't be included in the output. The "top_level" flag is intended to be a better proxy for global scoped variables. Currently, the DWARF loader examines the DWARF location expression, and if the location is found to be at a constant memory address (not stack, register, etc), then the variable is assumed to be globally scoped. However, this includes a variety of variables that aren't truly globally scoped: most commonly, static variables in functions definitions. Their locations may be static, but they're not globally accessible in any useful way. These flags will be used by the BTF encoder to select global variables. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> --- dwarf_loader.c | 12 +++++++----- dwarves.h | 2 ++ 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/dwarf_loader.c b/dwarf_loader.c index 065ed4d..d162214 100644 --- a/dwarf_loader.c +++ b/dwarf_loader.c @@ -730,7 +730,7 @@ const char *variable__scope_str(const struct variable *var) return "unknown"; } -static struct variable *variable__new(Dwarf_Die *die, struct cu *cu, struct conf_load *conf) +static struct variable *variable__new(Dwarf_Die *die, struct cu *cu, struct conf_load *conf, int top_level) { bool has_specification = dwarf_hasattr(die, DW_AT_specification); struct variable *var = tag__alloc(cu, sizeof(*var)); @@ -743,6 +743,8 @@ static struct variable *variable__new(Dwarf_Die *die, struct cu *cu, struct conf /* non-defining declaration of an object */ var->declaration = dwarf_hasattr(die, DW_AT_declaration); var->has_specification = has_specification; + var->artificial = dwarf_hasattr(die, DW_AT_artificial); + var->top_level = top_level; var->scope = VSCOPE_UNKNOWN; INIT_LIST_HEAD(&var->annots); var->ip.addr = 0; @@ -1767,9 +1769,9 @@ static struct tag *die__create_new_label(Dwarf_Die *die, return &label->ip.tag; } -static struct tag *die__create_new_variable(Dwarf_Die *die, struct cu *cu, struct conf_load *conf) +static struct tag *die__create_new_variable(Dwarf_Die *die, struct cu *cu, struct conf_load *conf, int top_level) { - struct variable *var = variable__new(die, cu, conf); + struct variable *var = variable__new(die, cu, conf, top_level); if (var == NULL || add_child_llvm_annotations(die, -1, conf, &var->annots)) return NULL; @@ -2243,7 +2245,7 @@ static int die__process_function(Dwarf_Die *die, struct ftype *ftype, tag = die__create_new_parameter(die, ftype, lexblock, cu, conf, param_idx++); break; case DW_TAG_variable: - tag = die__create_new_variable(die, cu, conf); + tag = die__create_new_variable(die, cu, conf, 0); if (tag == NULL) goto out_enomem; lexblock__add_variable(lexblock, tag__variable(tag)); @@ -2367,7 +2369,7 @@ static struct tag *__die__process_tag(Dwarf_Die *die, struct cu *cu, case DW_TAG_union_type: tag = die__create_new_union(die, cu, conf); break; case DW_TAG_variable: - tag = die__create_new_variable(die, cu, conf); break; + tag = die__create_new_variable(die, cu, conf, top_level); break; case DW_TAG_constant: // First seen in a Go CU tag = die__create_new_constant(die, cu, conf); break; default: diff --git a/dwarves.h b/dwarves.h index f2d3988..0fede91 100644 --- a/dwarves.h +++ b/dwarves.h @@ -848,6 +848,8 @@ struct variable { uint8_t external:1; uint8_t declaration:1; uint8_t has_specification:1; + uint8_t artificial:1; + uint8_t top_level:1; enum vscope scope; struct location location; struct hlist_node tool_hnode; -- 2.43.5 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves 2/4] dwarf_loader: add "artificial" and "top_level" variable flags 2024-09-12 19:08 ` [PATCH dwarves 2/4] dwarf_loader: add "artificial" and "top_level" variable flags Stephen Brennan @ 2024-09-13 13:35 ` Alan Maguire 2024-09-13 17:16 ` Stephen Brennan 0 siblings, 1 reply; 14+ messages in thread From: Alan Maguire @ 2024-09-13 13:35 UTC (permalink / raw) To: Stephen Brennan, Arnaldo Carvalho de Melo; +Cc: dwarves, linux-debuggers On 12/09/2024 20:08, Stephen Brennan wrote: > The "artificial" flag corresponds directly to DW_AT_artificial, which > indicates a compiler-generated variable (e.g. __func__) which shouldn't > be included in the output. > Nice, didn't know about this! Great to have such a clear criterion for filtering. > The "top_level" flag is intended to be a better proxy for global scoped > variables. Currently, the DWARF loader examines the DWARF location Looking at this, it appears that top_level means it is a top-level compilation unit tag, i.e. not associated with a subroutine tag (I think?). That's kind of implicit in your explanation so I think it would be helpful for the log to describe what it is as well as how you're using it. > expression, and if the location is found to be at a constant memory > address (not stack, register, etc), then the variable is assumed to be > globally scoped. However, this includes a variety of variables that > aren't truly globally scoped: most commonly, static variables in > functions definitions. Their locations may be static, but they're not > globally accessible in any useful way. > > These flags will be used by the BTF encoder to select global variables. > > Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> > --- > dwarf_loader.c | 12 +++++++----- > dwarves.h | 2 ++ > 2 files changed, 9 insertions(+), 5 deletions(-) > > diff --git a/dwarf_loader.c b/dwarf_loader.c > index 065ed4d..d162214 100644 > --- a/dwarf_loader.c > +++ b/dwarf_loader.c > @@ -730,7 +730,7 @@ const char *variable__scope_str(const struct variable *var) > return "unknown"; > } > > -static struct variable *variable__new(Dwarf_Die *die, struct cu *cu, struct conf_load *conf) > +static struct variable *variable__new(Dwarf_Die *die, struct cu *cu, struct conf_load *conf, int top_level) > { > bool has_specification = dwarf_hasattr(die, DW_AT_specification); > struct variable *var = tag__alloc(cu, sizeof(*var)); > @@ -743,6 +743,8 @@ static struct variable *variable__new(Dwarf_Die *die, struct cu *cu, struct conf > /* non-defining declaration of an object */ > var->declaration = dwarf_hasattr(die, DW_AT_declaration); > var->has_specification = has_specification; > + var->artificial = dwarf_hasattr(die, DW_AT_artificial); > + var->top_level = top_level; > var->scope = VSCOPE_UNKNOWN; > INIT_LIST_HEAD(&var->annots); > var->ip.addr = 0; > @@ -1767,9 +1769,9 @@ static struct tag *die__create_new_label(Dwarf_Die *die, > return &label->ip.tag; > } > > -static struct tag *die__create_new_variable(Dwarf_Die *die, struct cu *cu, struct conf_load *conf) > +static struct tag *die__create_new_variable(Dwarf_Die *die, struct cu *cu, struct conf_load *conf, int top_level) > { > - struct variable *var = variable__new(die, cu, conf); > + struct variable *var = variable__new(die, cu, conf, top_level); > > if (var == NULL || add_child_llvm_annotations(die, -1, conf, &var->annots)) > return NULL; > @@ -2243,7 +2245,7 @@ static int die__process_function(Dwarf_Die *die, struct ftype *ftype, > tag = die__create_new_parameter(die, ftype, lexblock, cu, conf, param_idx++); > break; > case DW_TAG_variable: > - tag = die__create_new_variable(die, cu, conf); > + tag = die__create_new_variable(die, cu, conf, 0); > if (tag == NULL) > goto out_enomem; > lexblock__add_variable(lexblock, tag__variable(tag)); > @@ -2367,7 +2369,7 @@ static struct tag *__die__process_tag(Dwarf_Die *die, struct cu *cu, > case DW_TAG_union_type: > tag = die__create_new_union(die, cu, conf); break; > case DW_TAG_variable: > - tag = die__create_new_variable(die, cu, conf); break; > + tag = die__create_new_variable(die, cu, conf, top_level); break; > case DW_TAG_constant: // First seen in a Go CU > tag = die__create_new_constant(die, cu, conf); break; > default: > diff --git a/dwarves.h b/dwarves.h > index f2d3988..0fede91 100644 > --- a/dwarves.h > +++ b/dwarves.h > @@ -848,6 +848,8 @@ struct variable { > uint8_t external:1; > uint8_t declaration:1; > uint8_t has_specification:1; > + uint8_t artificial:1; > + uint8_t top_level:1; > enum vscope scope; > struct location location; > struct hlist_node tool_hnode; ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves 2/4] dwarf_loader: add "artificial" and "top_level" variable flags 2024-09-13 13:35 ` Alan Maguire @ 2024-09-13 17:16 ` Stephen Brennan 0 siblings, 0 replies; 14+ messages in thread From: Stephen Brennan @ 2024-09-13 17:16 UTC (permalink / raw) To: Alan Maguire, Arnaldo Carvalho de Melo; +Cc: dwarves, linux-debuggers Alan Maguire <alan.maguire@oracle.com> writes: > On 12/09/2024 20:08, Stephen Brennan wrote: >> The "artificial" flag corresponds directly to DW_AT_artificial, which >> indicates a compiler-generated variable (e.g. __func__) which shouldn't >> be included in the output. >> > > Nice, didn't know about this! Great to have such a clear criterion for > filtering. I've been finding llvm-dwarfdump invaluable for learning these things. (Aside: thanks to Omar Sandoval to introducing it to me as I've tried to wrap my head around some small parts of DWARF.) I discovered this attribute when exploring those __func__ variables: $ llvm-dwarfdump vmlinux --name=__func__ | head -n 12 vmlinux: file format elf64-x86-64 0x0001bfd3: DW_TAG_variable DW_AT_name ("__func__") DW_AT_type (0x0001c0f0 "const const char[18]") DW_AT_artificial (true) 0x0001c148: DW_TAG_variable DW_AT_name ("__func__") DW_AT_type (0x0001c552 "const const char[12]") DW_AT_artificial (true) Then it was just a quick ctrl-F through the specification to fill in the gaps. >> The "top_level" flag is intended to be a better proxy for global scoped >> variables. Currently, the DWARF loader examines the DWARF location > > Looking at this, it appears that top_level means it is a top-level > compilation unit tag, i.e. not associated with a subroutine tag (I > think?). That's kind of implicit in your explanation so I think it would > be helpful for the log to describe what it is as well as how you're > using it. Yeah, I'll spell it out more explicitly in this commit message. I do add a comment in patch 4 which explains the differences between SCOPE_GLOBAL and top_level (shown below) but something similar should be in the log message for this change too. top_level: indicates that the variable is declared at the top level of the CU, and thus it is globally scoped. artificial: indicates that the variable is a compiler-generated "fake" variable that doesn't appear in the source. scope: set by pahole to indicate the type of storage the variable has. GLOBAL indicates it is stored in static memory (as opposed to a stack variable or register) Some variables are "top_level" but not GLOBAL: e.g. current_stack_pointer, which is a register variable, despite having global CU-declarations. We don't want that, since no code could actually find this variable. Some variables are GLOBAL but not top_level: e.g. function static variables >> expression, and if the location is found to be at a constant memory >> address (not stack, register, etc), then the variable is assumed to be >> globally scoped. However, this includes a variety of variables that >> aren't truly globally scoped: most commonly, static variables in >> functions definitions. Their locations may be static, but they're not >> globally accessible in any useful way. >> >> These flags will be used by the BTF encoder to select global variables. >> >> Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> > > > Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Thanks! Stephen ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH dwarves 3/4] btf_encoder: cache all ELF section info 2024-09-12 19:08 [PATCH dwarves 0/4] Emit global variables in BTF Stephen Brennan 2024-09-12 19:08 ` [PATCH dwarves 1/4] dutil: return ELF section name when looked up by index Stephen Brennan 2024-09-12 19:08 ` [PATCH dwarves 2/4] dwarf_loader: add "artificial" and "top_level" variable flags Stephen Brennan @ 2024-09-12 19:08 ` Stephen Brennan 2024-09-13 15:25 ` Alan Maguire 2024-09-12 19:08 ` [PATCH dwarves 4/4] btf_encoder: add global_var feature to encode globals Stephen Brennan 3 siblings, 1 reply; 14+ messages in thread From: Stephen Brennan @ 2024-09-12 19:08 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: dwarves, linux-debuggers, Stephen Brennan, Alan Maguire To handle outputting all variables generally, we'll need to store more section data. Create a table of ELF sections so we can refer to all the cached data, not just the percpu section. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> --- btf_encoder.c | 50 ++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 36 insertions(+), 14 deletions(-) diff --git a/btf_encoder.c b/btf_encoder.c index 8a2d92e..e3384e5 100644 --- a/btf_encoder.c +++ b/btf_encoder.c @@ -65,12 +65,20 @@ struct elf_function { struct btf_encoder_state state; }; +#define MAX_ELF_SEC_CNT 128 + struct var_info { uint64_t addr; const char *name; uint32_t sz; }; +struct elf_secinfo { + uint64_t addr; + const char *name; + uint64_t sz; +}; + /* * cu: cu being processed. */ @@ -95,13 +103,13 @@ struct btf_encoder { is_rel, gen_distilled_base; uint32_t array_index_id; + struct elf_secinfo secinfo[MAX_ELF_SEC_CNT]; + size_t seccnt; struct { struct var_info *vars; int var_cnt; int allocated; uint32_t shndx; - uint64_t base_addr; - uint64_t sec_sz; } percpu; struct { struct elf_function *entries; @@ -1849,7 +1857,7 @@ static int btf_encoder__collect_percpu_var(struct btf_encoder *encoder, GElf_Sym * ET_EXEC file) we need to subtract the section address. */ if (!encoder->is_rel) - addr -= encoder->percpu.base_addr; + addr -= encoder->secinfo[encoder->percpu.shndx].addr; if (encoder->percpu.var_cnt == encoder->percpu.allocated) { struct var_info *new; @@ -1923,6 +1931,7 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) uint32_t core_id; struct tag *pos; int err = -1; + struct elf_secinfo *pcpu_scn = &encoder->secinfo[encoder->percpu.shndx]; if (encoder->percpu.shndx == 0 || !encoder->symtab) return 0; @@ -1954,9 +1963,9 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) * always contains virtual symbol addresses, so subtract * the section address unconditionally. */ - if (addr < encoder->percpu.base_addr || addr >= encoder->percpu.base_addr + encoder->percpu.sec_sz) + if (addr < pcpu_scn->addr || addr >= pcpu_scn->addr + pcpu_scn->sz) continue; - addr -= encoder->percpu.base_addr; + addr -= pcpu_scn->addr; if (!btf_encoder__percpu_var_exists(encoder, addr, &size, &name)) continue; /* not a per-CPU variable */ @@ -2099,20 +2108,33 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam goto out; } - /* find percpu section's shndx */ + /* index the ELF sections for later lookup */ GElf_Shdr shdr; - Elf_Scn *sec = elf_section_by_name(cu->elf, &shdr, PERCPU_SECTION, NULL); + size_t shndx; + if (elf_getshdrnum(cu->elf, &encoder->seccnt)) + goto out_delete; + if (encoder->seccnt >= MAX_ELF_SEC_CNT) { + fprintf(stderr, "%s: reached limit of ELF sections\n", __func__); + goto out_delete; + } - if (!sec) { - if (encoder->verbose) - printf("%s: '%s' doesn't have '%s' section\n", __func__, cu->filename, PERCPU_SECTION); - } else { - encoder->percpu.shndx = elf_ndxscn(sec); - encoder->percpu.base_addr = shdr.sh_addr; - encoder->percpu.sec_sz = shdr.sh_size; + for (shndx = 0; shndx < encoder->seccnt; shndx++) { + const char *secname = NULL; + Elf_Scn *sec = elf_section_by_idx(cu->elf, &shdr, shndx, &secname); + if (!sec) + goto out_delete; + encoder->secinfo[shndx].addr = shdr.sh_addr; + encoder->secinfo[shndx].sz = shdr.sh_size; + encoder->secinfo[shndx].name = secname; + + if (strcmp(secname, PERCPU_SECTION) == 0) + encoder->percpu.shndx = shndx; } + if (!encoder->percpu.shndx && encoder->verbose) + printf("%s: '%s' doesn't have '%s' section\n", __func__, cu->filename, PERCPU_SECTION); + if (btf_encoder__collect_symbols(encoder, !encoder->skip_encoding_vars)) goto out_delete; -- 2.43.5 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves 3/4] btf_encoder: cache all ELF section info 2024-09-12 19:08 ` [PATCH dwarves 3/4] btf_encoder: cache all ELF section info Stephen Brennan @ 2024-09-13 15:25 ` Alan Maguire 2024-09-13 17:05 ` Stephen Brennan 0 siblings, 1 reply; 14+ messages in thread From: Alan Maguire @ 2024-09-13 15:25 UTC (permalink / raw) To: Stephen Brennan, Arnaldo Carvalho de Melo; +Cc: dwarves, linux-debuggers On 12/09/2024 20:08, Stephen Brennan wrote: > To handle outputting all variables generally, we'll need to store more > section data. Create a table of ELF sections so we can refer to all the > cached data, not just the percpu section. > > Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> > --- > btf_encoder.c | 50 ++++++++++++++++++++++++++++++++++++-------------- > 1 file changed, 36 insertions(+), 14 deletions(-) > > diff --git a/btf_encoder.c b/btf_encoder.c > index 8a2d92e..e3384e5 100644 > --- a/btf_encoder.c > +++ b/btf_encoder.c > @@ -65,12 +65,20 @@ struct elf_function { > struct btf_encoder_state state; > }; > > +#define MAX_ELF_SEC_CNT 128 > + > struct var_info { > uint64_t addr; > const char *name; > uint32_t sz; > }; > > +struct elf_secinfo { > + uint64_t addr; > + const char *name; > + uint64_t sz; > +}; > + One thing we've run into before is hardcoded limits get exceeded on systems, and while it seems unlikely for 128 ELF sections, I think it'd be better to make this dynamic just in case. Also for parallel mode we have N encoders so it might same some memory (which we're pretty profligate with in other areas). I'd suggest using reallocarray_grow() as we do for functions. One thing we may want to tweak is that reallocarray_grow() will grow by 1000 or 1.5 x the current size; starting with 1000 ELF sections is a bit much. We could perhaps tweak reallocarray_grow() to have a min_growth parameter to control this and set it to something smaller for ELF sections, what do you think? > /* > * cu: cu being processed. > */ > @@ -95,13 +103,13 @@ struct btf_encoder { > is_rel, > gen_distilled_base; > uint32_t array_index_id; > + struct elf_secinfo secinfo[MAX_ELF_SEC_CNT]; ...so here we'd have a substructure like this struct { int allocated; int sec_cnt; struct elf_secinfo *sections; } elf_sections; > + size_t seccnt; > struct { > struct var_info *vars; > int var_cnt; > int allocated; > uint32_t shndx; > - uint64_t base_addr; > - uint64_t sec_sz; > } percpu; > struct { > struct elf_function *entries; > @@ -1849,7 +1857,7 @@ static int btf_encoder__collect_percpu_var(struct btf_encoder *encoder, GElf_Sym > * ET_EXEC file) we need to subtract the section address. > */ > if (!encoder->is_rel) > - addr -= encoder->percpu.base_addr; > + addr -= encoder->secinfo[encoder->percpu.shndx].addr; > We will need some of these mechanics later when we encode function addresses; great to have this! > if (encoder->percpu.var_cnt == encoder->percpu.allocated) { > struct var_info *new; > @@ -1923,6 +1931,7 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) > uint32_t core_id; > struct tag *pos; > int err = -1; > + struct elf_secinfo *pcpu_scn = &encoder->secinfo[encoder->percpu.shndx]; > > if (encoder->percpu.shndx == 0 || !encoder->symtab) > return 0; > @@ -1954,9 +1963,9 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) > * always contains virtual symbol addresses, so subtract > * the section address unconditionally. > */ > - if (addr < encoder->percpu.base_addr || addr >= encoder->percpu.base_addr + encoder->percpu.sec_sz) > + if (addr < pcpu_scn->addr || addr >= pcpu_scn->addr + pcpu_scn->sz) > continue; > - addr -= encoder->percpu.base_addr; > + addr -= pcpu_scn->addr; > > if (!btf_encoder__percpu_var_exists(encoder, addr, &size, &name)) > continue; /* not a per-CPU variable */ > @@ -2099,20 +2108,33 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam > goto out; > } > > - /* find percpu section's shndx */ > + /* index the ELF sections for later lookup */ > > GElf_Shdr shdr; > - Elf_Scn *sec = elf_section_by_name(cu->elf, &shdr, PERCPU_SECTION, NULL); > + size_t shndx; > + if (elf_getshdrnum(cu->elf, &encoder->seccnt)) > + goto out_delete; > + if (encoder->seccnt >= MAX_ELF_SEC_CNT) { > + fprintf(stderr, "%s: reached limit of ELF sections\n", __func__); > + goto out_delete; > + } > as above we should just reallocarray_grow() more space. > - if (!sec) { > - if (encoder->verbose) > - printf("%s: '%s' doesn't have '%s' section\n", __func__, cu->filename, PERCPU_SECTION); > - } else { > - encoder->percpu.shndx = elf_ndxscn(sec); > - encoder->percpu.base_addr = shdr.sh_addr; > - encoder->percpu.sec_sz = shdr.sh_size; > + for (shndx = 0; shndx < encoder->seccnt; shndx++) { > + const char *secname = NULL; > + Elf_Scn *sec = elf_section_by_idx(cu->elf, &shdr, shndx, &secname); > + if (!sec) > + goto out_delete; > + encoder->secinfo[shndx].addr = shdr.sh_addr; > + encoder->secinfo[shndx].sz = shdr.sh_size; > + encoder->secinfo[shndx].name = secname; > + > + if (strcmp(secname, PERCPU_SECTION) == 0) > + encoder->percpu.shndx = shndx; > } > > + if (!encoder->percpu.shndx && encoder->verbose) > + printf("%s: '%s' doesn't have '%s' section\n", __func__, cu->filename, PERCPU_SECTION); > + > if (btf_encoder__collect_symbols(encoder, !encoder->skip_encoding_vars)) > goto out_delete; > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves 3/4] btf_encoder: cache all ELF section info 2024-09-13 15:25 ` Alan Maguire @ 2024-09-13 17:05 ` Stephen Brennan 2024-09-15 11:26 ` Alan Maguire 0 siblings, 1 reply; 14+ messages in thread From: Stephen Brennan @ 2024-09-13 17:05 UTC (permalink / raw) To: Alan Maguire, Arnaldo Carvalho de Melo; +Cc: dwarves, linux-debuggers Alan Maguire <alan.maguire@oracle.com> writes: > On 12/09/2024 20:08, Stephen Brennan wrote: >> To handle outputting all variables generally, we'll need to store more >> section data. Create a table of ELF sections so we can refer to all the >> cached data, not just the percpu section. >> >> Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> >> --- >> btf_encoder.c | 50 ++++++++++++++++++++++++++++++++++++-------------- >> 1 file changed, 36 insertions(+), 14 deletions(-) >> >> diff --git a/btf_encoder.c b/btf_encoder.c >> index 8a2d92e..e3384e5 100644 >> --- a/btf_encoder.c >> +++ b/btf_encoder.c >> @@ -65,12 +65,20 @@ struct elf_function { >> struct btf_encoder_state state; >> }; >> >> +#define MAX_ELF_SEC_CNT 128 >> + >> struct var_info { >> uint64_t addr; >> const char *name; >> uint32_t sz; >> }; >> >> +struct elf_secinfo { >> + uint64_t addr; >> + const char *name; >> + uint64_t sz; >> +}; >> + > > One thing we've run into before is hardcoded limits get exceeded on > systems, and while it seems unlikely for 128 ELF sections, I think it'd > be better to make this dynamic just in case. Also for parallel mode we > have N encoders so it might same some memory (which we're pretty > profligate with in other areas). I'd suggest using reallocarray_grow() > as we do for functions. One thing we may want to tweak is that > reallocarray_grow() will grow by 1000 or 1.5 x the current size; > starting with 1000 ELF sections is a bit much. We could perhaps > tweak reallocarray_grow() to have a min_growth parameter to control this > and set it to something smaller for ELF sections, what do you think? I agree that the static allocation doesn't make a ton of sense. To be quite honest, I was lazy during the initial implementation and intended to come back around to make this dynamically allocated. I think the answer is to simply use elf_getshdrnum and directly allocate the array to the size we need. I'll definitely do that in the next version. >> /* >> * cu: cu being processed. >> */ >> @@ -95,13 +103,13 @@ struct btf_encoder { >> is_rel, >> gen_distilled_base; >> uint32_t array_index_id; >> + struct elf_secinfo secinfo[MAX_ELF_SEC_CNT]; > > ...so here we'd have a substructure like this > > struct { > int allocated; > int sec_cnt; > struct elf_secinfo *sections; > } elf_sections; > >> + size_t seccnt; >> struct { >> struct var_info *vars; >> int var_cnt; >> int allocated; >> uint32_t shndx; >> - uint64_t base_addr; >> - uint64_t sec_sz; >> } percpu; >> struct { >> struct elf_function *entries; >> @@ -1849,7 +1857,7 @@ static int btf_encoder__collect_percpu_var(struct btf_encoder *encoder, GElf_Sym >> * ET_EXEC file) we need to subtract the section address. >> */ >> if (!encoder->is_rel) >> - addr -= encoder->percpu.base_addr; >> + addr -= encoder->secinfo[encoder->percpu.shndx].addr; >> > We will need some of these mechanics later when we encode function > addresses; great to have this! Ironically enough, this code (and especially, the is_rel field referred to here) gets deleted in patch 4. Maybe I should have left is_rel if you think we're going to need function addresses from the symbol table. But patch 4 fully embraces simply using DWARF to get the variable addresses, and it stops using the ELF symbol table for variables. The DWARF addresses are always absolute. I couldn't really find a good reason for using the symbol table, except maybe as a decent way to filter out bogus variables that didn't actually become real data. For example, the x86_64 percpu region starts at address 0, and many bogus DWARF variables have address 0, so making sure the variable matched the symbol at address 0 was a good way to filter bogus variables. I didn't think it was a big deal that I removed this behavior, because with improved variable filtering, this code continues to produce nearly identical percpu DATASECs, with the only difference being that we now (correctly) include the fixed_percpu_data variable for x86_64, which was omitted before. So there's no regression in functionality. Stephen >> if (encoder->percpu.var_cnt == encoder->percpu.allocated) { >> struct var_info *new; >> @@ -1923,6 +1931,7 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) >> uint32_t core_id; >> struct tag *pos; >> int err = -1; >> + struct elf_secinfo *pcpu_scn = &encoder->secinfo[encoder->percpu.shndx]; >> >> if (encoder->percpu.shndx == 0 || !encoder->symtab) >> return 0; >> @@ -1954,9 +1963,9 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) >> * always contains virtual symbol addresses, so subtract >> * the section address unconditionally. >> */ >> - if (addr < encoder->percpu.base_addr || addr >= encoder->percpu.base_addr + encoder->percpu.sec_sz) >> + if (addr < pcpu_scn->addr || addr >= pcpu_scn->addr + pcpu_scn->sz) >> continue; >> - addr -= encoder->percpu.base_addr; >> + addr -= pcpu_scn->addr; >> >> if (!btf_encoder__percpu_var_exists(encoder, addr, &size, &name)) >> continue; /* not a per-CPU variable */ >> @@ -2099,20 +2108,33 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam >> goto out; >> } >> >> - /* find percpu section's shndx */ >> + /* index the ELF sections for later lookup */ >> >> GElf_Shdr shdr; >> - Elf_Scn *sec = elf_section_by_name(cu->elf, &shdr, PERCPU_SECTION, NULL); >> + size_t shndx; >> + if (elf_getshdrnum(cu->elf, &encoder->seccnt)) >> + goto out_delete; >> + if (encoder->seccnt >= MAX_ELF_SEC_CNT) { >> + fprintf(stderr, "%s: reached limit of ELF sections\n", __func__); >> + goto out_delete; >> + } >> > as above we should just reallocarray_grow() more space. > >> - if (!sec) { >> - if (encoder->verbose) >> - printf("%s: '%s' doesn't have '%s' section\n", __func__, cu->filename, PERCPU_SECTION); >> - } else { >> - encoder->percpu.shndx = elf_ndxscn(sec); >> - encoder->percpu.base_addr = shdr.sh_addr; >> - encoder->percpu.sec_sz = shdr.sh_size; >> + for (shndx = 0; shndx < encoder->seccnt; shndx++) { >> + const char *secname = NULL; >> + Elf_Scn *sec = elf_section_by_idx(cu->elf, &shdr, shndx, &secname); >> + if (!sec) >> + goto out_delete; >> + encoder->secinfo[shndx].addr = shdr.sh_addr; >> + encoder->secinfo[shndx].sz = shdr.sh_size; >> + encoder->secinfo[shndx].name = secname; >> + >> + if (strcmp(secname, PERCPU_SECTION) == 0) >> + encoder->percpu.shndx = shndx; >> } >> >> + if (!encoder->percpu.shndx && encoder->verbose) >> + printf("%s: '%s' doesn't have '%s' section\n", __func__, cu->filename, PERCPU_SECTION); >> + >> if (btf_encoder__collect_symbols(encoder, !encoder->skip_encoding_vars)) >> goto out_delete; >> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves 3/4] btf_encoder: cache all ELF section info 2024-09-13 17:05 ` Stephen Brennan @ 2024-09-15 11:26 ` Alan Maguire 0 siblings, 0 replies; 14+ messages in thread From: Alan Maguire @ 2024-09-15 11:26 UTC (permalink / raw) To: Stephen Brennan, Arnaldo Carvalho de Melo; +Cc: dwarves, linux-debuggers On 13/09/2024 18:05, Stephen Brennan wrote: > Alan Maguire <alan.maguire@oracle.com> writes: >> On 12/09/2024 20:08, Stephen Brennan wrote: >>> To handle outputting all variables generally, we'll need to store more >>> section data. Create a table of ELF sections so we can refer to all the >>> cached data, not just the percpu section. >>> >>> Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> >>> --- >>> btf_encoder.c | 50 ++++++++++++++++++++++++++++++++++++-------------- >>> 1 file changed, 36 insertions(+), 14 deletions(-) >>> >>> diff --git a/btf_encoder.c b/btf_encoder.c >>> index 8a2d92e..e3384e5 100644 >>> --- a/btf_encoder.c >>> +++ b/btf_encoder.c >>> @@ -65,12 +65,20 @@ struct elf_function { >>> struct btf_encoder_state state; >>> }; >>> >>> +#define MAX_ELF_SEC_CNT 128 >>> + >>> struct var_info { >>> uint64_t addr; >>> const char *name; >>> uint32_t sz; >>> }; >>> >>> +struct elf_secinfo { >>> + uint64_t addr; >>> + const char *name; >>> + uint64_t sz; >>> +}; >>> + >> >> One thing we've run into before is hardcoded limits get exceeded on >> systems, and while it seems unlikely for 128 ELF sections, I think it'd >> be better to make this dynamic just in case. Also for parallel mode we >> have N encoders so it might same some memory (which we're pretty >> profligate with in other areas). I'd suggest using reallocarray_grow() >> as we do for functions. One thing we may want to tweak is that >> reallocarray_grow() will grow by 1000 or 1.5 x the current size; >> starting with 1000 ELF sections is a bit much. We could perhaps >> tweak reallocarray_grow() to have a min_growth parameter to control this >> and set it to something smaller for ELF sections, what do you think? > > I agree that the static allocation doesn't make a ton of sense. To be > quite honest, I was lazy during the initial implementation and intended > to come back around to make this dynamically allocated. > > I think the answer is to simply use elf_getshdrnum and directly allocate > the array to the size we need. I'll definitely do that in the next > version. > You're right; since we know in advance how many sections we need we can just calloc() to get zeroed memory. >>> /* >>> * cu: cu being processed. >>> */ >>> @@ -95,13 +103,13 @@ struct btf_encoder { >>> is_rel, >>> gen_distilled_base; >>> uint32_t array_index_id; >>> + struct elf_secinfo secinfo[MAX_ELF_SEC_CNT]; >> >> ...so here we'd have a substructure like this >> >> struct { >> int allocated; >> int sec_cnt; >> struct elf_secinfo *sections; >> } elf_sections; >> >>> + size_t seccnt; >>> struct { >>> struct var_info *vars; >>> int var_cnt; >>> int allocated; >>> uint32_t shndx; >>> - uint64_t base_addr; >>> - uint64_t sec_sz; >>> } percpu; >>> struct { >>> struct elf_function *entries; >>> @@ -1849,7 +1857,7 @@ static int btf_encoder__collect_percpu_var(struct btf_encoder *encoder, GElf_Sym >>> * ET_EXEC file) we need to subtract the section address. >>> */ >>> if (!encoder->is_rel) >>> - addr -= encoder->percpu.base_addr; >>> + addr -= encoder->secinfo[encoder->percpu.shndx].addr; >>> >> We will need some of these mechanics later when we encode function >> addresses; great to have this! > > Ironically enough, this code (and especially, the is_rel field referred > to here) gets deleted in patch 4. Maybe I should have left is_rel if you > think we're going to need function addresses from the symbol table. > No need to leave it in; if there's reason to delete it for your changes, please go ahead. We can resurrect code for functions if needed, but no need for you to worry about potential future changes like that! > But patch 4 fully embraces simply using DWARF to get the variable > addresses, and it stops using the ELF symbol table for variables. The > DWARF addresses are always absolute. I couldn't really find a good > reason for using the symbol table, except maybe as a decent way to > filter out bogus variables that didn't actually become real data. For > example, the x86_64 percpu region starts at address 0, and many bogus > DWARF variables have address 0, so making sure the variable matched the > symbol at address 0 was a good way to filter bogus variables. > > I didn't think it was a big deal that I removed this behavior, because > with improved variable filtering, this code continues to produce nearly > identical percpu DATASECs, with the only difference being that we now > (correctly) include the fixed_percpu_data variable for x86_64, which was > omitted before. So there's no regression in functionality. > FWIW I think it's much better the way you're doing things; the situation with functions is unfortunately messier, and we need to do comparisons across function declarations to probe for inconsistencies. I'm hoping we can start to move in this direction with the function code too though, especially if we add address info for functions too (which we're hoping to). Thanks! Alan > Stephen > >>> if (encoder->percpu.var_cnt == encoder->percpu.allocated) { >>> struct var_info *new; >>> @@ -1923,6 +1931,7 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) >>> uint32_t core_id; >>> struct tag *pos; >>> int err = -1; >>> + struct elf_secinfo *pcpu_scn = &encoder->secinfo[encoder->percpu.shndx]; >>> >>> if (encoder->percpu.shndx == 0 || !encoder->symtab) >>> return 0; >>> @@ -1954,9 +1963,9 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) >>> * always contains virtual symbol addresses, so subtract >>> * the section address unconditionally. >>> */ >>> - if (addr < encoder->percpu.base_addr || addr >= encoder->percpu.base_addr + encoder->percpu.sec_sz) >>> + if (addr < pcpu_scn->addr || addr >= pcpu_scn->addr + pcpu_scn->sz) >>> continue; >>> - addr -= encoder->percpu.base_addr; >>> + addr -= pcpu_scn->addr; >>> >>> if (!btf_encoder__percpu_var_exists(encoder, addr, &size, &name)) >>> continue; /* not a per-CPU variable */ >>> @@ -2099,20 +2108,33 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam >>> goto out; >>> } >>> >>> - /* find percpu section's shndx */ >>> + /* index the ELF sections for later lookup */ >>> >>> GElf_Shdr shdr; >>> - Elf_Scn *sec = elf_section_by_name(cu->elf, &shdr, PERCPU_SECTION, NULL); >>> + size_t shndx; >>> + if (elf_getshdrnum(cu->elf, &encoder->seccnt)) >>> + goto out_delete; >>> + if (encoder->seccnt >= MAX_ELF_SEC_CNT) { >>> + fprintf(stderr, "%s: reached limit of ELF sections\n", __func__); >>> + goto out_delete; >>> + } >>> >> as above we should just reallocarray_grow() more space. >> >>> - if (!sec) { >>> - if (encoder->verbose) >>> - printf("%s: '%s' doesn't have '%s' section\n", __func__, cu->filename, PERCPU_SECTION); >>> - } else { >>> - encoder->percpu.shndx = elf_ndxscn(sec); >>> - encoder->percpu.base_addr = shdr.sh_addr; >>> - encoder->percpu.sec_sz = shdr.sh_size; >>> + for (shndx = 0; shndx < encoder->seccnt; shndx++) { >>> + const char *secname = NULL; >>> + Elf_Scn *sec = elf_section_by_idx(cu->elf, &shdr, shndx, &secname); >>> + if (!sec) >>> + goto out_delete; >>> + encoder->secinfo[shndx].addr = shdr.sh_addr; >>> + encoder->secinfo[shndx].sz = shdr.sh_size; >>> + encoder->secinfo[shndx].name = secname; >>> + >>> + if (strcmp(secname, PERCPU_SECTION) == 0) >>> + encoder->percpu.shndx = shndx; >>> } >>> >>> + if (!encoder->percpu.shndx && encoder->verbose) >>> + printf("%s: '%s' doesn't have '%s' section\n", __func__, cu->filename, PERCPU_SECTION); >>> + >>> if (btf_encoder__collect_symbols(encoder, !encoder->skip_encoding_vars)) >>> goto out_delete; >>> ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH dwarves 4/4] btf_encoder: add global_var feature to encode globals 2024-09-12 19:08 [PATCH dwarves 0/4] Emit global variables in BTF Stephen Brennan ` (2 preceding siblings ...) 2024-09-12 19:08 ` [PATCH dwarves 3/4] btf_encoder: cache all ELF section info Stephen Brennan @ 2024-09-12 19:08 ` Stephen Brennan 2024-09-15 11:49 ` Alan Maguire 3 siblings, 1 reply; 14+ messages in thread From: Stephen Brennan @ 2024-09-12 19:08 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: dwarves, linux-debuggers, Stephen Brennan, Alan Maguire Currently the "var" feature only encodes percpu variables. Add the ability to encode all global variables. This also drops the use of the symbol table to find variable names and compare them against DWARF names. We simply rely on the DWARF information to give us all the variables. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> --- btf_encoder.c | 345 ++++++++++++++++++++------------------------- btf_encoder.h | 8 ++ dwarves.h | 1 + man-pages/pahole.1 | 8 +- pahole.c | 11 +- 5 files changed, 181 insertions(+), 192 deletions(-) diff --git a/btf_encoder.c b/btf_encoder.c index e3384e5..1872a35 100644 --- a/btf_encoder.c +++ b/btf_encoder.c @@ -67,16 +67,13 @@ struct elf_function { #define MAX_ELF_SEC_CNT 128 -struct var_info { - uint64_t addr; - const char *name; - uint32_t sz; -}; - struct elf_secinfo { uint64_t addr; const char *name; uint64_t sz; + bool include; + uint32_t type; + struct gobuffer secinfo; }; /* @@ -86,31 +83,24 @@ struct btf_encoder { struct list_head node; struct btf *btf; struct cu *cu; - struct gobuffer percpu_secinfo; const char *source_filename; const char *filename; struct elf_symtab *symtab; uint32_t type_id_off; bool has_index_type, need_index_type, - skip_encoding_vars, raw_output, verbose, force, gen_floats, skip_encoding_decl_tag, tag_kfuncs, - is_rel, gen_distilled_base; uint32_t array_index_id; struct elf_secinfo secinfo[MAX_ELF_SEC_CNT]; size_t seccnt; - struct { - struct var_info *vars; - int var_cnt; - int allocated; - uint32_t shndx; - } percpu; + int encode_vars; + uint32_t percpu_shndx; struct { struct elf_function *entries; int allocated; @@ -737,46 +727,56 @@ static int32_t btf_encoder__add_var(struct btf_encoder *encoder, uint32_t type, return id; } -static int32_t btf_encoder__add_var_secinfo(struct btf_encoder *encoder, uint32_t type, - uint32_t offset, uint32_t size) +static int32_t btf_encoder__add_var_secinfo(struct btf_encoder *encoder, int shndx, + uint32_t type, unsigned long addr, uint32_t size) { struct btf_var_secinfo si = { .type = type, - .offset = offset, + .offset = addr, .size = size, }; - return gobuffer__add(&encoder->percpu_secinfo, &si, sizeof(si)); + return gobuffer__add(&encoder->secinfo[shndx].secinfo, &si, sizeof(si)); } int32_t btf_encoder__add_encoder(struct btf_encoder *encoder, struct btf_encoder *other) { - struct gobuffer *var_secinfo_buf = &other->percpu_secinfo; - size_t sz = gobuffer__size(var_secinfo_buf); - uint16_t nr_var_secinfo = sz / sizeof(struct btf_var_secinfo); - uint32_t type_id; - uint32_t next_type_id = btf__type_cnt(encoder->btf); - int32_t i, id; - struct btf_var_secinfo *vsi; - + int shndx; if (encoder == other) return 0; btf_encoder__add_saved_funcs(other); - for (i = 0; i < nr_var_secinfo; i++) { - vsi = (struct btf_var_secinfo *)var_secinfo_buf->entries + i; - type_id = next_type_id + vsi->type - 1; /* Type ID starts from 1 */ - id = btf_encoder__add_var_secinfo(encoder, type_id, vsi->offset, vsi->size); - if (id < 0) - return id; + for (shndx = 0; shndx < other->seccnt; shndx++) { + struct gobuffer *var_secinfo_buf = &other->secinfo[shndx].secinfo; + size_t sz = gobuffer__size(var_secinfo_buf); + uint16_t nr_var_secinfo = sz / sizeof(struct btf_var_secinfo); + uint32_t type_id; + uint32_t next_type_id = btf__type_cnt(encoder->btf); + int32_t i, id; + struct btf_var_secinfo *vsi; + + if (strcmp(encoder->secinfo[shndx].name, other->secinfo[shndx].name)) { + fprintf(stderr, "mismatched ELF sections at index %d: \"%s\", \"%s\"\n", + shndx, encoder->secinfo[shndx].name, other->secinfo[shndx].name); + return -1; + } + + for (i = 0; i < nr_var_secinfo; i++) { + vsi = (struct btf_var_secinfo *)var_secinfo_buf->entries + i; + type_id = next_type_id + vsi->type - 1; /* Type ID starts from 1 */ + id = btf_encoder__add_var_secinfo(encoder, shndx, type_id, vsi->offset, vsi->size); + if (id < 0) + return id; + } } return btf__add_btf(encoder->btf, other->btf); } -static int32_t btf_encoder__add_datasec(struct btf_encoder *encoder, const char *section_name) +static int32_t btf_encoder__add_datasec(struct btf_encoder *encoder, int shndx) { - struct gobuffer *var_secinfo_buf = &encoder->percpu_secinfo; + struct gobuffer *var_secinfo_buf = &encoder->secinfo[shndx].secinfo; + const char *section_name = encoder->secinfo[shndx].name; struct btf *btf = encoder->btf; size_t sz = gobuffer__size(var_secinfo_buf); uint16_t nr_var_secinfo = sz / sizeof(struct btf_var_secinfo); @@ -1743,13 +1743,14 @@ out: int btf_encoder__encode(struct btf_encoder *encoder) { bool should_tag_kfuncs; - int err; + int err, shndx; /* for single-threaded case, saved funcs are added here */ btf_encoder__add_saved_funcs(encoder); - if (gobuffer__size(&encoder->percpu_secinfo) != 0) - btf_encoder__add_datasec(encoder, PERCPU_SECTION); + for (shndx = 0; shndx < encoder->seccnt; shndx++) + if (gobuffer__size(&encoder->secinfo[shndx].secinfo)) + btf_encoder__add_datasec(encoder, shndx); /* Empty file, nothing to do, so... done! */ if (btf__type_cnt(encoder->btf) == 1) @@ -1799,111 +1800,18 @@ int btf_encoder__encode(struct btf_encoder *encoder) return err; } -static int percpu_var_cmp(const void *_a, const void *_b) -{ - const struct var_info *a = _a; - const struct var_info *b = _b; - - if (a->addr == b->addr) - return 0; - return a->addr < b->addr ? -1 : 1; -} - -static bool btf_encoder__percpu_var_exists(struct btf_encoder *encoder, uint64_t addr, uint32_t *sz, const char **name) -{ - struct var_info key = { .addr = addr }; - const struct var_info *p = bsearch(&key, encoder->percpu.vars, encoder->percpu.var_cnt, - sizeof(encoder->percpu.vars[0]), percpu_var_cmp); - if (!p) - return false; - - *sz = p->sz; - *name = p->name; - return true; -} - -static int btf_encoder__collect_percpu_var(struct btf_encoder *encoder, GElf_Sym *sym, size_t sym_sec_idx) -{ - const char *sym_name; - uint64_t addr; - uint32_t size; - - /* compare a symbol's shndx to determine if it's a percpu variable */ - if (sym_sec_idx != encoder->percpu.shndx) - return 0; - if (elf_sym__type(sym) != STT_OBJECT) - return 0; - - addr = elf_sym__value(sym); - - size = elf_sym__size(sym); - if (!size) - return 0; /* ignore zero-sized symbols */ - - sym_name = elf_sym__name(sym, encoder->symtab); - if (!btf_name_valid(sym_name)) { - dump_invalid_symbol("Found symbol of invalid name when encoding btf", - sym_name, encoder->verbose, encoder->force); - if (encoder->force) - return 0; - return -1; - } - - if (encoder->verbose) - printf("Found per-CPU symbol '%s' at address 0x%" PRIx64 "\n", sym_name, addr); - - /* Make sure addr is section-relative. For kernel modules (which are - * ET_REL files) this is already the case. For vmlinux (which is an - * ET_EXEC file) we need to subtract the section address. - */ - if (!encoder->is_rel) - addr -= encoder->secinfo[encoder->percpu.shndx].addr; - - if (encoder->percpu.var_cnt == encoder->percpu.allocated) { - struct var_info *new; - - new = reallocarray_grow(encoder->percpu.vars, - &encoder->percpu.allocated, - sizeof(*encoder->percpu.vars)); - if (!new) { - fprintf(stderr, "Failed to allocate memory for variables\n"); - return -1; - } - encoder->percpu.vars = new; - } - encoder->percpu.vars[encoder->percpu.var_cnt].addr = addr; - encoder->percpu.vars[encoder->percpu.var_cnt].sz = size; - encoder->percpu.vars[encoder->percpu.var_cnt].name = sym_name; - encoder->percpu.var_cnt++; - - return 0; -} -static int btf_encoder__collect_symbols(struct btf_encoder *encoder, bool collect_percpu_vars) +static int btf_encoder__collect_symbols(struct btf_encoder *encoder) { - Elf32_Word sym_sec_idx; + uint32_t sym_sec_idx; uint32_t core_id; GElf_Sym sym; - /* cache variables' addresses, preparing for searching in symtab. */ - encoder->percpu.var_cnt = 0; - - /* search within symtab for percpu variables */ elf_symtab__for_each_symbol_index(encoder->symtab, core_id, sym, sym_sec_idx) { - if (collect_percpu_vars && btf_encoder__collect_percpu_var(encoder, &sym, sym_sec_idx)) - return -1; if (btf_encoder__collect_function(encoder, &sym)) return -1; } - if (collect_percpu_vars) { - if (encoder->percpu.var_cnt) - qsort(encoder->percpu.vars, encoder->percpu.var_cnt, sizeof(encoder->percpu.vars[0]), percpu_var_cmp); - - if (encoder->verbose) - printf("Found %d per-CPU variables!\n", encoder->percpu.var_cnt); - } - if (encoder->functions.cnt) { qsort(encoder->functions.entries, encoder->functions.cnt, sizeof(encoder->functions.entries[0]), functions_cmp); @@ -1925,75 +1833,126 @@ static bool ftype__has_arg_names(const struct ftype *ftype) return true; } +static int get_elf_section(struct btf_encoder *encoder, unsigned long addr) +{ + for (int i = 1; i < encoder->seccnt; i++) + /* Variables are only present in PROGBITS or NOBITS (.bss) */ + if ((encoder->secinfo[i].type == SHT_PROGBITS || + encoder->secinfo[i].type == SHT_NOBITS) && + encoder->secinfo[i].addr <= addr && + (addr - encoder->secinfo[i].addr) < encoder->secinfo[i].sz) + return i; + return 0; +} + +/* + * Filter out variables / symbol names with common prefixes and no useful + * values. Prefixes should be added sparingly, and it should be objectively + * obvious that they are not useful. + */ +static bool filter_variable_name(const char *name) +{ + static const struct { char *s; size_t len; } skip[] = { + #define X(str) {str, sizeof(str) - 1} + X("__UNIQUE_ID"), + X("__tpstrtab_"), + X("__exitcall_"), + X("__func_stack_frame_non_standard_") + #undef X + }; + int i; + + if (*name != '_') + return false; + + for (i = 0; i < sizeof(skip) / sizeof(skip[0]); i++) + if (strncmp(name, skip[i].s, skip[i].len) == 0) + return true; + return false; +} + static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) { struct cu *cu = encoder->cu; uint32_t core_id; struct tag *pos; int err = -1; - struct elf_secinfo *pcpu_scn = &encoder->secinfo[encoder->percpu.shndx]; - if (encoder->percpu.shndx == 0 || !encoder->symtab) + if (!encoder->symtab) return 0; if (encoder->verbose) - printf("search cu '%s' for percpu global variables.\n", cu->name); + printf("search cu '%s' for global variables.\n", cu->name); cu__for_each_variable(cu, core_id, pos) { struct variable *var = tag__variable(pos); - uint32_t size, type, linkage; - const char *name, *dwarf_name; + uint32_t type, linkage; + const char *name; struct llvm_annotation *annot; const struct tag *tag; + int shndx; + struct elf_secinfo *sec = NULL; uint64_t addr; int id; + unsigned long size; + /* Skip incomplete (non-defining) declarations */ if (var->declaration && !var->spec) continue; - /* percpu variables are allocated in global space */ - if (variable__scope(var) != VSCOPE_GLOBAL && !var->spec) + /* + * top_level: indicates that the variable is declared at the top + * level of the CU, and thus it is globally scoped. + * artificial: indicates that the variable is a compiler-generated + * "fake" variable that doesn't appear in the source. + * scope: set by pahole to indicate the type of storage the + * variable has. GLOBAL indicates it is stored in static + * memory (as opposed to a stack variable or register) + * + * Some variables are "top_level" but not GLOBAL: + * e.g. current_stack_pointer, which is a register variable, + * despite having global CU-declarations. We don't want that, + * since no code could actually find this variable. + * Some variables are GLOBAL but not top_level: + * e.g. function static variables + */ + if (!var->top_level || var->artificial || var->scope != VSCOPE_GLOBAL) continue; /* addr has to be recorded before we follow spec */ addr = var->ip.addr; - dwarf_name = variable__name(var); - /* Make sure addr is section-relative. DWARF, unlike ELF, - * always contains virtual symbol addresses, so subtract - * the section address unconditionally. - */ - if (addr < pcpu_scn->addr || addr >= pcpu_scn->addr + pcpu_scn->sz) + /* Get the ELF section info for the variable */ + shndx = get_elf_section(encoder, addr); + if (shndx && shndx < encoder->seccnt) + sec = &encoder->secinfo[shndx]; + if (!sec || !sec->include) continue; - addr -= pcpu_scn->addr; - if (!btf_encoder__percpu_var_exists(encoder, addr, &size, &name)) - continue; /* not a per-CPU variable */ + /* Convert addr to section relative */ + addr -= sec->addr; - /* A lot of "special" DWARF variables (e.g, __UNIQUE_ID___xxx) - * have addr == 0, which is the same as, say, valid - * fixed_percpu_data per-CPU variable. To distinguish between - * them, additionally compare DWARF and ELF symbol names. If - * DWARF doesn't provide proper name, pessimistically assume - * bad variable. - * - * Examples of such special variables are: - * - * 1. __ADDRESSABLE(sym), which are forcely emitted as symbols. - * 2. __UNIQUE_ID(prefix), which are introduced to generate unique ids. - * 3. __exitcall(fn), functions which are labeled as exit calls. - * - * This is relevant only for vmlinux image, as for kernel - * modules per-CPU data section has non-zero offset so all - * per-CPU symbols have non-zero values. - */ - if (var->ip.addr == 0) { - if (!dwarf_name || strcmp(dwarf_name, name)) + /* DWARF specification reference should be followed, because + * information like the name & type may not be present on var */ + if (var->spec) + var = var->spec; + + name = variable__name(var); + if (!name) + continue; + + /* Check for invalid BTF names */ + if (!btf_name_valid(name)) { + dump_invalid_symbol("Found invalid variable name when encoding btf", + name, encoder->verbose, encoder->force); + if (encoder->force) continue; + else + return -1; } - if (var->spec) - var = var->spec; + if (filter_variable_name(name)) + continue; if (var->ip.tag.type == 0) { fprintf(stderr, "error: found variable '%s' in CU '%s' that has void type\n", @@ -2005,9 +1964,10 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) } tag = cu__type(cu, var->ip.tag.type); - if (tag__size(tag, cu) == 0) { + size = tag__size(tag, cu); + if (size == 0) { if (encoder->verbose) - fprintf(stderr, "Ignoring zero-sized per-CPU variable '%s'...\n", dwarf_name ?: "<missing name>"); + fprintf(stderr, "Ignoring zero-sized variable '%s'...\n", name ?: "<missing name>"); continue; } @@ -2037,13 +1997,14 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) } /* - * add a BTF_VAR_SECINFO in encoder->percpu_secinfo, which will be added into - * encoder->types later when we add BTF_VAR_DATASEC. + * Add the variable to the secinfo for the section it appears in. + * Later we will generate a BTF_VAR_DATASEC for all any section with + * an encoded variable. */ - id = btf_encoder__add_var_secinfo(encoder, id, addr, size); + id = btf_encoder__add_var_secinfo(encoder, shndx, id, addr, size); if (id < 0) { fprintf(stderr, "error: failed to encode section info for variable '%s' at addr 0x%" PRIx64 "\n", - name, addr); + name, addr); goto out; } } @@ -2070,7 +2031,6 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam encoder->force = conf_load->btf_encode_force; encoder->gen_floats = conf_load->btf_gen_floats; - encoder->skip_encoding_vars = conf_load->skip_encoding_btf_vars; encoder->skip_encoding_decl_tag = conf_load->skip_encoding_btf_decl_tag; encoder->tag_kfuncs = conf_load->btf_decl_tag_kfuncs; encoder->gen_distilled_base = conf_load->btf_gen_distilled_base; @@ -2078,6 +2038,11 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam encoder->has_index_type = false; encoder->need_index_type = false; encoder->array_index_id = 0; + encoder->encode_vars = 0; + if (!conf_load->skip_encoding_btf_vars) + encoder->encode_vars |= BTF_VAR_PERCPU; + if (conf_load->encode_btf_global_vars) + encoder->encode_vars |= BTF_VAR_GLOBAL; GElf_Ehdr ehdr; @@ -2087,8 +2052,6 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam goto out_delete; } - encoder->is_rel = ehdr.e_type == ET_REL; - switch (ehdr.e_ident[EI_DATA]) { case ELFDATA2LSB: btf__set_endianness(encoder->btf, BTF_LITTLE_ENDIAN); @@ -2127,15 +2090,21 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam encoder->secinfo[shndx].addr = shdr.sh_addr; encoder->secinfo[shndx].sz = shdr.sh_size; encoder->secinfo[shndx].name = secname; - - if (strcmp(secname, PERCPU_SECTION) == 0) - encoder->percpu.shndx = shndx; + encoder->secinfo[shndx].type = shdr.sh_type; + if (encoder->encode_vars & BTF_VAR_GLOBAL) + encoder->secinfo[shndx].include = true; + + if (strcmp(secname, PERCPU_SECTION) == 0) { + encoder->percpu_shndx = shndx; + if (encoder->encode_vars & BTF_VAR_PERCPU) + encoder->secinfo[shndx].include = true; + } } - if (!encoder->percpu.shndx && encoder->verbose) + if (!encoder->percpu_shndx && encoder->verbose) printf("%s: '%s' doesn't have '%s' section\n", __func__, cu->filename, PERCPU_SECTION); - if (btf_encoder__collect_symbols(encoder, !encoder->skip_encoding_vars)) + if (btf_encoder__collect_symbols(encoder)) goto out_delete; if (encoder->verbose) @@ -2156,7 +2125,8 @@ void btf_encoder__delete(struct btf_encoder *encoder) return; btf_encoders__delete(encoder); - __gobuffer__delete(&encoder->percpu_secinfo); + for (int i = 0; i < encoder->seccnt; i++) + __gobuffer__delete(&encoder->secinfo[i].secinfo); zfree(&encoder->filename); zfree(&encoder->source_filename); btf__free(encoder->btf); @@ -2166,9 +2136,6 @@ void btf_encoder__delete(struct btf_encoder *encoder) encoder->functions.allocated = encoder->functions.cnt = 0; free(encoder->functions.entries); encoder->functions.entries = NULL; - encoder->percpu.allocated = encoder->percpu.var_cnt = 0; - free(encoder->percpu.vars); - encoder->percpu.vars = NULL; free(encoder); } @@ -2321,7 +2288,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co goto out; } - if (!encoder->skip_encoding_vars) + if (encoder->encode_vars) err = btf_encoder__encode_cu_variables(encoder); /* It is only safe to delete this CU if we have not stashed any static diff --git a/btf_encoder.h b/btf_encoder.h index f54c95a..c5424db 100644 --- a/btf_encoder.h +++ b/btf_encoder.h @@ -16,7 +16,15 @@ struct btf; struct cu; struct list_head; +enum btf_var_option { + BTF_VAR_NONE = 0, + BTF_VAR_PERCPU = 1, + BTF_VAR_GLOBAL = 2, +}; +#define BTF_VAR_ALL (BTF_VAR_PERCPU | BTF_VAR_GLOBAL) + struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filename, struct btf *base_btf, bool verbose, struct conf_load *conf_load); + void btf_encoder__delete(struct btf_encoder *encoder); int btf_encoder__encode(struct btf_encoder *encoder); diff --git a/dwarves.h b/dwarves.h index 0fede91..fef881f 100644 --- a/dwarves.h +++ b/dwarves.h @@ -92,6 +92,7 @@ struct conf_load { bool btf_gen_optimized; bool skip_encoding_btf_inconsistent_proto; bool skip_encoding_btf_vars; + bool encode_btf_global_vars; bool btf_gen_floats; bool btf_encode_force; bool reproducible_build; diff --git a/man-pages/pahole.1 b/man-pages/pahole.1 index 0a9d8ac..4bc2d03 100644 --- a/man-pages/pahole.1 +++ b/man-pages/pahole.1 @@ -230,7 +230,10 @@ the debugging information. .TP .B \-\-skip_encoding_btf_vars -Do not encode VARs in BTF. +.TQ +.B \-\-encode_btf_global_vars +By default, VARs are encoded only for percpu variables. These options allow +to skip encoding them, or alternatively to encode all global variables too. .TP .B \-\-skip_encoding_btf_decl_tag @@ -296,7 +299,8 @@ Encode BTF using the specified feature list, or specify 'default' for all standa encode_force Ignore invalid symbols when encoding BTF; for example if a symbol has an invalid name, it will be ignored and BTF encoding will continue. - var Encode variables using BTF_KIND_VAR in BTF. + var Encode percpu variables using BTF_KIND_VAR in BTF. + global_var Encode all global variables in the same way. float Encode floating-point types in BTF. decl_tag Encode declaration tags using BTF_KIND_DECL_TAG. type_tag Encode type tags using BTF_KIND_TYPE_TAG. diff --git a/pahole.c b/pahole.c index a33490d..27b7a7e 100644 --- a/pahole.c +++ b/pahole.c @@ -1239,6 +1239,7 @@ ARGP_PROGRAM_VERSION_HOOK_DEF = dwarves_print_version; #define ARGP_contains_enumerator 344 #define ARGP_reproducible_build 345 #define ARGP_running_kernel_vmlinux 346 +#define ARGP_encode_btf_global_vars 347 /* --btf_features=feature1[,feature2,..] allows us to specify * a list of requested BTF features or "default" to enable all default @@ -1285,6 +1286,7 @@ struct btf_feature { } btf_features[] = { BTF_DEFAULT_FEATURE(encode_force, btf_encode_force, false), BTF_DEFAULT_FEATURE(var, skip_encoding_btf_vars, true), + BTF_DEFAULT_FEATURE(global_var, encode_btf_global_vars, false), BTF_DEFAULT_FEATURE(float, btf_gen_floats, false), BTF_DEFAULT_FEATURE(decl_tag, skip_encoding_btf_decl_tag, true), BTF_DEFAULT_FEATURE(type_tag, skip_encoding_btf_type_tag, true), @@ -1720,7 +1722,12 @@ static const struct argp_option pahole__options[] = { { .name = "skip_encoding_btf_vars", .key = ARGP_skip_encoding_btf_vars, - .doc = "Do not encode VARs in BTF." + .doc = "Do not encode VARs in BTF (default: percpu only)." + }, + { + .name = "encode_btf_global_vars", + .key = ARGP_skip_encoding_btf_vars, + .doc = "Encode all global VARs in BTF (default: percpu only)." }, { .name = "btf_encode_force", @@ -2047,6 +2054,8 @@ static error_t pahole__options_parser(int key, char *arg, show_supported_btf_features(stdout); exit(0); case ARGP_btf_features_strict: parse_btf_features(arg, true); break; + case ARGP_encode_btf_global_vars: + conf_load.encode_btf_global_vars = true; break; default: return ARGP_ERR_UNKNOWN; } -- 2.43.5 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves 4/4] btf_encoder: add global_var feature to encode globals 2024-09-12 19:08 ` [PATCH dwarves 4/4] btf_encoder: add global_var feature to encode globals Stephen Brennan @ 2024-09-15 11:49 ` Alan Maguire 2024-09-20 8:18 ` Stephen Brennan 0 siblings, 1 reply; 14+ messages in thread From: Alan Maguire @ 2024-09-15 11:49 UTC (permalink / raw) To: Stephen Brennan, Arnaldo Carvalho de Melo; +Cc: dwarves, linux-debuggers On 12/09/2024 20:08, Stephen Brennan wrote: > Currently the "var" feature only encodes percpu variables. Add the > ability to encode all global variables. > > This also drops the use of the symbol table to find variable names and > compare them against DWARF names. We simply rely on the DWARF > information to give us all the variables. > > Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> Tried this out on x86_64/aarch64, got multiple DATASECs and ~28000 variables. All worked nicely! Tested-by: Alan Maguire <alan.maguire@oracle.com> more below, mostly small things though. > --- > btf_encoder.c | 345 ++++++++++++++++++++------------------------- > btf_encoder.h | 8 ++ > dwarves.h | 1 + > man-pages/pahole.1 | 8 +- > pahole.c | 11 +- > 5 files changed, 181 insertions(+), 192 deletions(-) > What's really impressive here is that you've deleted more code than you've added while also adding valuable functionality; that's a rare thing so didn't want to leave it unremarked! > diff --git a/btf_encoder.c b/btf_encoder.c > index e3384e5..1872a35 100644 > --- a/btf_encoder.c > +++ b/btf_encoder.c > @@ -67,16 +67,13 @@ struct elf_function { > > #define MAX_ELF_SEC_CNT 128 > > -struct var_info { > - uint64_t addr; > - const char *name; > - uint32_t sz; > -}; > - > struct elf_secinfo { > uint64_t addr; > const char *name; > uint64_t sz; > + bool include; > + uint32_t type; > + struct gobuffer secinfo; > }; > > /* > @@ -86,31 +83,24 @@ struct btf_encoder { > struct list_head node; > struct btf *btf; > struct cu *cu; > - struct gobuffer percpu_secinfo; > const char *source_filename; > const char *filename; > struct elf_symtab *symtab; > uint32_t type_id_off; > bool has_index_type, > need_index_type, > - skip_encoding_vars, > raw_output, > verbose, > force, > gen_floats, > skip_encoding_decl_tag, > tag_kfuncs, > - is_rel, > gen_distilled_base; > uint32_t array_index_id; > struct elf_secinfo secinfo[MAX_ELF_SEC_CNT]; > size_t seccnt; > - struct { > - struct var_info *vars; > - int var_cnt; > - int allocated; > - uint32_t shndx; > - } percpu; > + int encode_vars; > + uint32_t percpu_shndx; > struct { > struct elf_function *entries; > int allocated; > @@ -737,46 +727,56 @@ static int32_t btf_encoder__add_var(struct btf_encoder *encoder, uint32_t type, > return id; > } > > -static int32_t btf_encoder__add_var_secinfo(struct btf_encoder *encoder, uint32_t type, > - uint32_t offset, uint32_t size) > +static int32_t btf_encoder__add_var_secinfo(struct btf_encoder *encoder, int shndx, > + uint32_t type, unsigned long addr, uint32_t size) > { > struct btf_var_secinfo si = { > .type = type, > - .offset = offset, > + .offset = addr, > .size = size, > }; > - return gobuffer__add(&encoder->percpu_secinfo, &si, sizeof(si)); > + return gobuffer__add(&encoder->secinfo[shndx].secinfo, &si, sizeof(si)); > } > > int32_t btf_encoder__add_encoder(struct btf_encoder *encoder, struct btf_encoder *other) > { > - struct gobuffer *var_secinfo_buf = &other->percpu_secinfo; > - size_t sz = gobuffer__size(var_secinfo_buf); > - uint16_t nr_var_secinfo = sz / sizeof(struct btf_var_secinfo); > - uint32_t type_id; > - uint32_t next_type_id = btf__type_cnt(encoder->btf); > - int32_t i, id; > - struct btf_var_secinfo *vsi; > - > + int shndx; > if (encoder == other) > return 0; > > btf_encoder__add_saved_funcs(other); > > - for (i = 0; i < nr_var_secinfo; i++) { > - vsi = (struct btf_var_secinfo *)var_secinfo_buf->entries + i; > - type_id = next_type_id + vsi->type - 1; /* Type ID starts from 1 */ > - id = btf_encoder__add_var_secinfo(encoder, type_id, vsi->offset, vsi->size); > - if (id < 0) > - return id; > + for (shndx = 0; shndx < other->seccnt; shndx++) { > + struct gobuffer *var_secinfo_buf = &other->secinfo[shndx].secinfo; > + size_t sz = gobuffer__size(var_secinfo_buf); > + uint16_t nr_var_secinfo = sz / sizeof(struct btf_var_secinfo); > + uint32_t type_id; > + uint32_t next_type_id = btf__type_cnt(encoder->btf); > + int32_t i, id; > + struct btf_var_secinfo *vsi; > + > + if (strcmp(encoder->secinfo[shndx].name, other->secinfo[shndx].name)) { > + fprintf(stderr, "mismatched ELF sections at index %d: \"%s\", \"%s\"\n", > + shndx, encoder->secinfo[shndx].name, other->secinfo[shndx].name); > + return -1; > + } > + > + for (i = 0; i < nr_var_secinfo; i++) { > + vsi = (struct btf_var_secinfo *)var_secinfo_buf->entries + i; > + type_id = next_type_id + vsi->type - 1; /* Type ID starts from 1 */ > + id = btf_encoder__add_var_secinfo(encoder, shndx, type_id, vsi->offset, vsi->size); > + if (id < 0) > + return id; > + } > } > > return btf__add_btf(encoder->btf, other->btf); > } > > -static int32_t btf_encoder__add_datasec(struct btf_encoder *encoder, const char *section_name) > +static int32_t btf_encoder__add_datasec(struct btf_encoder *encoder, int shndx) > { > - struct gobuffer *var_secinfo_buf = &encoder->percpu_secinfo; > + struct gobuffer *var_secinfo_buf = &encoder->secinfo[shndx].secinfo; > + const char *section_name = encoder->secinfo[shndx].name; > struct btf *btf = encoder->btf; > size_t sz = gobuffer__size(var_secinfo_buf); > uint16_t nr_var_secinfo = sz / sizeof(struct btf_var_secinfo); > @@ -1743,13 +1743,14 @@ out: > int btf_encoder__encode(struct btf_encoder *encoder) > { > bool should_tag_kfuncs; > - int err; > + int err, shndx; > > /* for single-threaded case, saved funcs are added here */ > btf_encoder__add_saved_funcs(encoder); > > - if (gobuffer__size(&encoder->percpu_secinfo) != 0) > - btf_encoder__add_datasec(encoder, PERCPU_SECTION); > + for (shndx = 0; shndx < encoder->seccnt; shndx++) > + if (gobuffer__size(&encoder->secinfo[shndx].secinfo)) > + btf_encoder__add_datasec(encoder, shndx); > > /* Empty file, nothing to do, so... done! */ > if (btf__type_cnt(encoder->btf) == 1) > @@ -1799,111 +1800,18 @@ int btf_encoder__encode(struct btf_encoder *encoder) > return err; > } > > -static int percpu_var_cmp(const void *_a, const void *_b) > -{ > - const struct var_info *a = _a; > - const struct var_info *b = _b; > - > - if (a->addr == b->addr) > - return 0; > - return a->addr < b->addr ? -1 : 1; > -} > - > -static bool btf_encoder__percpu_var_exists(struct btf_encoder *encoder, uint64_t addr, uint32_t *sz, const char **name) > -{ > - struct var_info key = { .addr = addr }; > - const struct var_info *p = bsearch(&key, encoder->percpu.vars, encoder->percpu.var_cnt, > - sizeof(encoder->percpu.vars[0]), percpu_var_cmp); > - if (!p) > - return false; > - > - *sz = p->sz; > - *name = p->name; > - return true; > -} > - > -static int btf_encoder__collect_percpu_var(struct btf_encoder *encoder, GElf_Sym *sym, size_t sym_sec_idx) > -{ > - const char *sym_name; > - uint64_t addr; > - uint32_t size; > - > - /* compare a symbol's shndx to determine if it's a percpu variable */ > - if (sym_sec_idx != encoder->percpu.shndx) > - return 0; > - if (elf_sym__type(sym) != STT_OBJECT) > - return 0; > - > - addr = elf_sym__value(sym); > - > - size = elf_sym__size(sym); > - if (!size) > - return 0; /* ignore zero-sized symbols */ > - > - sym_name = elf_sym__name(sym, encoder->symtab); > - if (!btf_name_valid(sym_name)) { > - dump_invalid_symbol("Found symbol of invalid name when encoding btf", > - sym_name, encoder->verbose, encoder->force); > - if (encoder->force) > - return 0; > - return -1; > - } > - > - if (encoder->verbose) > - printf("Found per-CPU symbol '%s' at address 0x%" PRIx64 "\n", sym_name, addr); > - > - /* Make sure addr is section-relative. For kernel modules (which are > - * ET_REL files) this is already the case. For vmlinux (which is an > - * ET_EXEC file) we need to subtract the section address. > - */ > - if (!encoder->is_rel) > - addr -= encoder->secinfo[encoder->percpu.shndx].addr; > - > - if (encoder->percpu.var_cnt == encoder->percpu.allocated) { > - struct var_info *new; > - > - new = reallocarray_grow(encoder->percpu.vars, > - &encoder->percpu.allocated, > - sizeof(*encoder->percpu.vars)); > - if (!new) { > - fprintf(stderr, "Failed to allocate memory for variables\n"); > - return -1; > - } > - encoder->percpu.vars = new; > - } > - encoder->percpu.vars[encoder->percpu.var_cnt].addr = addr; > - encoder->percpu.vars[encoder->percpu.var_cnt].sz = size; > - encoder->percpu.vars[encoder->percpu.var_cnt].name = sym_name; > - encoder->percpu.var_cnt++; > - > - return 0; > -} > > -static int btf_encoder__collect_symbols(struct btf_encoder *encoder, bool collect_percpu_vars) > +static int btf_encoder__collect_symbols(struct btf_encoder *encoder) > { > - Elf32_Word sym_sec_idx; > + uint32_t sym_sec_idx; > uint32_t core_id; > GElf_Sym sym; > > - /* cache variables' addresses, preparing for searching in symtab. */ > - encoder->percpu.var_cnt = 0; > - > - /* search within symtab for percpu variables */ > elf_symtab__for_each_symbol_index(encoder->symtab, core_id, sym, sym_sec_idx) { > - if (collect_percpu_vars && btf_encoder__collect_percpu_var(encoder, &sym, sym_sec_idx)) > - return -1; > if (btf_encoder__collect_function(encoder, &sym)) > return -1; > } > > - if (collect_percpu_vars) { > - if (encoder->percpu.var_cnt) > - qsort(encoder->percpu.vars, encoder->percpu.var_cnt, sizeof(encoder->percpu.vars[0]), percpu_var_cmp); > - > - if (encoder->verbose) > - printf("Found %d per-CPU variables!\n", encoder->percpu.var_cnt); > - } > - > if (encoder->functions.cnt) { > qsort(encoder->functions.entries, encoder->functions.cnt, sizeof(encoder->functions.entries[0]), > functions_cmp); > @@ -1925,75 +1833,126 @@ static bool ftype__has_arg_names(const struct ftype *ftype) > return true; > } > > +static int get_elf_section(struct btf_encoder *encoder, unsigned long addr) > +{ > + for (int i = 1; i < encoder->seccnt; i++) > + /* Variables are only present in PROGBITS or NOBITS (.bss) */ > + if ((encoder->secinfo[i].type == SHT_PROGBITS || > + encoder->secinfo[i].type == SHT_NOBITS) && > + encoder->secinfo[i].addr <= addr && > + (addr - encoder->secinfo[i].addr) < encoder->secinfo[i].sz) > + return i; > + return 0; > +} what if our variables were in shndx 0, probably unlikely but not sure if it's possible? What about making it return -ENOENT here if no section matches? > + > +/* > + * Filter out variables / symbol names with common prefixes and no useful > + * values. Prefixes should be added sparingly, and it should be objectively > + * obvious that they are not useful. > + */ > +static bool filter_variable_name(const char *name) > +{ > + static const struct { char *s; size_t len; } skip[] = { > + #define X(str) {str, sizeof(str) - 1} > + X("__UNIQUE_ID"), > + X("__tpstrtab_"), > + X("__exitcall_"), > + X("__func_stack_frame_non_standard_") > + #undef X > + }; > + int i; > + > + if (*name != '_') > + return false; > + > + for (i = 0; i < sizeof(skip) / sizeof(skip[0]); i++) nit we have ARRAY_SIZE() in dwarves.h, ARRAY_SIZE(skip) could be used here; i'd also add brackets for the for() loop body even tho not strictly required. > + if (strncmp(name, skip[i].s, skip[i].len) == 0) > + return true; > + return false; > +} > + > static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) > { > struct cu *cu = encoder->cu; > uint32_t core_id; > struct tag *pos; > int err = -1; > - struct elf_secinfo *pcpu_scn = &encoder->secinfo[encoder->percpu.shndx]; > > - if (encoder->percpu.shndx == 0 || !encoder->symtab) > + if (!encoder->symtab) > return 0; > > if (encoder->verbose) > - printf("search cu '%s' for percpu global variables.\n", cu->name); > + printf("search cu '%s' for global variables.\n", cu->name); > > cu__for_each_variable(cu, core_id, pos) { > struct variable *var = tag__variable(pos); > - uint32_t size, type, linkage; > - const char *name, *dwarf_name; > + uint32_t type, linkage; > + const char *name; > struct llvm_annotation *annot; > const struct tag *tag; > + int shndx; > + struct elf_secinfo *sec = NULL; > uint64_t addr; > int id; > + unsigned long size; > > + /* Skip incomplete (non-defining) declarations */ > if (var->declaration && !var->spec) > continue; > > - /* percpu variables are allocated in global space */ > - if (variable__scope(var) != VSCOPE_GLOBAL && !var->spec) > + /* > + * top_level: indicates that the variable is declared at the top > + * level of the CU, and thus it is globally scoped. > + * artificial: indicates that the variable is a compiler-generated > + * "fake" variable that doesn't appear in the source. > + * scope: set by pahole to indicate the type of storage the > + * variable has. GLOBAL indicates it is stored in static > + * memory (as opposed to a stack variable or register) > + * > + * Some variables are "top_level" but not GLOBAL: > + * e.g. current_stack_pointer, which is a register variable, > + * despite having global CU-declarations. We don't want that, > + * since no code could actually find this variable. > + * Some variables are GLOBAL but not top_level: > + * e.g. function static variables > + */ > + if (!var->top_level || var->artificial || var->scope != VSCOPE_GLOBAL) > continue; > > /* addr has to be recorded before we follow spec */ > addr = var->ip.addr; > - dwarf_name = variable__name(var); > > - /* Make sure addr is section-relative. DWARF, unlike ELF, > - * always contains virtual symbol addresses, so subtract > - * the section address unconditionally. > - */ > - if (addr < pcpu_scn->addr || addr >= pcpu_scn->addr + pcpu_scn->sz) > + /* Get the ELF section info for the variable */ > + shndx = get_elf_section(encoder, addr); > + if (shndx && shndx < encoder->seccnt) > + sec = &encoder->secinfo[shndx]; > + if (!sec || !sec->include) > continue; > - addr -= pcpu_scn->addr; > > - if (!btf_encoder__percpu_var_exists(encoder, addr, &size, &name)) > - continue; /* not a per-CPU variable */ > + /* Convert addr to section relative */ > + addr -= sec->addr; > > - /* A lot of "special" DWARF variables (e.g, __UNIQUE_ID___xxx) > - * have addr == 0, which is the same as, say, valid > - * fixed_percpu_data per-CPU variable. To distinguish between > - * them, additionally compare DWARF and ELF symbol names. If > - * DWARF doesn't provide proper name, pessimistically assume > - * bad variable. > - * > - * Examples of such special variables are: > - * > - * 1. __ADDRESSABLE(sym), which are forcely emitted as symbols. > - * 2. __UNIQUE_ID(prefix), which are introduced to generate unique ids. > - * 3. __exitcall(fn), functions which are labeled as exit calls. > - * > - * This is relevant only for vmlinux image, as for kernel > - * modules per-CPU data section has non-zero offset so all > - * per-CPU symbols have non-zero values. > - */ > - if (var->ip.addr == 0) { > - if (!dwarf_name || strcmp(dwarf_name, name)) > + /* DWARF specification reference should be followed, because > + * information like the name & type may not be present on var */ > + if (var->spec) > + var = var->spec; > + > + name = variable__name(var); > + if (!name) > + continue; > + > + /* Check for invalid BTF names */ > + if (!btf_name_valid(name)) { > + dump_invalid_symbol("Found invalid variable name when encoding btf", > + name, encoder->verbose, encoder->force); > + if (encoder->force) > continue; > + else > + return -1; > } > > - if (var->spec) > - var = var->spec; > + if (filter_variable_name(name)) > + continue; > > if (var->ip.tag.type == 0) { > fprintf(stderr, "error: found variable '%s' in CU '%s' that has void type\n", > @@ -2005,9 +1964,10 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) > } > > tag = cu__type(cu, var->ip.tag.type); > - if (tag__size(tag, cu) == 0) { > + size = tag__size(tag, cu); > + if (size == 0) { > if (encoder->verbose) > - fprintf(stderr, "Ignoring zero-sized per-CPU variable '%s'...\n", dwarf_name ?: "<missing name>"); > + fprintf(stderr, "Ignoring zero-sized variable '%s'...\n", name ?: "<missing name>"); > continue; > } > > @@ -2037,13 +1997,14 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) > } > > /* > - * add a BTF_VAR_SECINFO in encoder->percpu_secinfo, which will be added into > - * encoder->types later when we add BTF_VAR_DATASEC. > + * Add the variable to the secinfo for the section it appears in. > + * Later we will generate a BTF_VAR_DATASEC for all any section with > + * an encoded variable. > */ > - id = btf_encoder__add_var_secinfo(encoder, id, addr, size); > + id = btf_encoder__add_var_secinfo(encoder, shndx, id, addr, size); > if (id < 0) { > fprintf(stderr, "error: failed to encode section info for variable '%s' at addr 0x%" PRIx64 "\n", > - name, addr); > + name, addr); > goto out; > } > } > @@ -2070,7 +2031,6 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam > > encoder->force = conf_load->btf_encode_force; > encoder->gen_floats = conf_load->btf_gen_floats; > - encoder->skip_encoding_vars = conf_load->skip_encoding_btf_vars; > encoder->skip_encoding_decl_tag = conf_load->skip_encoding_btf_decl_tag; > encoder->tag_kfuncs = conf_load->btf_decl_tag_kfuncs; > encoder->gen_distilled_base = conf_load->btf_gen_distilled_base; > @@ -2078,6 +2038,11 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam > encoder->has_index_type = false; > encoder->need_index_type = false; > encoder->array_index_id = 0; > + encoder->encode_vars = 0; > + if (!conf_load->skip_encoding_btf_vars) > + encoder->encode_vars |= BTF_VAR_PERCPU; > + if (conf_load->encode_btf_global_vars) > + encoder->encode_vars |= BTF_VAR_GLOBAL; > > GElf_Ehdr ehdr; > > @@ -2087,8 +2052,6 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam > goto out_delete; > } > > - encoder->is_rel = ehdr.e_type == ET_REL; > - > switch (ehdr.e_ident[EI_DATA]) { > case ELFDATA2LSB: > btf__set_endianness(encoder->btf, BTF_LITTLE_ENDIAN); > @@ -2127,15 +2090,21 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam > encoder->secinfo[shndx].addr = shdr.sh_addr; > encoder->secinfo[shndx].sz = shdr.sh_size; > encoder->secinfo[shndx].name = secname; > - > - if (strcmp(secname, PERCPU_SECTION) == 0) > - encoder->percpu.shndx = shndx; > + encoder->secinfo[shndx].type = shdr.sh_type; > + if (encoder->encode_vars & BTF_VAR_GLOBAL) > + encoder->secinfo[shndx].include = true; > + > + if (strcmp(secname, PERCPU_SECTION) == 0) { > + encoder->percpu_shndx = shndx; > + if (encoder->encode_vars & BTF_VAR_PERCPU) > + encoder->secinfo[shndx].include = true; > + } > } > > - if (!encoder->percpu.shndx && encoder->verbose) > + if (!encoder->percpu_shndx && encoder->verbose) > printf("%s: '%s' doesn't have '%s' section\n", __func__, cu->filename, PERCPU_SECTION); > > - if (btf_encoder__collect_symbols(encoder, !encoder->skip_encoding_vars)) > + if (btf_encoder__collect_symbols(encoder)) > goto out_delete; > > if (encoder->verbose) > @@ -2156,7 +2125,8 @@ void btf_encoder__delete(struct btf_encoder *encoder) > return; > > btf_encoders__delete(encoder); > - __gobuffer__delete(&encoder->percpu_secinfo); > + for (int i = 0; i < encoder->seccnt; i++) > + __gobuffer__delete(&encoder->secinfo[i].secinfo); > zfree(&encoder->filename); > zfree(&encoder->source_filename); > btf__free(encoder->btf); > @@ -2166,9 +2136,6 @@ void btf_encoder__delete(struct btf_encoder *encoder) > encoder->functions.allocated = encoder->functions.cnt = 0; > free(encoder->functions.entries); > encoder->functions.entries = NULL; > - encoder->percpu.allocated = encoder->percpu.var_cnt = 0; > - free(encoder->percpu.vars); > - encoder->percpu.vars = NULL; > > free(encoder); > } > @@ -2321,7 +2288,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co > goto out; > } > > - if (!encoder->skip_encoding_vars) > + if (encoder->encode_vars) > err = btf_encoder__encode_cu_variables(encoder); > > /* It is only safe to delete this CU if we have not stashed any static > diff --git a/btf_encoder.h b/btf_encoder.h > index f54c95a..c5424db 100644 > --- a/btf_encoder.h > +++ b/btf_encoder.h > @@ -16,7 +16,15 @@ struct btf; > struct cu; > struct list_head; > > +enum btf_var_option { > + BTF_VAR_NONE = 0, > + BTF_VAR_PERCPU = 1, > + BTF_VAR_GLOBAL = 2, > +}; > +#define BTF_VAR_ALL (BTF_VAR_PERCPU | BTF_VAR_GLOBAL) > + Could also just do enum btf_var_option { BTF_VAR_NONE = 0, BTF_VAR_PERCPU = 1, BTF_VAR_GLOBAL = 2, BTF_VAR_ALL = BTF_VAR_PERCPU | BTF_VAR_GLOBAL, }; Either way, BTF_VAR_ALL doesn't seem to be used. > struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filename, struct btf *base_btf, bool verbose, struct conf_load *conf_load); > + > void btf_encoder__delete(struct btf_encoder *encoder); > > int btf_encoder__encode(struct btf_encoder *encoder); > diff --git a/dwarves.h b/dwarves.h > index 0fede91..fef881f 100644 > --- a/dwarves.h > +++ b/dwarves.h > @@ -92,6 +92,7 @@ struct conf_load { > bool btf_gen_optimized; > bool skip_encoding_btf_inconsistent_proto; > bool skip_encoding_btf_vars; > + bool encode_btf_global_vars; > bool btf_gen_floats; > bool btf_encode_force; > bool reproducible_build; > diff --git a/man-pages/pahole.1 b/man-pages/pahole.1 > index 0a9d8ac..4bc2d03 100644 > --- a/man-pages/pahole.1 > +++ b/man-pages/pahole.1 > @@ -230,7 +230,10 @@ the debugging information. > > .TP > .B \-\-skip_encoding_btf_vars > -Do not encode VARs in BTF. > +.TQ > +.B \-\-encode_btf_global_vars > +By default, VARs are encoded only for percpu variables. These options allow > +to skip encoding them, or alternatively to encode all global variables too. > > .TP > .B \-\-skip_encoding_btf_decl_tag > @@ -296,7 +299,8 @@ Encode BTF using the specified feature list, or specify 'default' for all standa > encode_force Ignore invalid symbols when encoding BTF; for example > if a symbol has an invalid name, it will be ignored > and BTF encoding will continue. > - var Encode variables using BTF_KIND_VAR in BTF. > + var Encode percpu variables using BTF_KIND_VAR in BTF. > + global_var Encode all global variables in the same way. > float Encode floating-point types in BTF. > decl_tag Encode declaration tags using BTF_KIND_DECL_TAG. > type_tag Encode type tags using BTF_KIND_TYPE_TAG. > diff --git a/pahole.c b/pahole.c > index a33490d..27b7a7e 100644 > --- a/pahole.c > +++ b/pahole.c > @@ -1239,6 +1239,7 @@ ARGP_PROGRAM_VERSION_HOOK_DEF = dwarves_print_version; > #define ARGP_contains_enumerator 344 > #define ARGP_reproducible_build 345 > #define ARGP_running_kernel_vmlinux 346 > +#define ARGP_encode_btf_global_vars 347 > > /* --btf_features=feature1[,feature2,..] allows us to specify > * a list of requested BTF features or "default" to enable all default > @@ -1285,6 +1286,7 @@ struct btf_feature { > } btf_features[] = { > BTF_DEFAULT_FEATURE(encode_force, btf_encode_force, false), > BTF_DEFAULT_FEATURE(var, skip_encoding_btf_vars, true), > + BTF_DEFAULT_FEATURE(global_var, encode_btf_global_vars, false), > BTF_DEFAULT_FEATURE(float, btf_gen_floats, false), > BTF_DEFAULT_FEATURE(decl_tag, skip_encoding_btf_decl_tag, true), > BTF_DEFAULT_FEATURE(type_tag, skip_encoding_btf_type_tag, true), > @@ -1720,7 +1722,12 @@ static const struct argp_option pahole__options[] = { > { > .name = "skip_encoding_btf_vars", > .key = ARGP_skip_encoding_btf_vars, > - .doc = "Do not encode VARs in BTF." > + .doc = "Do not encode VARs in BTF (default: percpu only)." > + }, you've inherited this, but the double negatives here are really confusing; in brackets would it be clearer to say "If this option is not specified, per-cpu variables are encoded only. To encode all global variables, --encode_btf_global_vars must be specified too". Or something similar. > + { > + .name = "encode_btf_global_vars", > + .key = ARGP_skip_encoding_btf_vars, > + .doc = "Encode all global VARs in BTF (default: percpu only)." > }, > { > .name = "btf_encode_force", > @@ -2047,6 +2054,8 @@ static error_t pahole__options_parser(int key, char *arg, > show_supported_btf_features(stdout); exit(0); > case ARGP_btf_features_strict: > parse_btf_features(arg, true); break; > + case ARGP_encode_btf_global_vars: > + conf_load.encode_btf_global_vars = true; break; > default: > return ARGP_ERR_UNKNOWN; > } ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH dwarves 4/4] btf_encoder: add global_var feature to encode globals 2024-09-15 11:49 ` Alan Maguire @ 2024-09-20 8:18 ` Stephen Brennan 0 siblings, 0 replies; 14+ messages in thread From: Stephen Brennan @ 2024-09-20 8:18 UTC (permalink / raw) To: Alan Maguire, Arnaldo Carvalho de Melo; +Cc: dwarves, linux-debuggers Alan Maguire <alan.maguire@oracle.com> writes: > On 12/09/2024 20:08, Stephen Brennan wrote: >> Currently the "var" feature only encodes percpu variables. Add the >> ability to encode all global variables. >> >> This also drops the use of the symbol table to find variable names and >> compare them against DWARF names. We simply rely on the DWARF >> information to give us all the variables. >> >> Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> > > Tried this out on x86_64/aarch64, got multiple DATASECs and ~28000 > variables. All worked nicely! > > Tested-by: Alan Maguire <alan.maguire@oracle.com> > > more below, mostly small things though. > >> --- >> btf_encoder.c | 345 ++++++++++++++++++++------------------------- >> btf_encoder.h | 8 ++ >> dwarves.h | 1 + >> man-pages/pahole.1 | 8 +- >> pahole.c | 11 +- >> 5 files changed, 181 insertions(+), 192 deletions(-) >> > > What's really impressive here is that you've deleted more code than > you've added while also adding valuable functionality; that's a rare > thing so didn't want to leave it unremarked! I didn't even notice myself :) >> diff --git a/btf_encoder.c b/btf_encoder.c >> index e3384e5..1872a35 100644 >> --- a/btf_encoder.c >> +++ b/btf_encoder.c >> @@ -67,16 +67,13 @@ struct elf_function { >> >> #define MAX_ELF_SEC_CNT 128 >> >> -struct var_info { >> - uint64_t addr; >> - const char *name; >> - uint32_t sz; >> -}; >> - >> struct elf_secinfo { >> uint64_t addr; >> const char *name; >> uint64_t sz; >> + bool include; >> + uint32_t type; >> + struct gobuffer secinfo; >> }; >> >> /* >> @@ -86,31 +83,24 @@ struct btf_encoder { >> struct list_head node; >> struct btf *btf; >> struct cu *cu; >> - struct gobuffer percpu_secinfo; >> const char *source_filename; >> const char *filename; >> struct elf_symtab *symtab; >> uint32_t type_id_off; >> bool has_index_type, >> need_index_type, >> - skip_encoding_vars, >> raw_output, >> verbose, >> force, >> gen_floats, >> skip_encoding_decl_tag, >> tag_kfuncs, >> - is_rel, >> gen_distilled_base; >> uint32_t array_index_id; >> struct elf_secinfo secinfo[MAX_ELF_SEC_CNT]; >> size_t seccnt; >> - struct { >> - struct var_info *vars; >> - int var_cnt; >> - int allocated; >> - uint32_t shndx; >> - } percpu; >> + int encode_vars; >> + uint32_t percpu_shndx; >> struct { >> struct elf_function *entries; >> int allocated; >> @@ -737,46 +727,56 @@ static int32_t btf_encoder__add_var(struct btf_encoder *encoder, uint32_t type, >> return id; >> } >> >> -static int32_t btf_encoder__add_var_secinfo(struct btf_encoder *encoder, uint32_t type, >> - uint32_t offset, uint32_t size) >> +static int32_t btf_encoder__add_var_secinfo(struct btf_encoder *encoder, int shndx, >> + uint32_t type, unsigned long addr, uint32_t size) >> { >> struct btf_var_secinfo si = { >> .type = type, >> - .offset = offset, >> + .offset = addr, >> .size = size, >> }; >> - return gobuffer__add(&encoder->percpu_secinfo, &si, sizeof(si)); >> + return gobuffer__add(&encoder->secinfo[shndx].secinfo, &si, sizeof(si)); >> } >> >> int32_t btf_encoder__add_encoder(struct btf_encoder *encoder, struct btf_encoder *other) >> { >> - struct gobuffer *var_secinfo_buf = &other->percpu_secinfo; >> - size_t sz = gobuffer__size(var_secinfo_buf); >> - uint16_t nr_var_secinfo = sz / sizeof(struct btf_var_secinfo); >> - uint32_t type_id; >> - uint32_t next_type_id = btf__type_cnt(encoder->btf); >> - int32_t i, id; >> - struct btf_var_secinfo *vsi; >> - >> + int shndx; >> if (encoder == other) >> return 0; >> >> btf_encoder__add_saved_funcs(other); >> >> - for (i = 0; i < nr_var_secinfo; i++) { >> - vsi = (struct btf_var_secinfo *)var_secinfo_buf->entries + i; >> - type_id = next_type_id + vsi->type - 1; /* Type ID starts from 1 */ >> - id = btf_encoder__add_var_secinfo(encoder, type_id, vsi->offset, vsi->size); >> - if (id < 0) >> - return id; >> + for (shndx = 0; shndx < other->seccnt; shndx++) { >> + struct gobuffer *var_secinfo_buf = &other->secinfo[shndx].secinfo; >> + size_t sz = gobuffer__size(var_secinfo_buf); >> + uint16_t nr_var_secinfo = sz / sizeof(struct btf_var_secinfo); >> + uint32_t type_id; >> + uint32_t next_type_id = btf__type_cnt(encoder->btf); >> + int32_t i, id; >> + struct btf_var_secinfo *vsi; >> + >> + if (strcmp(encoder->secinfo[shndx].name, other->secinfo[shndx].name)) { >> + fprintf(stderr, "mismatched ELF sections at index %d: \"%s\", \"%s\"\n", >> + shndx, encoder->secinfo[shndx].name, other->secinfo[shndx].name); >> + return -1; >> + } >> + >> + for (i = 0; i < nr_var_secinfo; i++) { >> + vsi = (struct btf_var_secinfo *)var_secinfo_buf->entries + i; >> + type_id = next_type_id + vsi->type - 1; /* Type ID starts from 1 */ >> + id = btf_encoder__add_var_secinfo(encoder, shndx, type_id, vsi->offset, vsi->size); >> + if (id < 0) >> + return id; >> + } >> } >> >> return btf__add_btf(encoder->btf, other->btf); >> } >> >> -static int32_t btf_encoder__add_datasec(struct btf_encoder *encoder, const char *section_name) >> +static int32_t btf_encoder__add_datasec(struct btf_encoder *encoder, int shndx) >> { >> - struct gobuffer *var_secinfo_buf = &encoder->percpu_secinfo; >> + struct gobuffer *var_secinfo_buf = &encoder->secinfo[shndx].secinfo; >> + const char *section_name = encoder->secinfo[shndx].name; >> struct btf *btf = encoder->btf; >> size_t sz = gobuffer__size(var_secinfo_buf); >> uint16_t nr_var_secinfo = sz / sizeof(struct btf_var_secinfo); >> @@ -1743,13 +1743,14 @@ out: >> int btf_encoder__encode(struct btf_encoder *encoder) >> { >> bool should_tag_kfuncs; >> - int err; >> + int err, shndx; >> >> /* for single-threaded case, saved funcs are added here */ >> btf_encoder__add_saved_funcs(encoder); >> >> - if (gobuffer__size(&encoder->percpu_secinfo) != 0) >> - btf_encoder__add_datasec(encoder, PERCPU_SECTION); >> + for (shndx = 0; shndx < encoder->seccnt; shndx++) >> + if (gobuffer__size(&encoder->secinfo[shndx].secinfo)) >> + btf_encoder__add_datasec(encoder, shndx); >> >> /* Empty file, nothing to do, so... done! */ >> if (btf__type_cnt(encoder->btf) == 1) >> @@ -1799,111 +1800,18 @@ int btf_encoder__encode(struct btf_encoder *encoder) >> return err; >> } >> >> -static int percpu_var_cmp(const void *_a, const void *_b) >> -{ >> - const struct var_info *a = _a; >> - const struct var_info *b = _b; >> - >> - if (a->addr == b->addr) >> - return 0; >> - return a->addr < b->addr ? -1 : 1; >> -} >> - >> -static bool btf_encoder__percpu_var_exists(struct btf_encoder *encoder, uint64_t addr, uint32_t *sz, const char **name) >> -{ >> - struct var_info key = { .addr = addr }; >> - const struct var_info *p = bsearch(&key, encoder->percpu.vars, encoder->percpu.var_cnt, >> - sizeof(encoder->percpu.vars[0]), percpu_var_cmp); >> - if (!p) >> - return false; >> - >> - *sz = p->sz; >> - *name = p->name; >> - return true; >> -} >> - >> -static int btf_encoder__collect_percpu_var(struct btf_encoder *encoder, GElf_Sym *sym, size_t sym_sec_idx) >> -{ >> - const char *sym_name; >> - uint64_t addr; >> - uint32_t size; >> - >> - /* compare a symbol's shndx to determine if it's a percpu variable */ >> - if (sym_sec_idx != encoder->percpu.shndx) >> - return 0; >> - if (elf_sym__type(sym) != STT_OBJECT) >> - return 0; >> - >> - addr = elf_sym__value(sym); >> - >> - size = elf_sym__size(sym); >> - if (!size) >> - return 0; /* ignore zero-sized symbols */ >> - >> - sym_name = elf_sym__name(sym, encoder->symtab); >> - if (!btf_name_valid(sym_name)) { >> - dump_invalid_symbol("Found symbol of invalid name when encoding btf", >> - sym_name, encoder->verbose, encoder->force); >> - if (encoder->force) >> - return 0; >> - return -1; >> - } >> - >> - if (encoder->verbose) >> - printf("Found per-CPU symbol '%s' at address 0x%" PRIx64 "\n", sym_name, addr); >> - >> - /* Make sure addr is section-relative. For kernel modules (which are >> - * ET_REL files) this is already the case. For vmlinux (which is an >> - * ET_EXEC file) we need to subtract the section address. >> - */ >> - if (!encoder->is_rel) >> - addr -= encoder->secinfo[encoder->percpu.shndx].addr; >> - >> - if (encoder->percpu.var_cnt == encoder->percpu.allocated) { >> - struct var_info *new; >> - >> - new = reallocarray_grow(encoder->percpu.vars, >> - &encoder->percpu.allocated, >> - sizeof(*encoder->percpu.vars)); >> - if (!new) { >> - fprintf(stderr, "Failed to allocate memory for variables\n"); >> - return -1; >> - } >> - encoder->percpu.vars = new; >> - } >> - encoder->percpu.vars[encoder->percpu.var_cnt].addr = addr; >> - encoder->percpu.vars[encoder->percpu.var_cnt].sz = size; >> - encoder->percpu.vars[encoder->percpu.var_cnt].name = sym_name; >> - encoder->percpu.var_cnt++; >> - >> - return 0; >> -} >> >> -static int btf_encoder__collect_symbols(struct btf_encoder *encoder, bool collect_percpu_vars) >> +static int btf_encoder__collect_symbols(struct btf_encoder *encoder) >> { >> - Elf32_Word sym_sec_idx; >> + uint32_t sym_sec_idx; >> uint32_t core_id; >> GElf_Sym sym; >> >> - /* cache variables' addresses, preparing for searching in symtab. */ >> - encoder->percpu.var_cnt = 0; >> - >> - /* search within symtab for percpu variables */ >> elf_symtab__for_each_symbol_index(encoder->symtab, core_id, sym, sym_sec_idx) { >> - if (collect_percpu_vars && btf_encoder__collect_percpu_var(encoder, &sym, sym_sec_idx)) >> - return -1; >> if (btf_encoder__collect_function(encoder, &sym)) >> return -1; >> } >> >> - if (collect_percpu_vars) { >> - if (encoder->percpu.var_cnt) >> - qsort(encoder->percpu.vars, encoder->percpu.var_cnt, sizeof(encoder->percpu.vars[0]), percpu_var_cmp); >> - >> - if (encoder->verbose) >> - printf("Found %d per-CPU variables!\n", encoder->percpu.var_cnt); >> - } >> - >> if (encoder->functions.cnt) { >> qsort(encoder->functions.entries, encoder->functions.cnt, sizeof(encoder->functions.entries[0]), >> functions_cmp); >> @@ -1925,75 +1833,126 @@ static bool ftype__has_arg_names(const struct ftype *ftype) >> return true; >> } >> >> +static int get_elf_section(struct btf_encoder *encoder, unsigned long addr) >> +{ >> + for (int i = 1; i < encoder->seccnt; i++) >> + /* Variables are only present in PROGBITS or NOBITS (.bss) */ >> + if ((encoder->secinfo[i].type == SHT_PROGBITS || >> + encoder->secinfo[i].type == SHT_NOBITS) && >> + encoder->secinfo[i].addr <= addr && >> + (addr - encoder->secinfo[i].addr) < encoder->secinfo[i].sz) >> + return i; >> + return 0; >> +} > > what if our variables were in shndx 0, probably unlikely but not sure if > it's possible? What about making it return -ENOENT here if no section > matches? I'm intentionally starting the loop at 1 to ignore the null section at index 0. According to the ELF standard index 0 should always be SHT_NULL, see Figure 1-10 in the linked doc: https://refspecs.linuxfoundation.org/elf/elf.pdf That said... clearer return values good, and this index 0 business is heavily implicit, so I'll adjust the return value all the same, and add a comment about index 0. >> + >> +/* >> + * Filter out variables / symbol names with common prefixes and no useful >> + * values. Prefixes should be added sparingly, and it should be objectively >> + * obvious that they are not useful. >> + */ >> +static bool filter_variable_name(const char *name) >> +{ >> + static const struct { char *s; size_t len; } skip[] = { >> + #define X(str) {str, sizeof(str) - 1} >> + X("__UNIQUE_ID"), >> + X("__tpstrtab_"), >> + X("__exitcall_"), >> + X("__func_stack_frame_non_standard_") >> + #undef X >> + }; >> + int i; >> + >> + if (*name != '_') >> + return false; >> + >> + for (i = 0; i < sizeof(skip) / sizeof(skip[0]); i++) > > nit we have ARRAY_SIZE() in dwarves.h, ARRAY_SIZE(skip) could be used > here; i'd also add brackets for the for() loop body even tho not > strictly required. Thanks! >> + if (strncmp(name, skip[i].s, skip[i].len) == 0) >> + return true; >> + return false; >> +} >> + >> static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) >> { >> struct cu *cu = encoder->cu; >> uint32_t core_id; >> struct tag *pos; >> int err = -1; >> - struct elf_secinfo *pcpu_scn = &encoder->secinfo[encoder->percpu.shndx]; >> >> - if (encoder->percpu.shndx == 0 || !encoder->symtab) >> + if (!encoder->symtab) >> return 0; >> >> if (encoder->verbose) >> - printf("search cu '%s' for percpu global variables.\n", cu->name); >> + printf("search cu '%s' for global variables.\n", cu->name); >> >> cu__for_each_variable(cu, core_id, pos) { >> struct variable *var = tag__variable(pos); >> - uint32_t size, type, linkage; >> - const char *name, *dwarf_name; >> + uint32_t type, linkage; >> + const char *name; >> struct llvm_annotation *annot; >> const struct tag *tag; >> + int shndx; >> + struct elf_secinfo *sec = NULL; >> uint64_t addr; >> int id; >> + unsigned long size; >> >> + /* Skip incomplete (non-defining) declarations */ >> if (var->declaration && !var->spec) >> continue; >> >> - /* percpu variables are allocated in global space */ >> - if (variable__scope(var) != VSCOPE_GLOBAL && !var->spec) >> + /* >> + * top_level: indicates that the variable is declared at the top >> + * level of the CU, and thus it is globally scoped. >> + * artificial: indicates that the variable is a compiler-generated >> + * "fake" variable that doesn't appear in the source. >> + * scope: set by pahole to indicate the type of storage the >> + * variable has. GLOBAL indicates it is stored in static >> + * memory (as opposed to a stack variable or register) >> + * >> + * Some variables are "top_level" but not GLOBAL: >> + * e.g. current_stack_pointer, which is a register variable, >> + * despite having global CU-declarations. We don't want that, >> + * since no code could actually find this variable. >> + * Some variables are GLOBAL but not top_level: >> + * e.g. function static variables >> + */ >> + if (!var->top_level || var->artificial || var->scope != VSCOPE_GLOBAL) >> continue; >> >> /* addr has to be recorded before we follow spec */ >> addr = var->ip.addr; >> - dwarf_name = variable__name(var); >> >> - /* Make sure addr is section-relative. DWARF, unlike ELF, >> - * always contains virtual symbol addresses, so subtract >> - * the section address unconditionally. >> - */ >> - if (addr < pcpu_scn->addr || addr >= pcpu_scn->addr + pcpu_scn->sz) >> + /* Get the ELF section info for the variable */ >> + shndx = get_elf_section(encoder, addr); >> + if (shndx && shndx < encoder->seccnt) >> + sec = &encoder->secinfo[shndx]; >> + if (!sec || !sec->include) >> continue; >> - addr -= pcpu_scn->addr; >> >> - if (!btf_encoder__percpu_var_exists(encoder, addr, &size, &name)) >> - continue; /* not a per-CPU variable */ >> + /* Convert addr to section relative */ >> + addr -= sec->addr; >> >> - /* A lot of "special" DWARF variables (e.g, __UNIQUE_ID___xxx) >> - * have addr == 0, which is the same as, say, valid >> - * fixed_percpu_data per-CPU variable. To distinguish between >> - * them, additionally compare DWARF and ELF symbol names. If >> - * DWARF doesn't provide proper name, pessimistically assume >> - * bad variable. >> - * >> - * Examples of such special variables are: >> - * >> - * 1. __ADDRESSABLE(sym), which are forcely emitted as symbols. >> - * 2. __UNIQUE_ID(prefix), which are introduced to generate unique ids. >> - * 3. __exitcall(fn), functions which are labeled as exit calls. >> - * >> - * This is relevant only for vmlinux image, as for kernel >> - * modules per-CPU data section has non-zero offset so all >> - * per-CPU symbols have non-zero values. >> - */ >> - if (var->ip.addr == 0) { >> - if (!dwarf_name || strcmp(dwarf_name, name)) >> + /* DWARF specification reference should be followed, because >> + * information like the name & type may not be present on var */ >> + if (var->spec) >> + var = var->spec; >> + >> + name = variable__name(var); >> + if (!name) >> + continue; >> + >> + /* Check for invalid BTF names */ >> + if (!btf_name_valid(name)) { >> + dump_invalid_symbol("Found invalid variable name when encoding btf", >> + name, encoder->verbose, encoder->force); >> + if (encoder->force) >> continue; >> + else >> + return -1; >> } >> >> - if (var->spec) >> - var = var->spec; >> + if (filter_variable_name(name)) >> + continue; >> >> if (var->ip.tag.type == 0) { >> fprintf(stderr, "error: found variable '%s' in CU '%s' that has void type\n", >> @@ -2005,9 +1964,10 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) >> } >> >> tag = cu__type(cu, var->ip.tag.type); >> - if (tag__size(tag, cu) == 0) { >> + size = tag__size(tag, cu); >> + if (size == 0) { >> if (encoder->verbose) >> - fprintf(stderr, "Ignoring zero-sized per-CPU variable '%s'...\n", dwarf_name ?: "<missing name>"); >> + fprintf(stderr, "Ignoring zero-sized variable '%s'...\n", name ?: "<missing name>"); >> continue; >> } >> >> @@ -2037,13 +1997,14 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) >> } >> >> /* >> - * add a BTF_VAR_SECINFO in encoder->percpu_secinfo, which will be added into >> - * encoder->types later when we add BTF_VAR_DATASEC. >> + * Add the variable to the secinfo for the section it appears in. >> + * Later we will generate a BTF_VAR_DATASEC for all any section with >> + * an encoded variable. >> */ >> - id = btf_encoder__add_var_secinfo(encoder, id, addr, size); >> + id = btf_encoder__add_var_secinfo(encoder, shndx, id, addr, size); >> if (id < 0) { >> fprintf(stderr, "error: failed to encode section info for variable '%s' at addr 0x%" PRIx64 "\n", >> - name, addr); >> + name, addr); >> goto out; >> } >> } >> @@ -2070,7 +2031,6 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam >> >> encoder->force = conf_load->btf_encode_force; >> encoder->gen_floats = conf_load->btf_gen_floats; >> - encoder->skip_encoding_vars = conf_load->skip_encoding_btf_vars; >> encoder->skip_encoding_decl_tag = conf_load->skip_encoding_btf_decl_tag; >> encoder->tag_kfuncs = conf_load->btf_decl_tag_kfuncs; >> encoder->gen_distilled_base = conf_load->btf_gen_distilled_base; >> @@ -2078,6 +2038,11 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam >> encoder->has_index_type = false; >> encoder->need_index_type = false; >> encoder->array_index_id = 0; >> + encoder->encode_vars = 0; >> + if (!conf_load->skip_encoding_btf_vars) >> + encoder->encode_vars |= BTF_VAR_PERCPU; >> + if (conf_load->encode_btf_global_vars) >> + encoder->encode_vars |= BTF_VAR_GLOBAL; >> >> GElf_Ehdr ehdr; >> >> @@ -2087,8 +2052,6 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam >> goto out_delete; >> } >> >> - encoder->is_rel = ehdr.e_type == ET_REL; >> - >> switch (ehdr.e_ident[EI_DATA]) { >> case ELFDATA2LSB: >> btf__set_endianness(encoder->btf, BTF_LITTLE_ENDIAN); >> @@ -2127,15 +2090,21 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam >> encoder->secinfo[shndx].addr = shdr.sh_addr; >> encoder->secinfo[shndx].sz = shdr.sh_size; >> encoder->secinfo[shndx].name = secname; >> - >> - if (strcmp(secname, PERCPU_SECTION) == 0) >> - encoder->percpu.shndx = shndx; >> + encoder->secinfo[shndx].type = shdr.sh_type; >> + if (encoder->encode_vars & BTF_VAR_GLOBAL) >> + encoder->secinfo[shndx].include = true; >> + >> + if (strcmp(secname, PERCPU_SECTION) == 0) { >> + encoder->percpu_shndx = shndx; >> + if (encoder->encode_vars & BTF_VAR_PERCPU) >> + encoder->secinfo[shndx].include = true; >> + } >> } >> >> - if (!encoder->percpu.shndx && encoder->verbose) >> + if (!encoder->percpu_shndx && encoder->verbose) >> printf("%s: '%s' doesn't have '%s' section\n", __func__, cu->filename, PERCPU_SECTION); >> >> - if (btf_encoder__collect_symbols(encoder, !encoder->skip_encoding_vars)) >> + if (btf_encoder__collect_symbols(encoder)) >> goto out_delete; >> >> if (encoder->verbose) >> @@ -2156,7 +2125,8 @@ void btf_encoder__delete(struct btf_encoder *encoder) >> return; >> >> btf_encoders__delete(encoder); >> - __gobuffer__delete(&encoder->percpu_secinfo); >> + for (int i = 0; i < encoder->seccnt; i++) >> + __gobuffer__delete(&encoder->secinfo[i].secinfo); >> zfree(&encoder->filename); >> zfree(&encoder->source_filename); >> btf__free(encoder->btf); >> @@ -2166,9 +2136,6 @@ void btf_encoder__delete(struct btf_encoder *encoder) >> encoder->functions.allocated = encoder->functions.cnt = 0; >> free(encoder->functions.entries); >> encoder->functions.entries = NULL; >> - encoder->percpu.allocated = encoder->percpu.var_cnt = 0; >> - free(encoder->percpu.vars); >> - encoder->percpu.vars = NULL; >> >> free(encoder); >> } >> @@ -2321,7 +2288,7 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co >> goto out; >> } >> >> - if (!encoder->skip_encoding_vars) >> + if (encoder->encode_vars) >> err = btf_encoder__encode_cu_variables(encoder); >> >> /* It is only safe to delete this CU if we have not stashed any static >> diff --git a/btf_encoder.h b/btf_encoder.h >> index f54c95a..c5424db 100644 >> --- a/btf_encoder.h >> +++ b/btf_encoder.h >> @@ -16,7 +16,15 @@ struct btf; >> struct cu; >> struct list_head; >> >> +enum btf_var_option { >> + BTF_VAR_NONE = 0, >> + BTF_VAR_PERCPU = 1, >> + BTF_VAR_GLOBAL = 2, >> +}; >> +#define BTF_VAR_ALL (BTF_VAR_PERCPU | BTF_VAR_GLOBAL) >> + > > Could also just do > > enum btf_var_option { > BTF_VAR_NONE = 0, > BTF_VAR_PERCPU = 1, > BTF_VAR_GLOBAL = 2, > BTF_VAR_ALL = BTF_VAR_PERCPU | BTF_VAR_GLOBAL, > }; > Either way, BTF_VAR_ALL doesn't seem to be used. Yeah, its main purpose is to illustrate that these are flags, not constants. I'll just put that into a comment and delete BTF_VAR_ALL. >> struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filename, struct btf *base_btf, bool verbose, struct conf_load *conf_load); >> + >> void btf_encoder__delete(struct btf_encoder *encoder); >> >> int btf_encoder__encode(struct btf_encoder *encoder); >> diff --git a/dwarves.h b/dwarves.h >> index 0fede91..fef881f 100644 >> --- a/dwarves.h >> +++ b/dwarves.h >> @@ -92,6 +92,7 @@ struct conf_load { >> bool btf_gen_optimized; >> bool skip_encoding_btf_inconsistent_proto; >> bool skip_encoding_btf_vars; >> + bool encode_btf_global_vars; >> bool btf_gen_floats; >> bool btf_encode_force; >> bool reproducible_build; >> diff --git a/man-pages/pahole.1 b/man-pages/pahole.1 >> index 0a9d8ac..4bc2d03 100644 >> --- a/man-pages/pahole.1 >> +++ b/man-pages/pahole.1 >> @@ -230,7 +230,10 @@ the debugging information. >> >> .TP >> .B \-\-skip_encoding_btf_vars >> -Do not encode VARs in BTF. >> +.TQ >> +.B \-\-encode_btf_global_vars >> +By default, VARs are encoded only for percpu variables. These options allow >> +to skip encoding them, or alternatively to encode all global variables too. >> >> .TP >> .B \-\-skip_encoding_btf_decl_tag >> @@ -296,7 +299,8 @@ Encode BTF using the specified feature list, or specify 'default' for all standa >> encode_force Ignore invalid symbols when encoding BTF; for example >> if a symbol has an invalid name, it will be ignored >> and BTF encoding will continue. >> - var Encode variables using BTF_KIND_VAR in BTF. >> + var Encode percpu variables using BTF_KIND_VAR in BTF. >> + global_var Encode all global variables in the same way. >> float Encode floating-point types in BTF. >> decl_tag Encode declaration tags using BTF_KIND_DECL_TAG. >> type_tag Encode type tags using BTF_KIND_TYPE_TAG. >> diff --git a/pahole.c b/pahole.c >> index a33490d..27b7a7e 100644 >> --- a/pahole.c >> +++ b/pahole.c >> @@ -1239,6 +1239,7 @@ ARGP_PROGRAM_VERSION_HOOK_DEF = dwarves_print_version; >> #define ARGP_contains_enumerator 344 >> #define ARGP_reproducible_build 345 >> #define ARGP_running_kernel_vmlinux 346 >> +#define ARGP_encode_btf_global_vars 347 >> >> /* --btf_features=feature1[,feature2,..] allows us to specify >> * a list of requested BTF features or "default" to enable all default >> @@ -1285,6 +1286,7 @@ struct btf_feature { >> } btf_features[] = { >> BTF_DEFAULT_FEATURE(encode_force, btf_encode_force, false), >> BTF_DEFAULT_FEATURE(var, skip_encoding_btf_vars, true), >> + BTF_DEFAULT_FEATURE(global_var, encode_btf_global_vars, false), >> BTF_DEFAULT_FEATURE(float, btf_gen_floats, false), >> BTF_DEFAULT_FEATURE(decl_tag, skip_encoding_btf_decl_tag, true), >> BTF_DEFAULT_FEATURE(type_tag, skip_encoding_btf_type_tag, true), >> @@ -1720,7 +1722,12 @@ static const struct argp_option pahole__options[] = { >> { >> .name = "skip_encoding_btf_vars", >> .key = ARGP_skip_encoding_btf_vars, >> - .doc = "Do not encode VARs in BTF." >> + .doc = "Do not encode VARs in BTF (default: percpu only)." >> + }, > > you've inherited this, but the double negatives here are really > confusing; in brackets would it be clearer to say "If this option is not > specified, per-cpu variables are encoded only. To encode all global > variables, --encode_btf_global_vars must be specified too". Or something > similar. I've done my best to improve the help text for v2! Thanks, Stephen >> + { >> + .name = "encode_btf_global_vars", >> + .key = ARGP_skip_encoding_btf_vars, >> + .doc = "Encode all global VARs in BTF (default: percpu only)." >> }, >> { >> .name = "btf_encode_force", >> @@ -2047,6 +2054,8 @@ static error_t pahole__options_parser(int key, char *arg, >> show_supported_btf_features(stdout); exit(0); >> case ARGP_btf_features_strict: >> parse_btf_features(arg, true); break; >> + case ARGP_encode_btf_global_vars: >> + conf_load.encode_btf_global_vars = true; break; >> default: >> return ARGP_ERR_UNKNOWN; >> } ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2024-09-20 8:18 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-09-12 19:08 [PATCH dwarves 0/4] Emit global variables in BTF Stephen Brennan 2024-09-12 19:08 ` [PATCH dwarves 1/4] dutil: return ELF section name when looked up by index Stephen Brennan 2024-09-13 13:26 ` Alan Maguire 2024-09-13 17:06 ` Stephen Brennan 2024-09-12 19:08 ` [PATCH dwarves 2/4] dwarf_loader: add "artificial" and "top_level" variable flags Stephen Brennan 2024-09-13 13:35 ` Alan Maguire 2024-09-13 17:16 ` Stephen Brennan 2024-09-12 19:08 ` [PATCH dwarves 3/4] btf_encoder: cache all ELF section info Stephen Brennan 2024-09-13 15:25 ` Alan Maguire 2024-09-13 17:05 ` Stephen Brennan 2024-09-15 11:26 ` Alan Maguire 2024-09-12 19:08 ` [PATCH dwarves 4/4] btf_encoder: add global_var feature to encode globals Stephen Brennan 2024-09-15 11:49 ` Alan Maguire 2024-09-20 8:18 ` Stephen Brennan
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).