From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Stephen Brennan <stephen@brennan.io>
Cc: Yonghong Song <yhs@fb.com>, Shung-Hsi Yu <shung-hsi.yu@suse.com>,
bpf@vger.kernel.org, Omar Sandoval <osandov@osandov.com>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
Stephen Brennan <stephen.s.brennan@oracle.com>
Subject: Re: Question: missing vmlinux BTF variable declarations
Date: Tue, 15 Mar 2022 14:58:59 -0300
Message-ID: <YjDT498PfzFT+kT4@kernel.org>
In-Reply-To: <8735jjw4rp.fsf@brennan.io>
On Tue, Mar 15, 2022 at 09:37:46AM -0700, Stephen Brennan wrote:
> Yonghong Song <yhs@fb.com> writes:
> > On 3/14/22 12:09 AM, Shung-Hsi Yu wrote:
> >> On Wed, Mar 09, 2022 at 03:20:47PM -0800, Stephen Brennan wrote:
> >>> I've been recently learning about BTF with a keen interest in using it
> >>> as a fallback source of debug information. On the face of it, Linux
> >>> kernels these days have a lot of introspection information. BTF provides
> >>> information about types. kallsyms provides information about symbol
> >>> locations. ORC allows us to reliably unwind stack traces. So together,
> >>> these could enable a debugger (either postmortem, or live) to do a lot
> >>> without needing to read the (very large) DWARF debuginfo files. For
> >>> example, we could format backtraces with function names, we could
> > For backtraces with function names, you probably still need ksyms since
> > BTF won't encode address => symbol translation.
> Yes, kallsyms is definitely required in this scheme. In practice, it
> seems very common for distributions to be compiled not just with
> CONFIG_KALLSYMS, but CONFIG_KALLSYMS_ALL.
> Kallsyms is critical for mapping names to addresses (and vice versa).
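To make the address => symbol step concrete, here is a minimal sketch of how a tool could turn a backtrace address into "name+0xoff" once it has the kallsyms data decoded into a sorted array. The struct ksym layout and both function names are made up for illustration; the kernel's real kallsyms tables are compressed and look nothing like this.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical decoded kallsyms entry, not the kernel's in-memory layout. */
struct ksym {
	uint64_t addr;
	const char *name;
};

/*
 * Return the symbol with the greatest address <= addr, or NULL if addr is
 * below the first symbol. syms must be sorted by ascending address.
 */
const struct ksym *ksym_lookup(const struct ksym *syms, size_t n, uint64_t addr)
{
	size_t lo = 0, hi = n;

	/* Invariant: syms[0..lo) are <= addr, syms[hi..n) are > addr. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (syms[mid].addr <= addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo ? &syms[lo - 1] : NULL;
}

/* Format one backtrace frame; falls back to the raw address if unresolved. */
int format_frame(const struct ksym *syms, size_t n, uint64_t addr,
		 char *buf, size_t len)
{
	const struct ksym *s = ksym_lookup(syms, n, addr);

	if (!s)
		return snprintf(buf, len, "0x%" PRIx64, addr);
	return snprintf(buf, len, "%s+0x%" PRIx64, s->name, addr - s->addr);
}
```

Nothing here needs BTF, which is the point made above: the name lookup comes entirely from kallsyms.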
> >>> pretty-print global variables and data structures, etc. This is nice
> > This indeed is a potential use case.
> > We discussed this during adding per-cpu
> > global variables. Ultimately we just added per-cpu global variables
> > since we didn't have a use case or request for other global variables.
> > But I still would like to know beyond this whether you have other needs
> > which BPF may or may not help. It would be good to know since if
> > ultimately you still need dwarf, then it might be undesirable to
> > add general global variables to BTF.
> I think that kallsyms, BTF, and ORC together will be enough to provide a
> lite debugging experience. Some things will be missing:
> - mapping backtrace addresses to source code lines
So, BTF has provisions for that, and it's present in eBPF programs;
perf annotate uses it. See symbol__disassemble_bpf() in
tools/perf/util/annotate.c, which goes like:
	struct bpf_prog_linfo *prog_linfo = NULL;

	info_node = perf_env__find_bpf_prog_info(dso->bpf_prog.env,
						 dso->bpf_prog.id);
	if (!info_node) {
		ret = SYMBOL_ANNOTATE_ERRNO__BPF_MISSING_BTF;
		goto out;
	}
	info_linear = info_node->info_linear;
	sub_id = dso->bpf_prog.sub_id;

	info.buffer = (void *)(uintptr_t)(info_linear->info.jited_prog_insns);
	info.buffer_length = info_linear->info.jited_prog_len;

	if (info_linear->info.nr_line_info)
		prog_linfo = bpf_prog_linfo__new(&info_linear->info);

	addr = pc + ((u64 *)(uintptr_t)(info_linear->info.jited_ksyms))[sub_id];
	count = disassemble(pc, &info);

	if (prog_linfo)
		linfo = bpf_prog_linfo__lfind_addr_func(prog_linfo,
							addr, sub_id,
							nr_skip);

	if (linfo && btf) {
		srcline = btf__name_by_offset(btf, linfo->line_off);
		nr_skip++;
	} else
		srcline = NULL;
etc.
Having this for the kernel proper is thus doable, but it would make the
BTF info grow further.
Perhaps this could be optional: distros or appliances that want a kernel
with this extra info would add it, and tools would use it if
available?
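If something like that were added for vmlinux, the lookup on the tool side could be as simple as the sketch below. It assumes a hypothetical table of (address, string offset) records sorted by address, loosely modelled on struct bpf_line_info, with the source line text stored in a string section the way line_off is resolved in the excerpt above. None of these names exist in the kernel or libbpf.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: code address plus offset of the source line text. */
struct kline_info {
	uint64_t addr;
	uint32_t line_off;
};

/*
 * Return the source line text for the last record whose address is <= addr,
 * or NULL if there is none or its offset falls outside the string section.
 * li must be sorted by ascending address.
 */
const char *srcline_for_addr(const struct kline_info *li, size_t n,
			     const char *strs, size_t strs_len, uint64_t addr)
{
	const struct kline_info *best = NULL;
	size_t i;

	/* Linear scan: keep the last record that does not pass addr. */
	for (i = 0; i < n && li[i].addr <= addr; i++)
		best = &li[i];

	if (!best || best->line_off >= strs_len)
		return NULL;
	return strs + best->line_off;
}
```

The string section is what would grow the most here, since every distinct source line costs its full text.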
> - intelligent stack frame information from DWARF CFI (e.g.
> register/variable values)
> - probably other things, I'm not a DWARF expert.
> However, I do have two interesting branches of drgn which demonstrate
> the utility of just BTF+kallsyms:
> 1. https://github.com/osandov/drgn/pull/162
> 2. https://github.com/brenns10/drgn/tree/kallsyms_plus_btf
> #1 adds preliminary BTF support, and #2 adds basic kallsyms support,
> building on #1. Finally, I have some unpublished patches which add some
> symbols into vmcoreinfo, which help us locate kallsyms info. From there,
> drgn is able to take a core dump, look up symbols, and get their
> corresponding type info!
> The only real blocker I see here is that the BTF data is mainly limited
> to functions, so most of what you're doing is looking up function names
> and viewing their signatures :)
> >>> given that depending on your distro, it might be tough to get debuginfo,
> >>> and it is quite large to download or install.
> >>>
> >>> As I've worked toward this goal, I discovered that while the
> >>> BTF_KIND_VAR exists [1], the BTF included in the core kernel only has
> >>> declarations for percpu variables. This makes BTF much less useful for
> >>> this (admittedly odd) use case. Without a way to bind a name found in
> >>> kallsyms to its type, we can't interpret global variables. It looks like
> >>> the restriction for percpu-only variables is baked into the pahole BTF
> >>> encoder [2].
> >>> [1]: https://www.kernel.org/doc/html/latest/bpf/btf.html#btf-kind-var
> >>> [2]: https://github.com/acmel/dwarves/blob/master/btf_encoder.c
> >>> I wonder what the BPF / BTF community's thoughts are on including more
> >>> of these global variable declarations? Perhaps behind a
> >>> CONFIG_DEBUG_INFO_BTF_ALL, like how kallsyms does it? I'm aware that
> > Currently on my local machine, the vmlinux BTF's size is 4.2MB and
> > adding 1MB would be a big increase. CONFIG_DEBUG_INFO_BTF_ALL is a good
> > idea. But we might be able to just add global variables without this
> > new config if we have a strong use case.
> And unfortunately 1MiB is really just a shot in the dark, guessing
> around 70k variables with no string data.
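As a rough check on that guess: a BTF_KIND_VAR declaration is a struct btf_type (12 bytes) followed by a struct btf_var (4 bytes), which is where the 16 bytes per declaration mentioned in the original message comes from. The sketch below just does that arithmetic with local mirrors of the two layouts (the _rec names are mine, so that it builds without kernel headers); strings and any newly needed types are not counted.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct btf_type from <linux/btf.h>. */
struct btf_type_rec {
	uint32_t name_off;
	uint32_t info;
	uint32_t size_or_type;
};

/* Local mirror of struct btf_var, which trails a BTF_KIND_VAR btf_type. */
struct btf_var_rec {
	uint32_t linkage;
};

/* Bytes of BTF records needed for nr_vars variable declarations. */
uint64_t btf_var_records_bytes(uint64_t nr_vars)
{
	return nr_vars * (sizeof(struct btf_type_rec) +
			  sizeof(struct btf_var_rec));
}
```

With the 70k figure this gives 70000 * 16 = 1,120,000 bytes, about 1.07 MiB, so the estimate holds as long as most of the variable types are already in BTF.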
Maybe we can have a separate BTF file with all this extra info that
could be fetched from somewhere, keyed by build-id, like is now possible
with debuginfod and DWARF?
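For the sake of discussion, a tool could look for such a file the same way separate debuginfo is found under .build-id/xx/yyyy.debug. The sketch below builds that kind of path from a build-id; the ".btf" suffix and the function name are pure assumptions, nothing ships files like this today.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "<root>/.build-id/<first byte>/<remaining bytes>.btf" into out.
 * Returns 0 on success, -1 if the build-id is too short or out is too small.
 * The ".btf" suffix is a hypothetical convention.
 */
int build_id_btf_path(const char *root, const unsigned char *id, size_t id_len,
		      char *out, size_t out_len)
{
	size_t i, pos;
	int n;

	if (id_len < 2)
		return -1;

	n = snprintf(out, out_len, "%s/.build-id/%02x/", root, (unsigned)id[0]);
	if (n < 0 || (size_t)n >= out_len)
		return -1;
	pos = (size_t)n;

	/* Append the rest of the build-id as lowercase hex. */
	for (i = 1; i < id_len; i++) {
		n = snprintf(out + pos, out_len - pos, "%02x", (unsigned)id[i]);
		if (n < 0 || (size_t)n >= out_len - pos)
			return -1;
		pos += (size_t)n;
	}

	if (out_len - pos < sizeof(".btf"))
		return -1;
	memcpy(out + pos, ".btf", sizeof(".btf")); /* copies the NUL too */
	return 0;
}
```

A debuginfod-style server could key on the same hex string, so the kernel image itself would not need to carry the extra data.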
> I'd love to use kallsyms to avoid adding new strings into BTF. If the
> "all variables BTF" config added a dependency on "CONFIG_KALLSYMS_ALL",
> then we could use the BTF "kind_flag" to indicate that string values
> should be looked up in the kallsyms table, not the BTF strings section.
> This could even be used to reduce the string footprint for BTF
> function names.
> Of course it's a more complex change to dwarves :(
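To see what that would mean for a BTF consumer, here is a minimal sketch of the name lookup with such a flag. kind_flag is bit 31 of btf_type.info, that part is real; everything else is assumed for illustration, in particular the idea that kallsyms names are available as a flat NUL-separated table that name_off can index (the real table is token-compressed, so some other key, such as a symbol index, would probably be needed).

```c
#include <stddef.h>
#include <stdint.h>

/* kind_flag is bit 31 of btf_type.info, as in <linux/btf.h>. */
#define BTF_INFO_KFLAG(info) ((info) >> 31)

/* Local mirror of struct btf_type. */
struct btf_type_rec {
	uint32_t name_off;
	uint32_t info;
	uint32_t size_or_type;
};

/* Hypothetical pair of name sources available to the consumer. */
struct name_tables {
	const char *btf_strs;	/* BTF string section */
	size_t btf_strs_len;
	const char *ksym_names;	/* assumed: flat, NUL-separated kallsyms names */
	size_t ksym_names_len;
};

/*
 * Resolve a record's name. With kind_flag set, name_off is taken to index
 * the kallsyms names instead of the BTF strings (the proposed meaning, not
 * current BTF semantics). Returns NULL if the offset is out of range.
 */
const char *btf_rec_name(const struct name_tables *nt,
			 const struct btf_type_rec *t)
{
	if (BTF_INFO_KFLAG(t->info))
		return t->name_off < nt->ksym_names_len ?
		       nt->ksym_names + t->name_off : NULL;
	return t->name_off < nt->btf_strs_len ?
	       nt->btf_strs + t->name_off : NULL;
}
```

Existing consumers that ignore kind_flag on these records would read a bogus name, so this would need to be gated on the new config.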
> >>> each declaration costs at least 16 bytes of BTF records, plus the
> >>> strings and any necessary type data. The string cost could be mitigated
> >>> by allowing "name_off" to refer to the kallsyms offset for variable or
> >>> function declaration. But the additional records could cost around 1MiB
> >>> for common distribution configurations.
> >>>
> >>> I know this isn't the designed use case for BTF, but I think it's very
> >>> exciting.
> >>
> >> I've been wondering about the same (possibility of using BTF for postmortem
> >> debugging without debuginfo), though not to the extent that you've
> >> researched.
> >>
> >> I find the idea exciting as well, and quite useful for distros where the
> >> kernel package changes so often that the debuginfo package may be long
> >> gone by the time a crash dump for such a kernel is captured.
> >
> > I would love to use BTF (including global variables in BTF) for crash
> > dump. But I suspect we may still have some gaps. Maybe you can
> > explore a little bit more on this?
>
> Hopefully my above explanation gives more context here. There is code
> (not production-ready) which can make use of these features together.
> The next step for me has been trying to get the dwarves/pahole BTF
> encoder to output *all* functions but I've hit some issues with it. If I
> can get that to work, then I can present a full demo of these pieces
> working together and we can be confident that there are no gaps.
>
> Maybe this is a topic worth discussing at LSF/MM/BPF conference? Though
> it's quite late for that...
>
> Thanks,
> Stephen
>
> >
> >>
> >> Shung-Hsi
> >>
> >>> Thanks for your attention!
> >>> Stephen
> >>
--
- Arnaldo