From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Stephen Brennan <stephen@brennan.io>
Cc: Yonghong Song <yhs@fb.com>, Shung-Hsi Yu <shung-hsi.yu@suse.com>,
	bpf@vger.kernel.org, Omar Sandoval <osandov@osandov.com>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Stephen Brennan <stephen.s.brennan@oracle.com>
Subject: Re: Question: missing vmlinux BTF variable declarations
Date: Tue, 15 Mar 2022 14:58:59 -0300
Message-ID: <YjDT498PfzFT+kT4@kernel.org>
In-Reply-To: <8735jjw4rp.fsf@brennan.io>

On Tue, Mar 15, 2022 at 09:37:46AM -0700, Stephen Brennan wrote:
> Yonghong Song <yhs@fb.com> writes:
> > On 3/14/22 12:09 AM, Shung-Hsi Yu wrote:
> >> On Wed, Mar 09, 2022 at 03:20:47PM -0800, Stephen Brennan wrote:
> >>> I've been recently learning about BTF with a keen interest in using it
> >>> as a fallback source of debug information. On the face of it, Linux
> >>> kernels these days have a lot of introspection information. BTF provides
> >>> information about types. kallsyms provides information about symbol
> >>> locations. ORC allows us to reliably unwind stack traces. So together,
> >>> these could enable a debugger (either postmortem, or live) to do a lot
> >>> without needing to read the (very large) DWARF debuginfo files. For
> >>> example, we could format backtraces with function names, we could

> > For backtraces with function names, you probably still need ksyms since
> > BTF won't encode address => symbol translation.
 
> Yes, kallsyms is definitely required in this scheme. In practice, it
> seems very common for distributions to be compiled not just with
> CONFIG_KALLSYMS, but CONFIG_KALLSYMS_ALL.
 
> Kallsyms is critical for mapping names to addresses (and vice versa).
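
As a rough illustration of the address => symbol translation discussed above, here is a minimal sketch of the lookup a debugger would do once it has a symbol table sorted by address. The `struct sym` record, `sym_lookup()` and the addresses in `example_syms` are made up for this example; they are not kernel or drgn structures, and real kallsyms data is stored compressed and has to be decoded first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory symbol record; not a kernel structure. */
struct sym {
	uint64_t addr;
	const char *name;
};

/*
 * Return the symbol with the greatest address <= addr, or NULL if addr
 * lies below the first symbol. syms must be sorted by ascending address.
 */
static const struct sym *sym_lookup(const struct sym *syms, size_t n,
				    uint64_t addr)
{
	size_t lo = 0, hi = n;	/* search window is [lo, hi) */

	assert(syms != NULL || n == 0);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (syms[mid].addr <= addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	/* lo is now the number of symbols whose address is <= addr */
	return lo ? &syms[lo - 1] : NULL;
}

/* Made-up names and addresses, for illustration only. */
static const struct sym example_syms[] = {
	{ 0xffffffff81000000ULL, "func_start" },
	{ 0xffffffff81001000ULL, "func_a" },
	{ 0xffffffff81002400ULL, "func_b" },
};
```

A backtrace entry such as 0xffffffff81001010 would resolve to `func_a` plus an offset of 0x10. The reverse direction (name to address) is a plain name match over the same table.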
 
> >>> pretty-print global variables and data structures, etc. This is nice

> > This indeed is a potential use case.
> > We discussed this when adding per-cpu
> > global variables. Ultimately we added only per-cpu global variables
> > since we didn't have a use case or request for other global variables.

> > But I still would like to know beyond this whether you have other needs
> > which BPF may or may not help. It would be good to know since if 
> > ultimately you still need dwarf, then it might be undesirable to
> > add general global variables to BTF.

> I think that kallsyms, BTF, and ORC together will be enough to provide a
> lite debugging experience. Some things will be missing:

> - mapping backtrace addresses to source code lines

So, BTF has provisions for that, and it's present in the eBPF programs,
perf annotate uses it, see tools/perf/util/annotate.c,
symbol__disassemble_bpf(), it goes like:

        struct bpf_prog_linfo *prog_linfo = NULL;

        info_node = perf_env__find_bpf_prog_info(dso->bpf_prog.env,
                                                 dso->bpf_prog.id);
        if (!info_node) {
                ret = SYMBOL_ANNOTATE_ERRNO__BPF_MISSING_BTF;
                goto out;
        }
        info_linear = info_node->info_linear;
        sub_id = dso->bpf_prog.sub_id;

        info.buffer = (void *)(uintptr_t)(info_linear->info.jited_prog_insns);
        info.buffer_length = info_linear->info.jited_prog_len;

        if (info_linear->info.nr_line_info)
                prog_linfo = bpf_prog_linfo__new(&info_linear->info);

                addr = pc + ((u64 *)(uintptr_t)(info_linear->info.jited_ksyms))[sub_id];
                count = disassemble(pc, &info);

                if (prog_linfo)
                        linfo = bpf_prog_linfo__lfind_addr_func(prog_linfo,
                                                                addr, sub_id,
                                                                nr_skip);
                if (linfo && btf) {
                        srcline = btf__name_by_offset(btf, linfo->line_off);
                        nr_skip++;
                } else
                        srcline = NULL;

etc.
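
To make the record that lookup returns a bit more concrete, here is a small sketch of how one line-info entry decodes. `struct line_info_rec` is a local mirror of the four 32-bit fields of the UAPI `struct bpf_line_info`, and the shift and mask match the `BPF_LINE_INFO_LINE_NUM()` and `BPF_LINE_INFO_LINE_COL()` macros. The helper `str_by_offset()` is only a stand-in for `btf__name_by_offset()`, and the string blob and record values are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the UAPI struct bpf_line_info layout (four u32 fields). */
struct line_info_rec {
	uint32_t insn_off;	/* instruction offset the record applies to */
	uint32_t file_name_off;	/* offset of the file name in the string section */
	uint32_t line_off;	/* offset of the source line text in the string section */
	uint32_t line_col;	/* line number in the upper 22 bits, column in the lower 10 */
};

static uint32_t line_num(uint32_t line_col)
{
	return line_col >> 10;
}

static uint32_t line_column(uint32_t line_col)
{
	return line_col & 0x3ff;
}

/*
 * Stand-in for btf__name_by_offset(): the string section is a blob of
 * NUL-terminated strings and an offset simply indexes into it.
 */
static const char *str_by_offset(const char *strs, size_t len, uint32_t off)
{
	return off < len ? strs + off : NULL;
}

/* Invented sample data: offset 1 is "demo.c", offset 8 is the source line. */
static const char example_strs[] = "\0demo.c\0int x = 1;";
static const struct line_info_rec example_rec = {
	.insn_off = 0,
	.file_name_off = 1,
	.line_off = 8,
	.line_col = (42u << 10) | 7u,	/* line 42, column 7 */
};
```

Note that the source line text itself lives in the string section, which is part of why carrying this for the whole kernel would make the BTF data grow.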

Having this for the kernel proper is thus doable, but then we go on
making BTF info grow.

Perhaps having this as optional, distros or appliances wanting to have a
kernel with this extra info would add it and then tools would use it if
available?

> - intelligent stack frame information from DWARF CFI (e.g.
>   register/variable values)
> - probably other things, I'm not a DWARF expert.
 
> However, I do have two interesting branches of drgn which demonstrate
> the utility of just BTF+kallsyms:
 
> 1. https://github.com/osandov/drgn/pull/162
> 2. https://github.com/brenns10/drgn/tree/kallsyms_plus_btf
 
> #1 adds preliminary BTF support, and #2 adds basic kallsyms support,
> building on #1. Finally, I have some unpublished patches which add some
> symbols into vmcoreinfo, which help us locate kallsyms info. From there,
> drgn is able to take a core dump, and lookup symbols and get their
> corresponding type info!
 
> The only real blocker I see here is that the BTF data is mainly limited
> to functions, so most of what you're doing is looking up function names
> and viewing their signatures :)
 
> >>> given that depending on your distro, it might be tough to get debuginfo,
> >>> and it is quite large to download or install.
> >>>
> >>> As I've worked toward this goal, I discovered that while the
> >>> BTF_KIND_VAR exists [1], the BTF included in the core kernel only has
> >>> declarations for percpu variables. This makes BTF much less useful for
> >>> this (admittedly odd) use case. Without a way to bind a name found in
> >>> kallsyms to its type, we can't interpret global variables. It looks like
> >>> the restriction for percpu-only variables is baked into the pahole BTF
> >>> encoder [2].

> >>> [1]: https://www.kernel.org/doc/html/latest/bpf/btf.html#btf-kind-var
> >>> [2]: https://github.com/acmel/dwarves/blob/master/btf_encoder.c

> >>> I wonder what the BPF / BTF community's thoughts are on including more
> >>> of these global variable declarations? Perhaps behind a
> >>> CONFIG_DEBUG_INFO_BTF_ALL, like how kallsyms does it? I'm aware that

> > Currently on my local machine, the vmlinux BTF's size is 4.2MB and
> > adding 1MB would be a big increase. CONFIG_DEBUG_INFO_BTF_ALL is a good
> > idea. But we might be able to just add global variables without this
> > new config if we have a strong use case.
 
> And unfortunately 1MiB is really just a shot in the dark, guessing
> around 70k variables with no string data.
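
The arithmetic behind that guess can be sketched as follows. A BTF_KIND_VAR declaration is a 12-byte `struct btf_type` followed by a 4-byte `struct btf_var`; the two structs below are local mirrors of those UAPI layouts, and `var_records_bytes()` is a helper invented for this example. The 70k count is the guess from the message above, not a measurement, and the result is a lower bound that ignores strings, any section bookkeeping, and any types not already present.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirrors of the layouts in include/uapi/linux/btf.h. */
struct btf_type_rec {
	uint32_t name_off;
	uint32_t info;
	uint32_t size_or_type;	/* a union of size and type in the real header */
};

struct btf_var_rec {
	uint32_t linkage;
};

/* Lower bound: one type record plus one var record per variable. */
static uint64_t var_records_bytes(uint64_t nr_vars)
{
	return nr_vars * (sizeof(struct btf_type_rec) +
			  sizeof(struct btf_var_rec));
}
```

With 70,000 variables this gives 70,000 * 16 = 1,120,000 bytes, or about 1.07 MiB.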

Maybe we can have a separate BTF file with all this extra info that
could be fetched from somewhere, keyed by build-id, like is now possible
with debuginfod and DWARF?
 
> I'd love to use kallsyms to avoid adding new strings into BTF. If the
> "all variables BTF" config added a dependency on "CONFIG_KALLSYMS_ALL",
> then we could use the BTF "kind_flag" to indicate that string values
> should be looked up in the kallsyms table, not the BTF strings section.
> This could even be used to reduce the string footprint for BTF
> function names.
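
A sketch of what that proposal could look like on the consumer side. The bit positions in the `info_*()` helpers match the documented layout of `btf_type.info` (vlen in bits 0-15, kind in bits 24-28, kind_flag in bit 31), and 14 is the value of BTF_KIND_VAR. Everything else is hypothetical: BTF does not define this meaning of kind_flag for variables today, `resolve_name()` is an invented helper, and the flat `ksym_strs` blob is a simplification, since the real kallsyms name table is compressed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KIND_VAR 14	/* same value as BTF_KIND_VAR */

static uint32_t info_vlen(uint32_t info)
{
	return info & 0xffff;
}

static uint32_t info_kind(uint32_t info)
{
	return (info >> 24) & 0x1f;
}

static int info_kflag(uint32_t info)
{
	return (int)(info >> 31);
}

/*
 * HYPOTHETICAL: if kind_flag is set, treat name_off as an offset into a
 * kallsyms-derived name blob instead of the BTF string section.
 * Returns NULL when the offset is out of range.
 */
static const char *resolve_name(uint32_t info, uint32_t name_off,
				const char *btf_strs, size_t btf_len,
				const char *ksym_strs, size_t ksym_len)
{
	const char *strs = info_kflag(info) ? ksym_strs : btf_strs;
	size_t len = info_kflag(info) ? ksym_len : btf_len;

	return name_off < len ? strs + name_off : NULL;
}

/* Invented sample blobs; offset 1 holds the first name in each. */
static const char example_btf_strs[] = "\0name_in_btf";
static const char example_ksym_strs[] = "\0name_in_kallsyms";
```

The cost noted in the thread still applies: the encoder would need access to the final kallsyms layout at BTF generation time, which is the more complex change to dwarves.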
 
> Of course it's a more complex change to dwarves :(
 
> >>> each declaration costs at least 16 bytes of BTF records, plus the
> >>> strings and any necessary type data. The string cost could be mitigated
> >>> by allowing "name_off" to refer to the kallsyms offset for variable or
> >>> function declaration. But the additional records could cost around 1MiB
> >>> for common distribution configurations.
> >>>
> >>> I know this isn't the designed use case for BTF, but I think it's very
> >>> exciting.
> >> 
> >> I've been wondering about the same (possibility of using BTF for postmortem
> >> debugging without debuginfo), though not to the extent that you've
> >> researched.
> >> 
> >> I find the idea exciting as well, and quite useful for distros where the
> >> kernel package changes so often that the debuginfo package may be long
> >> gone by the time a crash dump for such kernel is captured.
> >
> > I would love to use BTF (including global variables in BTF) for crash 
> > dump. But I suspect we may still have some gaps. Maybe you can
> > explore a little bit more on this?
> 
> Hopefully my above explanation gives more context here. There is code
> (not production-ready) which can make use of these features together.
> The next step for me has been trying to get the dwarves/pahole BTF
> encoder to output *all* functions but I've hit some issues with it. If I
> can get that to work, then I can present a full demo of these pieces
> working together and we can be confident that there are no gaps.
> 
> Maybe this is a topic worth discussing at LSF/MM/BPF conference? Though
> it's quite late for that...
> 
> Thanks,
> Stephen
> 
> >
> >> 
> >> Shung-Hsi
> >> 
> >>> Thanks for your attention!
> >>> Stephen
> >> 

-- 

- Arnaldo
