BPF List
From: Stephen Brennan <stephen.s.brennan@oracle.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>, Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	dwarves@vger.kernel.org, bpf <bpf@vger.kernel.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Alan Maguire <alan.maguire@oracle.com>
Subject: Re: [PATCH dwarves 0/7] Add support for generating BTF for all variables
Date: Fri, 09 Sep 2022 12:31:24 -0700
Message-ID: <87sfl0l4z7.fsf@oracle.com>
In-Reply-To: <CAADnVQJCQdL4j1FFdSE=K6mUaoVGJkcVK-xzgJ_5MSvb2tEkFw@mail.gmail.com>

>> (a) While we save space on vmlinux BTF, each module will have a bit of
>>     extra data for variable types. On my laptop (5.15 based) I have 9.8
>>     MB of BTF, and if you deduct vmlinux, you're still left with 4.7 MB.
>>     If we assume the same overhead of 23.7%, that would be 1.1 MB of
>>     extra module BTF for my particular use case.
>>
>>     $ ls -l /sys/kernel/btf | awk '{sum += $5} END {print(sum)}'
>>     9876871
>>     $ ls -l /sys/kernel/btf/vmlinux
>>     -r--r--r-- 1 root root 5174406 Sep  7 14:20 /sys/kernel/btf/vmlinux
>>
>> (b) It's possible for "vmlinux-btf-extras" and "$MODULE" to contain
>>     duplicate type definitions, wasting additional space. However, as
>>     far as I understand it, this was already a possibility, e.g.
>>     $MODULE1 and $MODULE2 could already contain duplicate types. So I
>>     think this downside is no more.
>
> Both concerns are valid, but I'm a bit puzzled with (a).
> At least in the networking drivers the number of global vars is very small.
> I expected other drivers to be similar.
> So having "functions and all vars" in ko-s should not add
> that much overhead.
>
> Maybe you're seeing this overhead because pahole is adding
> all declared vars and not only the vars that are actually present?
> That would explain the discrepancy.
> (b) with a bunch of duplicates is a sign that something is off as well.

Sorry, I didn't actually have an analysis for module BTF, I was just
extrapolating the result I had seen for vmlinux. I went ahead and did a
proper test, generating BTF for a distribution kernel from Oracle Linux
(kernel-uek-5.15.0-1.43.4.1.el9uek.x86_64) - something that I easily had
on hand and could regenerate the BTF for quickly.

Basically, the steps were:

    pahole -J vmlinux --btf_encode_detached=vmlinux.btf
    pahole -J vmlinux --btf_encode_detached=vmlinux.btf.all \
           --encode_all_btf_vars

    # For each module
    pahole -J $MODULE --btf_encode_detached=$MODULE.btf \
           --btf_base=vmlinux.btf
    pahole -J $MODULE --btf_encode_detached=$MODULE.btf.all \
           --btf_base=vmlinux.btf --encode_all_btf_vars

    # what if we based the module BTF on the "vmlinux.btf.all" instead?
    pahole -J $MODULE --btf_encode_detached=$MODULE.btf.all.all \
           --btf_base=vmlinux.btf.all --encode_all_btf_vars
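
For reference, the per-module step could be scripted roughly as below.
This is only a sketch: the encode_module_btf function name, the PAHOLE
override and the ./modules directory are made up for illustration. The
pahole options are exactly the ones shown in the steps above, with
--encode_all_btf_vars being the option this series adds.

```shell
# Sketch only: encode_module_btf and PAHOLE are hypothetical names.
# PAHOLE can be overridden (e.g. PAHOLE=echo) to get a dry run.
PAHOLE=${PAHOLE:-pahole}

encode_module_btf() {
    mod=$1
    # Baseline: module BTF on top of the regular vmlinux BTF.
    "$PAHOLE" -J "$mod" --btf_encode_detached="$mod.btf" \
        --btf_base=vmlinux.btf
    # Same base, but with all variables encoded.
    "$PAHOLE" -J "$mod" --btf_encode_detached="$mod.btf.all" \
        --btf_base=vmlinux.btf --encode_all_btf_vars
    # Experiment: base the module BTF on vmlinux.btf.all instead.
    "$PAHOLE" -J "$mod" --btf_encode_detached="$mod.btf.all.all" \
        --btf_base=vmlinux.btf.all --encode_all_btf_vars
}

# Example (assumes the modules were copied to ./modules):
#   for m in modules/*.ko; do encode_module_btf "$m"; done
```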

I then used ls/awk to sum up the sizes of the BTF files. Results are:

vmlinux:

-rw-r-----. 1 opc opc 4904193 Sep  9 18:58 vmlinux.btf
-rw-r-----. 1 opc opc 6534684 Sep  9 18:58 vmlinux.btf.all

In this case there's a 33% increase in BTF size.
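
A quick way to double-check that percentage from the two file sizes
listed above. The pct_increase helper is a made-up name, used only for
this illustration:

```shell
# pct_increase BASE FULL: print the growth from BASE to FULL in percent.
pct_increase() {
    awk -v base="$1" -v full="$2" \
        'BEGIN { printf "%.1f\n", (full - base) * 100 / base }'
}

# vmlinux.btf vs. vmlinux.btf.all, sizes taken from the ls output above
pct_increase 4904193 6534684    # prints 33.2
```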

modules:

$ ls -l *.btf | awk '{sum += $5} END {print(sum)}'
43979532
$ ls -l *.btf.all | awk '{sum += $5} END {print(sum)}'
44757792
$ ls -l *.btf.all.all | awk '{sum += $5} END {print(sum)}'
44696639

So the "*.btf.all.all" modules were just an experiment to see if the
extra data inside "vmlinux.btf.all" could reduce some duplication in
module BTF. The answer was yes, but not enough to make up for the
increase in the vmlinux BTF size.
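
To put numbers on that comparison, here is a small sketch using only
the sizes listed above. The variable names are illustrative, and the
module totals cover every module on disk, not just the ones that happen
to be loaded:

```shell
# Extra bytes in vmlinux BTF when all variables are encoded.
vmlinux_growth=$((6534684 - 4904193))     # vmlinux.btf.all - vmlinux.btf
# Bytes saved across all modules by basing them on vmlinux.btf.all.
module_savings=$((44757792 - 44696639))   # *.btf.all - *.btf.all.all
# Net cost of the "all.all" layout compared to the "*.btf.all" layout.
net=$((vmlinux_growth - module_savings))

echo "vmlinux growth: $vmlinux_growth bytes"
echo "module savings: $module_savings bytes"
echo "net increase:   $net bytes"
```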

The "*.btf.all" modules are the ones we would actually expect to use in
Option #1, where we have a vmlinux-btf-extras and the rest of the
modules include their globals in their BTF sections directly, and are
based off of the vmlinux BTF. This test shows that, summed over all
modules, the module BTF size would grow by about 1.8% with Option #1
(43979532 -> 44757792 bytes). Of course, the exact amount of memory that
accounts for will vary by workload, depending on how many modules are
loaded. But assuming you have around 5MB of module BTF *actually
loaded*, the overhead would be around 88k bytes. I don't know how you
feel, but I think that sounds acceptable: it's just 22 pages at 4k
size :)
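
A rough way to recompute this estimate from the summed module sizes
above. The overhead_bytes helper is a hypothetical name, and the 5MB of
loaded module BTF is the same assumption as in the paragraph above,
taken here as 5000000 bytes:

```shell
# overhead_bytes LOADED BASE FULL: scale LOADED bytes of module BTF by the
# relative growth from BASE to FULL, rounded to the nearest byte.
overhead_bytes() {
    awk -v loaded="$1" -v base="$2" -v full="$3" \
        'BEGIN { printf "%.0f\n", loaded * (full - base) / base }'
}

# 5000000 bytes loaded; sums of *.btf and *.btf.all from above
bytes=$(overhead_bytes 5000000 43979532 44757792)
pages=$(( (bytes + 4095) / 4096 ))   # round up to whole 4k pages
echo "$bytes bytes, $pages pages"
```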

Let me know how it sounds to you.

Thanks,
Stephen

>>
>>
>> Option #2
>> ---------
>>
>> * The vmlinux-btf-extra module is still added as in Option #1.
>>
>> * Further, each module would have its own "$MODULE-btf-extra" module to
>>   add in extra BTF. These would be built with a --btf_base=$MODULE.ko
>>   and of course that BTF is based on vmlinux, so we would have:
>>
>>   vmlinux_btf              [ functions and percpu vars only ]
>>   |- vmlinux-btf-extras    [ all other vars for vmlinux ]
>>   |- $MODULE               [ functions and percpu vars only ]
>>      |- $MODULE-btf-extra  [ all  other vars for $MODULE ]
>>
>> This is much more complex, pahole must be extended to support a
>> hierarchy of --btf_base files. The kernel itself may not need to
>> understand multi-level BTF since there's no requirement that it actually
>> understand $MODULE-btf-extra, so long as it exposes it via
>> /sys/kernel/btf/$MODULE-btf-extra. I'd also like to see some sort of
>> mechanism to allow an administrator to say "please always load
>> $MODULE-btf-extras alongside $MODULE", but I think that would be a
>> userspace problem.
>>
>> This resolves issue (a) from option #1, of course at implementation
>> cost.
>>
>> Regardless of Option #1 or #2, I'd propose that we implement this as a
>> tristate, similar to what Alan proposed [2]. When set to "m" we use the
>> solutions described above, and when set to "y", we don't bother with it,
>> instead using --encode_all_btf_vars for all generation.
>>
>> If we go with Option #1, no changes to this series should be necessary.
>> If we go with Option #2, I'll need to extend pahole to support at least
>> two BTF base files. Please let me know your thoughts.
>
> Completely agree that two level btf-extra needs quite a bit more work.
> Before we proceed with option 2 let's figure out
> the reason for extra space in option 1.
