From: Yonghong Song <yhs@fb.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Andrii Nakryiko <andriin@fb.com>,
"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
Alexei Starovoitov <ast@fb.com>,
"daniel@iogearbox.net" <daniel@iogearbox.net>,
Kernel Team <Kernel-team@fb.com>,
Masahiro Yamada <yamada.masahiro@socionext.com>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
Jiri Olsa <jolsa@kernel.org>, Sam Ravnborg <sam@ravnborg.org>
Subject: Re: [PATCH v2 bpf-next] btf: expose BTF info through sysfs
Date: Thu, 8 Aug 2019 20:21:02 +0000
Message-ID: <9cde90e5-2831-0195-748c-b3325cbe1a1e@fb.com>
In-Reply-To: <CAEf4BzYAZ7x+PY0t90ty9RVSm1FSmc9XqY216DtJCA-giK3fUg@mail.gmail.com>

On 8/8/19 10:47 AM, Andrii Nakryiko wrote:
> On Wed, Aug 7, 2019 at 9:24 PM Yonghong Song <yhs@fb.com> wrote:
>>
>>
>>
>> On 8/7/19 5:32 PM, Andrii Nakryiko wrote:
>>> Make .BTF section allocated and expose its contents through sysfs.
>>>
>>> /sys/kernel/btf directory is created to contain all the BTFs present
>>> inside kernel. Currently there is only kernel's main BTF, represented as
>>> /sys/kernel/btf/kernel file. Once kernel modules' BTFs are supported,
>>> each module will expose its BTF as /sys/kernel/btf/<module-name> file.
>>>
>>> Current approach relies on a few pieces coming together:
>>> 1. pahole is used to take almost final vmlinux image (modulo .BTF and
>>> kallsyms) and generate .BTF section by converting DWARF info into
>>> BTF. This section is not allocated and not mapped to any segment,
>>> though, so is not yet accessible from inside kernel at runtime.
>>> 2. objcopy dumps .BTF contents into binary file and subsequently
>>> converts binary file into linkable object file with automatically
>>> generated symbols _binary__btf_kernel_bin_start and
>>> _binary__btf_kernel_bin_end, pointing to start and end, respectively,
>>> of BTF raw data.
>>> 3. final vmlinux image is generated by linking this object file (and
>>> kallsyms, if necessary). sysfs_btf.c then creates
>>> /sys/kernel/btf/kernel file and exposes embedded BTF contents through
>>> it. This allows, e.g., libbpf and bpftool to access BTF info at
>>> well-known location, without resorting to searching for vmlinux image
>>> on disk (location of which is not standardized and vmlinux image
>>> might not be even available in some scenarios, e.g., inside qemu
>>> during testing).
>>>
>>> Alternative approach using .incbin assembler directive to embed BTF
>>> contents directly was attempted but didn't work, because sysfs_btf.o is
>>> not re-compiled during link-vmlinux.sh stage. This is required, though,
>>> to update embedded BTF data (initially empty data is embedded, then
>>> pahole generates BTF info and we need to regenerate sysfs_btf.o with
>>> updated contents, but it's too late at that point).
>>>
>>> If BTF couldn't be generated due to missing or too old pahole,
>>> sysfs_btf.c handles that gracefully by detecting that
>>> _binary__btf_kernel_bin_start (weak symbol) is 0 and not creating
>>> /sys/kernel/btf at all.
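
For illustration only, a minimal userspace sketch of how a consumer might
cope with the file being absent. btf_magic_ok is a hypothetical helper name,
not something from the patch. The only facts it relies on are the sysfs path
named above and the BTF header magic 0xeB9F, which is stored in native byte
order, so both byte orders are accepted. The fallback message is a
placeholder, not what libbpf or bpftool actually print.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): succeed if FILE is readable
# and starts with the BTF magic 0xeB9F in either byte order.
btf_magic_ok()
{
	[ -r "$1" ] || return 1
	# First two bytes as hex, whitespace stripped: "9feb" on little-endian.
	magic=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
	[ "$magic" = "9feb" ] || [ "$magic" = "eb9f" ]
}

# Path from the patch description; it does not exist when BTF generation
# was skipped (missing or too old pahole).
btf_path=${1:-/sys/kernel/btf/kernel}
if btf_magic_ok "$btf_path"; then
	echo "kernel BTF available at $btf_path"
else
	echo "no kernel BTF at $btf_path, falling back to vmlinux on disk"
fi
```
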
>>>
>>> v1->v2:
>>> - allow kallsyms stage to re-use vmlinux generated by gen_btf();
>>>
>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>>> Cc: Jiri Olsa <jolsa@kernel.org>
>>> Cc: Sam Ravnborg <sam@ravnborg.org>
>>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>>> ---
>
> [...]
>
>>> +
>>> + # dump .BTF section into raw binary file to link with final vmlinux
>>> + bin_arch=$(${OBJDUMP} -f ${1} | grep architecture | \
>>> + cut -d, -f1 | cut -d' ' -f2)
>>> + ${OBJCOPY} --dump-section .BTF=.btf.kernel.bin ${1} 2>/dev/null
>>> + ${OBJCOPY} -I binary -O ${CONFIG_OUTPUT_FORMAT} -B ${bin_arch} \
>>> + --rename-section .data=.BTF .btf.kernel.bin ${2}
>>
>> Currently, the binary size on my config is about 2.6MB. Do you think
>> we could, or need to, compress it to make it smaller? I tried gzip
>> and the compressed size is 0.9MB.
>
> I'd really prefer to keep it uncompressed for two main reasons:
> - by having this in uncompressed form, kernel itself can use this BTF
> data from inside with almost no additional memory (except maybe for
> index from type ID to actual location of type info), which opens up a
> lot of new and interesting opportunities, like kernel returning its
> own BTF and BTF type ID for various types (think about driver metadata,
> all those special maps, etc).
> - if we are doing compression, now we need to decide on best
> compression format, teach it to libbpf (which will make libbpf also
> bigger and dependent on extra libraries), etc.
>
> So basically, in exchange for 1-1.5MB extra memory we get a bunch of
> new problems we normally don't have to deal with.
Yes, I am aware of this tradeoff. I just wanted to make sure it had been
discussed. I am totally fine with leaving it uncompressed.
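
For anyone who wants to redo the size comparison on their own config, a rough
sketch follows. btf_sizes is a hypothetical helper name, and the input is
assumed to be the .btf.kernel.bin file dumped by the objcopy step quoted
above, sitting in the current directory. The numbers will differ per config
and gzip level, so no expected output is given.

```shell
#!/bin/sh
# Hypothetical helper: print raw and gzip-compressed sizes of FILE in bytes.
btf_sizes()
{
	raw=$(wc -c < "$1" | tr -d ' ')
	gz=$(gzip -c "$1" | wc -c | tr -d ' ')
	echo "raw=$raw gzip=$gz"
}

# .btf.kernel.bin is the intermediate file named in the quoted hunk; this
# assumes it is still present in the build directory.
if [ -r .btf.kernel.bin ]; then
	btf_sizes .btf.kernel.bin
fi
```
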
>
>>
>>> }
>>>
>>> # Create ${2} .o file with all symbols from the ${1} object file
>>> @@ -153,6 +164,7 @@ sortextable()
>>> # Delete output files in case of error
>>> cleanup()
>>> {
>
> [...]
>