From: Yonghong Song <yhs@fb.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Andrii Nakryiko <andriin@fb.com>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@fb.com>,
	"daniel@iogearbox.net" <daniel@iogearbox.net>,
	Kernel Team <Kernel-team@fb.com>,
	Masahiro Yamada <yamada.masahiro@socionext.com>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Jiri Olsa <jolsa@kernel.org>, Sam Ravnborg <sam@ravnborg.org>
Subject: Re: [PATCH v2 bpf-next] btf: expose BTF info through sysfs
Date: Thu, 8 Aug 2019 20:21:02 +0000
Message-ID: <9cde90e5-2831-0195-748c-b3325cbe1a1e@fb.com>
In-Reply-To: <CAEf4BzYAZ7x+PY0t90ty9RVSm1FSmc9XqY216DtJCA-giK3fUg@mail.gmail.com>



On 8/8/19 10:47 AM, Andrii Nakryiko wrote:
> On Wed, Aug 7, 2019 at 9:24 PM Yonghong Song <yhs@fb.com> wrote:
>>
>>
>>
>> On 8/7/19 5:32 PM, Andrii Nakryiko wrote:
>>> Make .BTF section allocated and expose its contents through sysfs.
>>>
>>> /sys/kernel/btf directory is created to contain all the BTFs present
>>> inside kernel. Currently there is only kernel's main BTF, represented as
>>> /sys/kernel/btf/kernel file. Once kernel modules' BTFs are supported,
>>> each module will expose its BTF as /sys/kernel/btf/<module-name> file.
>>>
>>> Current approach relies on a few pieces coming together:
>>> 1. pahole is used to take almost final vmlinux image (modulo .BTF and
>>>      kallsyms) and generate .BTF section by converting DWARF info into
>>>      BTF. This section is not allocated and not mapped to any segment,
>>>      though, so is not yet accessible from inside kernel at runtime.
>>> 2. objcopy dumps .BTF contents into binary file and subsequently
>>>      converts binary file into linkable object file with automatically
>>>      generated symbols _binary__btf_kernel_bin_start and
>>>      _binary__btf_kernel_bin_end, pointing to start and end, respectively,
>>>      of BTF raw data.
>>> 3. final vmlinux image is generated by linking this object file (and
>>>      kallsyms, if necessary). sysfs_btf.c then creates
>>>      /sys/kernel/btf/kernel file and exposes embedded BTF contents through
>>>      it. This allows, e.g., libbpf and bpftool to access BTF info at
>>>      well-known location, without resorting to searching for vmlinux image
>>>      on disk (location of which is not standardized and vmlinux image
>>>      might not be even available in some scenarios, e.g., inside qemu
>>>      during testing).
>>>
>>> Alternative approach using .incbin assembler directive to embed BTF
>>> contents directly was attempted but didn't work, because sysfs_btf.o is
>>> not re-compiled during link-vmlinux.sh stage. This is required, though,
>>> to update embedded BTF data (initially empty data is embedded, then
>>> pahole generates BTF info and we need to regenerate sysfs_btf.o with
>>> updated contents, but it's too late at that point).
>>>
>>> If BTF couldn't be generated due to missing or too old pahole,
>>> sysfs_btf.c handles that gracefully by detecting that
>>> _binary__btf_kernel_bin_start (weak symbol) is 0 and not creating
>>> /sys/kernel/btf at all.
>>>
>>> v1->v2:
>>> - allow kallsyms stage to re-use vmlinux generated by gen_btf();
>>>
>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>>> Cc: Jiri Olsa <jolsa@kernel.org>
>>> Cc: Sam Ravnborg <sam@ravnborg.org>
>>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>>> ---
> 
> [...]
> 
>>> +
>>> +     # dump .BTF section into raw binary file to link with final vmlinux
>>> +     bin_arch=$(${OBJDUMP} -f ${1} | grep architecture | \
>>> +             cut -d, -f1 | cut -d' ' -f2)
>>> +     ${OBJCOPY} --dump-section .BTF=.btf.kernel.bin ${1} 2>/dev/null
>>> +     ${OBJCOPY} -I binary -O ${CONFIG_OUTPUT_FORMAT} -B ${bin_arch} \
>>> +             --rename-section .data=.BTF .btf.kernel.bin ${2}
>>
>> Currently, the binary size on my config is about 2.6MB. Do you think
>> we could or need to compress it to make it smaller? I tried gzip
>> and the compressed size is 0.9MB.
> 
> I'd really prefer to keep it uncompressed for two main reasons:
> - by having this in uncompressed form, kernel itself can use this BTF
> data from inside with almost no additional memory (except maybe for
> index from type ID to actual location of type info), which opens up a
> lot of new and interesting opportunities, like kernel returning its
> own BTF and BTF type ID for various types (think about driver metadata,
> all those special maps, etc).
> - if we are doing compression, now we need to decide on best
> compression format, teach it to libbpf (which will make libbpf also
> bigger and depending on extra libraries), etc.
> 
> So basically, in exchange for 1-1.5MB extra memory we get a bunch of
> new problems we normally don't have to deal with.

Yes, I am aware of this tradeoff. I just wanted to make sure it has
been discussed. I am totally fine with leaving it uncompressed.
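
For illustration, here is a rough sketch of why the uncompressed form is
convenient for consumers: the raw blob can be sanity-checked and then used
in place, with no decompression step. The header layout mirrors
struct btf_header from include/uapi/linux/btf.h. btf_blob_check() is a
hypothetical helper name, not an existing libbpf or kernel function, and
the sketch assumes the blob is in the reader's native endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BTF_MAGIC 0xeB9F

/* Mirrors struct btf_header from include/uapi/linux/btf.h. */
struct btf_header {
	uint16_t magic;
	uint8_t  version;
	uint8_t  flags;
	uint32_t hdr_len;
	/* All offsets are in bytes, relative to the end of this header. */
	uint32_t type_off;
	uint32_t type_len;
	uint32_t str_off;
	uint32_t str_len;
};

static_assert(sizeof(struct btf_header) == 24, "unexpected header padding");

/*
 * Hypothetical helper (an assumption for this sketch): copy the header out
 * of a raw BTF blob and check that the type and string sections lie inside
 * the blob. Returns 0 on success, -1 if the blob looks malformed.
 */
static int btf_blob_check(const void *data, size_t size,
			  struct btf_header *hdr)
{
	size_t body;

	if (size < sizeof(*hdr))
		return -1;
	memcpy(hdr, data, sizeof(*hdr));

	if (hdr->magic != BTF_MAGIC)
		return -1;
	if (hdr->hdr_len < sizeof(*hdr) || hdr->hdr_len > size)
		return -1;

	/* Bytes available after the header for the two sections. */
	body = size - hdr->hdr_len;
	if (hdr->type_off > body || hdr->type_len > body - hdr->type_off)
		return -1;
	if (hdr->str_off > body || hdr->str_len > body - hdr->str_off)
		return -1;
	return 0;
}
```

A caller would read /sys/kernel/btf/kernel into a buffer and pass it along
with its length. On success, the type section starts at
data + hdr_len + type_off and the string section at
data + hdr_len + str_off, so both can be referenced directly in the buffer.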

> 
>>
>>>    }
>>>
>>>    # Create ${2} .o file with all symbols from the ${1} object file
>>> @@ -153,6 +164,7 @@ sortextable()
>>>    # Delete output files in case of error
>>>    cleanup()
>>>    {
> 
> [...]
> 
