From: Quentin Monnet <qmo@kernel.org>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>,
Eduard Zingerman <eddyz87@gmail.com>
Cc: bpf@vger.kernel.org, ast@kernel.org, andrii@kernel.org,
daniel@iogearbox.net, kafai@meta.com, kernel-team@meta.com,
Mykyta Yatsenko <yatsenko@meta.com>
Subject: Re: [PATCH bpf-next] bpftool: introduce btf c dump sorting
Date: Wed, 8 May 2024 00:30:24 +0100 [thread overview]
Message-ID: <e20be6d3-b9c7-4a64-add1-f4c7a6d3a4bc@kernel.org> (raw)
In-Reply-To: <CAEf4BzZ+nw6iu8RO1xJutRf+qnxAotHx47bXuJuw8AT-5Z3QfQ@mail.gmail.com>
On 07/05/2024 22:02, Andrii Nakryiko wrote:
> On Mon, May 6, 2024 at 6:45 AM Mykyta Yatsenko
> <mykyta.yatsenko5@gmail.com> wrote:
>>
>> From: Mykyta Yatsenko <yatsenko@meta.com>
>>
>> Provide a way to sort bpftool c dump output, to simplify vmlinux.h
>> diffing and forcing more natural definitions ordering.
>>
>> Use `normalized` argument in bpftool CLI after `format c` for example:
>> ```
>> bpftool btf dump file /sys/kernel/btf/fuse format c normalized
>> ```
>>
>> Definitions are sorted by their BTF kind ranks, lexicographically and
>> typedefs are forced to go right after their base type.
>>
>> Type ranks
>>
>> Assign ranks to btf kinds (defined in function btf_type_rank) to set
>> next order:
>> 1. Anonymous enums
>> 2. Anonymous enums64
>> 3. Named enums
>> 4. Named enums64
>> 5. Trivial types typedefs (ints, then floats)
>> 6. Structs
>> 7. Unions
>> 8. Function prototypes
>> 9. Forward declarations
>>
>> Lexicographical ordering
>>
>> Definitions within the same BTF kind are ordered by their names.
>> Anonymous enums are ordered by their first element.
>>
>> Forcing typedefs to go right after their base type
>>
>> To make sure that typedefs are emitted right after their base type,
>> we build a list of type's typedefs (struct typedef_ref) and after
>> emitting type, its typedefs are emitted as well (lexicographically)
>>
>> There is a small flaw in this implementation:
>> Type dependencies are resolved by bpf lib, so when type is dumped
>> because it is a dependency, its typedefs are not output right after it,
>> as bpflib does not have the list of typedefs for a given type.
>>
>> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
>> ---
>> tools/bpf/bpftool/btf.c | 264 +++++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 259 insertions(+), 5 deletions(-)
>>
>
> I applied this locally to experiment. Generated vmlinux.h for the
> production (a bit older) kernel and then for latest bpf-next/master
> kernel. And then tried diff between normalized vmlinux.h dumps and
> non-normalized.
>
> It took a bit for the diff tool to generate, but I think diff for
> normalized vmlinux.h is actually very usable. You can see an example
> at [1]. It shows whole new types being added in front of existing
> ones. And for existing ones it shows only parts that actually changed.
> It's quite nice. And note that I used a relatively stale production
> kernel vs latest upstream bpf-next, *AND* with different (bigger)
> Kconfig. So for more incremental changes in kernel config/version the
> diff should be much slower.
>
> I think this idea of normalizing vmlinux.h works and is useful.
>
> Eduard, Quentin, please take a look when you get a chance.
>
> My high-level feedback. I like the idea and it seems to work well in
> practice. I do think, though, that the current implementation is a bit
> over-engineered. I'd drop all the complexity with TYPEDEF and try to
> get almost the same behavior with a slightly different ranking
> strategy.
>
> Tracking which types are emitted seems unnecessary btf_dumper is doing
> that already internally. So I think overall flow could be basically
> three steps:
>
> - precalculate/cache "sort names" and ranks;
> - sort based on those two, construct 0-based list of types to emit
> - just go linearly over that sorted list, call btf_dump__dump_type()
> on each one with original type ID; if the type was already emitted or
> is not the type that's emitted as an independent type (e.g.,
> FUNC_PROTO), btf_dump__dump_type() should do the right thing (do
> nothing).
>
> Any flaws in the above proposal?
>
> [1] https://gist.github.com/anakryiko/cca678c8f77833d9eb99ffc102612e28
Hi, thanks for the patch - and thanks Andrii for the Cc. I didn't have
time to look at the code yet (will do), but the idea looks great.
My main question would be, how much overhead does the sorting add to the
BTF dump, and if this overhead is low, is it even worth having a
dedicated command-line keyword to trigger the sorting, or should we just
make it the default behaviour for the C-formatted dump? (Or is there any
advantage in dumping with the current, unsorted order?)
Quentin
next prev parent reply other threads:[~2024-05-07 23:30 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-06 13:44 [PATCH bpf-next] bpftool: introduce btf c dump sorting Mykyta
2024-05-07 21:02 ` Andrii Nakryiko
2024-05-07 23:30 ` Quentin Monnet [this message]
2024-05-08 16:21 ` Andrii Nakryiko
2024-05-08 23:07 ` Mykyta Yatsenko
2024-05-08 23:21 ` Andrii Nakryiko
2024-05-08 23:35 ` Mykyta Yatsenko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e20be6d3-b9c7-4a64-add1-f4c7a6d3a4bc@kernel.org \
--to=qmo@kernel.org \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=kafai@meta.com \
--cc=kernel-team@meta.com \
--cc=mykyta.yatsenko5@gmail.com \
--cc=yatsenko@meta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox