From: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>
To: Alan Maguire <alan.maguire@oracle.com>,
bpf@vger.kernel.org, ast@kernel.org, andrii@kernel.org,
kernel-team@meta.com, qmo@kernel.org
Cc: Mykyta Yatsenko <yatsenko@meta.com>
Subject: Re: [PATCH v2 bpf-next] bpftool: introduce btf c dump sorting
Date: Thu, 9 May 2024 19:42:23 +0100 [thread overview]
Message-ID: <3999bafb-c64e-489f-a461-ac1a748abb6d@gmail.com> (raw)
In-Reply-To: <b43d0677-5018-45a1-8b0e-00bdc68a09af@oracle.com>
On 5/9/24 17:08, Alan Maguire wrote:
> On 09/05/2024 16:17, Mykyta@web.codeaurora.org wrote:
>> From: Mykyta Yatsenko <yatsenko@meta.com>
>>
>> Sort bpftool c dump output; aiming to simplify vmlinux.h diffing and
>> forcing more natural type definitions ordering.
>>
>> Definitions are sorted first by their BTF kind ranks, then by their base
>> type name and by their own name.
>>
>> Type ranks
>>
>> Assign ranks to btf kinds (defined in function btf_type_rank) to set
>> next order:
>> 1. Anonymous enums/enums64
>> 2. Named enums/enums64
>> 3. Trivial types typedefs (ints, then floats)
>> 4. Structs/Unions
>> 5. Function prototypes
>> 6. Forward declarations
>>
>> Type rank is set to maximum for unnamed reference types, structs and
>> unions to avoid emitting those types early. They will be emitted as
>> part of the type chain starting with named type.
>>
>> Lexicographical ordering
>>
>> Each type is assigned a sort_name and own_name.
>> sort_name is the resolved name of the final base type for reference
>> types (typedef, pointer, array etc). Sorting by sort_name allows to
>> group typedefs of the same base type. sort_name for non-reference type
>> is the same as own_name. own_name is a direct name of particular type,
>> is used as final sorting step.
>>
>> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
> This looks great! Not sure if you experimented with sorting for the
> split BTF case (dumping /sys/kernel/btf/tun say); are there any
> additional issues in doing that? From what I can see below the sort
> would just be applied across base and split BTF and should just work, is
> that right? A few suggestions below, but
This functionality is oblivious to split BTF, dumping
/sys/kernel/btf/tun will
sort all types across both base and split BTF, not distinguishing where
those
types come from.
> Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
>
>> ---
>> tools/bpf/bpftool/btf.c | 125 +++++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 122 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c
>> index 91fcb75babe3..09ecd2abf066 100644
>> --- a/tools/bpf/bpftool/btf.c
>> +++ b/tools/bpf/bpftool/btf.c
>> @@ -43,6 +43,13 @@ static const char * const btf_kind_str[NR_BTF_KINDS] = {
>> [BTF_KIND_ENUM64] = "ENUM64",
>> };
>>
>> +struct sort_datum {
>> + int index;
>> + int type_rank;
>> + const char *sort_name;
>> + const char *own_name;
>> +};
>> +
>> static const char *btf_int_enc_str(__u8 encoding)
>> {
>> switch (encoding) {
>> @@ -460,11 +467,114 @@ static void __printf(2, 0) btf_dump_printf(void *ctx,
>> vfprintf(stdout, fmt, args);
>> }
>>
>> +static bool is_reference_type(const struct btf_type *t)
>> +{
>> + int kind = btf_kind(t);
>> +
>> + return kind == BTF_KIND_CONST || kind == BTF_KIND_PTR || kind == BTF_KIND_VOLATILE ||
>> + kind == BTF_KIND_RESTRICT || kind == BTF_KIND_ARRAY || kind == BTF_KIND_TYPEDEF ||
>> + kind == BTF_KIND_DECL_TAG;
>> +}
>> +
>> +static int btf_type_rank(const struct btf *btf, __u32 index, bool has_name)
>> +{
>> + const struct btf_type *t = btf__type_by_id(btf, index);
>> + const int max_rank = 10;
>> + const int kind = btf_kind(t);
>> +
>> + if (t->name_off)
>> + has_name = true;
>> +
>> + switch (kind) {
>> + case BTF_KIND_ENUM:
>> + case BTF_KIND_ENUM64:
>> + return has_name ? 1 : 0;
>> + case BTF_KIND_INT:
>> + case BTF_KIND_FLOAT:
>> + return 2;
>> + case BTF_KIND_STRUCT:
>> + case BTF_KIND_UNION:
>> + return has_name ? 3 : max_rank;
>> + case BTF_KIND_FUNC_PROTO:
>> + return has_name ? 4 : max_rank;
>> +
> Don't think a FUNC_PROTO will ever have a name, so has_name check
> probably not needed here.
The reason for that check is to penalize FUNC_PROTO type (assign
max_rank to it),
but assign rank 4 to typedef type pointing to that FUNC_PROTO. You can
see that
for reference types this function is called recursively until
non-reference type
is reached, we assign non-maximum rank only if there was a named type along
the chain of recursive calls. Assigning rank 4 to FUNC_PROTO will lead
to printing
those function prototypes unordered, as their names are assigned to
typedef type.
>> + default: {
>> + if (has_name && is_reference_type(t)) {
>> + const int parent = kind == BTF_KIND_ARRAY ? btf_array(t)->type : t->type;
>> +
>> + return btf_type_rank(btf, parent, has_name);
>> + }
>> + return max_rank;
>> + }
>> + }
>> +}
>> +
>> +static const char *btf_type_sort_name(const struct btf *btf, __u32 index)
>> +{
>> + const struct btf_type *t = btf__type_by_id(btf, index);
>> + int kind = btf_kind(t);
>> +
>> + /* Use name of the first element for anonymous enums */
>> + if (!t->name_off && (kind == BTF_KIND_ENUM || kind == BTF_KIND_ENUM64) &&
>> + BTF_INFO_VLEN(t->info))
>> + return btf__name_by_offset(btf, btf_enum(t)->name_off);
>> +
>> + /* Return base type name for reference types */
>> + while (is_reference_type(t)) {
> The two times is_reference_type() is used, we use this conditional to
> get the array type; worth rolling this into a get_reference_type(t)
> function that returns t->type for reference types, btf_array(t)->type
> for arrays and -1 otherwise perhaps?
Agree.
>
>> + index = btf_kind(t) == BTF_KIND_ARRAY ? btf_array(t)->type : t->type;
>> + t = btf__type_by_id(btf, index);
>> + }
>> +
>> + return btf__name_by_offset(btf, t->name_off);
>> +}
>> +
>> +static int btf_type_compare(const void *left, const void *right)
>> +{
>> + const struct sort_datum *datum1 = (const struct sort_datum *)left;
>> + const struct sort_datum *datum2 = (const struct sort_datum *)right;
>> + int sort_name_cmp;
>> +
>> + if (datum1->type_rank != datum2->type_rank)
>> + return datum1->type_rank < datum2->type_rank ? -1 : 1;
>> +
>> + sort_name_cmp = strcmp(datum1->sort_name, datum2->sort_name);
>> + if (sort_name_cmp)
>> + return sort_name_cmp;
>> +
>> + return strcmp(datum1->own_name, datum2->own_name);
>> +}
>> +
>> +static struct sort_datum *sort_btf_c(const struct btf *btf)
>> +{
>> + int total_root_types;
>> + struct sort_datum *datums;
>> +
>> + total_root_types = btf__type_cnt(btf);
>> + datums = malloc(sizeof(struct sort_datum) * total_root_types);
> calloc(total_root_types, sizeof(*datums)) will get you a
> zero-initialized array, which may be useful below...
>
>> + if (!datums)
>> + return NULL;
>> +
>> + for (int i = 0; i < total_root_types; ++i) {
> you're starting from zero here so you'll get &btf_void below; if you
> zero-initialize above I think you can just start from 1.
>
>> + struct sort_datum *current_datum = datums + i;
>> + const struct btf_type *t = btf__type_by_id(btf, i);
>> +
>> + current_datum->index = i;
>> + current_datum->type_rank = btf_type_rank(btf, i, false);
>> + current_datum->sort_name = btf_type_sort_name(btf, i);
>> + current_datum->own_name = btf__name_by_offset(btf, t->name_off);
>> + }
>> +
>> + qsort(datums, total_root_types, sizeof(struct sort_datum), btf_type_compare);
>> +
>> + return datums;
>> +}
>> +
>> static int dump_btf_c(const struct btf *btf,
>> - __u32 *root_type_ids, int root_type_cnt)
>> + __u32 *root_type_ids, int root_type_cnt, bool sort_dump)
>> {
>> struct btf_dump *d;
>> int err = 0, i;
>> + struct sort_datum *datums = NULL;
>>
>> d = btf_dump__new(btf, btf_dump_printf, NULL, NULL);
>> if (!d)
>> @@ -486,8 +596,12 @@ static int dump_btf_c(const struct btf *btf,
>> } else {
>> int cnt = btf__type_cnt(btf);
>>
>> + if (sort_dump)
>> + datums = sort_btf_c(btf);
>> for (i = 1; i < cnt; i++) {
>> - err = btf_dump__dump_type(d, i);
>> + int idx = datums ? datums[i].index : i;
>> +
>> + err = btf_dump__dump_type(d, idx);
>> if (err)
>> goto done;
>> }
>> @@ -501,6 +615,7 @@ static int dump_btf_c(const struct btf *btf,
>>
>> done:
>> btf_dump__free(d);
>> + free(datums);
>> return err;
>> }
>>
>> @@ -553,6 +668,7 @@ static int do_dump(int argc, char **argv)
>> __u32 root_type_ids[2];
>> int root_type_cnt = 0;
>> bool dump_c = false;
>> + bool sort_dump_c = true;
>> __u32 btf_id = -1;
>> const char *src;
>> int fd = -1;
>> @@ -663,6 +779,9 @@ static int do_dump(int argc, char **argv)
>> goto done;
>> }
>> NEXT_ARG();
>> + } else if (is_prefix(*argv, "unordered")) {
> you'll need to update the man page and add to the bash completion for
> this new argument I think.
>
>> + sort_dump_c = false;
>> + NEXT_ARG();
>> } else {
>> p_err("unrecognized option: '%s'", *argv);
>> err = -EINVAL;
>> @@ -691,7 +810,7 @@ static int do_dump(int argc, char **argv)
>> err = -ENOTSUP;
>> goto done;
>> }
>> - err = dump_btf_c(btf, root_type_ids, root_type_cnt);
>> + err = dump_btf_c(btf, root_type_ids, root_type_cnt, sort_dump_c);
>> } else {
>> err = dump_btf_raw(btf, root_type_ids, root_type_cnt);
>> }
next prev parent reply other threads:[~2024-05-09 18:42 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-09 15:17 [PATCH v2 bpf-next] bpftool: introduce btf c dump sorting Mykyta
2024-05-09 16:08 ` Alan Maguire
2024-05-09 18:42 ` Mykyta Yatsenko [this message]
2024-05-09 21:27 ` Andrii Nakryiko
2024-05-09 21:39 ` Andrii Nakryiko
2024-05-09 23:09 ` Quentin Monnet
2024-05-09 23:14 ` Andrii Nakryiko
2024-05-09 23:24 ` Alexei Starovoitov
2024-05-09 23:39 ` Quentin Monnet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3999bafb-c64e-489f-a461-ac1a748abb6d@gmail.com \
--to=mykyta.yatsenko5@gmail.com \
--cc=alan.maguire@oracle.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=kernel-team@meta.com \
--cc=qmo@kernel.org \
--cc=yatsenko@meta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox