Re: [RFC/PATCHES 00/12] pahole: Reproducible parallel DWARF loading/serial BTF encoding

public inbox for dwarves@vger.kernel.org

From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Eduard Zingerman <eddyz87@gmail.com>
Cc: "Alexei Starovoitov" <alexei.starovoitov@gmail.com>,
	"Alan Maguire" <alan.maguire@oracle.com>,
	dwarves@vger.kernel.org, "Jiri Olsa" <jolsa@kernel.org>,
	"Clark Williams" <williams@redhat.com>,
	"Kate Carcia" <kcarcia@redhat.com>, bpf <bpf@vger.kernel.org>,
	"Kui-Feng Lee" <kuifeng@fb.com>,
	"Thomas Weißschuh" <linux@weissschuh.net>
Subject: Re: [RFC/PATCHES 00/12] pahole: Reproducible parallel DWARF loading/serial BTF encoding
Date: Tue, 9 Apr 2024 15:45:26 -0300
Message-ID: <ZhWMxu8Xq1oAUAoC@x1>
In-Reply-To: <7a08fb6a8c37e58a56121c8536b9ab68405c049d.camel@gmail.com>

On Tue, Apr 09, 2024 at 06:01:08PM +0300, Eduard Zingerman wrote:
> On Tue, 2024-04-09 at 07:56 -0700, Alexei Starovoitov wrote:
> [...]

> > I would actually go with sorted BTF, since it will probably
> > make diff-ing of BTFs practical. Will be easier to track changes

What kind of diff-ing of BTFs from different kernels are you interested
in?

In pahole's repository we have btfdiff, which, given a vmlinux with
both DWARF and BTF, uses pahole to pretty-print all types, expanded,
from each format and then compares the two outputs, which should be
the same for BTF and DWARF. Ditto for DWARF from a vmlinux compared to
a detached BTF file.

We now also have another regression test script that produces the
output of 'bpftool btf dump' for the BTF generated from DWARF in
serial mode, and then compares that with the output of 'bpftool btf
dump' for reproducible encodings done using -j 1 ...
number-of-processors-on-the-machine. All have to match, all types, all
BTF ids.
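
A minimal sketch of that all-must-match check, in Python. This is not the
test script from the pahole repository: the function name and the sample
dump strings are hypothetical, and it assumes the 'bpftool btf dump'
output of each run was already captured as text.

```python
def find_mismatches(reference, candidates):
    """Return the labels of the dumps that differ from the serial reference."""
    return [label for label, dump in candidates.items() if dump != reference]


# Hypothetical captured dump outputs: one serial run, then one per -j value.
serial = "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED\n"
parallel = {
    "-j 1": serial,
    "-j 2": serial,
    # Deliberately different, to show what a failure looks like.
    "-j 4": "[1] INT 'char' size=1 bits_offset=0 nr_bits=8 encoding=(none)\n",
}

mismatches = find_mismatches(serial, parallel)
print(mismatches)
```

Comparing whole dumps as strings is the strictest form of the check: any
difference in type order or in a BTF id shows up as a mismatch.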

We can as well use something like btfdiff to compare the output from
'pahole --expand_types --sort' for two BTFs for two different kernels,
to see what are the new types and the changes to types in both.

What else do you want to compare? To be able to match we would have to
somehow have ranges for each DWARF CU so that when encoding and then
deduplicating we would have space in the ID space for new types to fill
in while keeping the old types IDs matching the same types in the new
vmlinux.

While ordering all types we would have to have ID space available from
each of the BTF kinds, no?

I haven't looked at Eduard's patches, is that what is done?

> > from one kernel version to another. vmlinux.h will become
> > a bit more sorted too and normal diff vmlinux_6_1.h vmlinux_6_2.h
> > will be possible.
> > Or am I misunderstanding the sorting concept?

> You understand the concept correctly, here is a sample:

>   [1] INT '_Bool' size=1 bits_offset=0 nr_bits=8 encoding=BOOL
>   [2] INT '__int128' size=16 bits_offset=0 nr_bits=128 encoding=SIGNED
>   [3] INT '__int128 unsigned' size=16 bits_offset=0 nr_bits=128 encoding=(none)
>   [4] INT 'char' size=1 bits_offset=0 nr_bits=8 encoding=(none)
>   [5] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>   [6] INT 'long int' size=8 bits_offset=0 nr_bits=64 encoding=SIGNED
>   [7] INT 'long long int' size=8 bits_offset=0 nr_bits=64 encoding=SIGNED

The above: so far so good, probably there will not be something that
will push what is now BTF id 6 to become 7 in a new vmlinux, but can we
say the same for the more dynamic parts, like the list of structs?

A struct can vanish, that abstraction not being used anymore in the
kernel, so its BTF id will be vacated and all of the following structs
will "fall down" and get their IDs decremented, no?
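
To illustrate the concern with made-up numbering: if IDs are simply
positions in a name-sorted list, removing one struct shifts every ID
after it. The helper below is hypothetical and borrows struct names from
Eduard's sample; it is not how pahole or libbpf assign IDs.

```python
def assign_ids(struct_names, first_id):
    """Assign consecutive IDs to struct names in sorted order."""
    return {name: first_id + i for i, name in enumerate(sorted(struct_names))}


# Kernel N has four structs; in kernel N + 1 'arch_vdso_data' is gone.
old = assign_ids(
    ["arch_elf_state", "arch_vdso_data", "bpf_run_ctx", "dev_archdata"],
    first_id=15085,
)
new = assign_ids(
    ["arch_elf_state", "bpf_run_ctx", "dev_archdata"],
    first_id=15085,
)

# Structs present in both kernels whose ID changed: name -> (old ID, new ID).
shifted = {name: (old[name], new[name]) for name in new if old[name] != new[name]}
print(shifted)
```

Only the structs sorted before the removed one keep their IDs; everything
after it moves down by one.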

If these difficulties are present as I mentioned, then rebuilding the
types from the BTF data of kernel N with something like the existing
'pahole --expand_types --sort', and comparing that with the same output
for kernel N + 1, should be enough to see what changed from one kernel
to the next one?
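
A rough sketch of that kind of comparison, assuming the pretty-printed,
sorted output for each kernel is already available as text. The struct
definitions are invented for illustration, and the diffing is done with
Python's difflib rather than with btfdiff.

```python
import difflib


def diff_types(old_text, new_text, old_label="kernel N", new_label="kernel N+1"):
    """Return unified-diff lines between two pretty-printed type listings."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",
        )
    )


# Hypothetical sorted type listings: 'bar' vanishes, 'foo' gains a member.
old_types = "struct bar {\n\tint a;\n};\nstruct foo {\n\tint x;\n};\n"
new_types = "struct foo {\n\tint x;\n\tlong int y;\n};\n"

lines = diff_types(old_types, new_types)
print("\n".join(lines))
```

Because the comparison is on names and member layout, not on BTF ids, a
vanished struct shows up as a removed block and does not disturb the
rest of the diff.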

- Arnaldo

>   ...
>   [15085] STRUCT 'arch_elf_state' size=0 vlen=0
>   [15086] STRUCT 'arch_vdso_data' size=0 vlen=0
>   [15087] STRUCT 'bpf_run_ctx' size=0 vlen=0
>   [15088] STRUCT 'dev_archdata' size=0 vlen=0
>   [15089] STRUCT 'dyn_arch_ftrace' size=0 vlen=0
>   [15090] STRUCT 'fscrypt_dummy_policy' size=0 vlen=0
>   ...
>   
> (Sort by kind, then by vlen, then by name because sorting by name is a
>  bit costly, then by member properties)
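
The ordering described in the quoted note can be sketched as a composite
sort key. This is only an illustration under assumptions: the BtfType
record is hypothetical, the numeric kinds are used purely as a sort
field, the final tie-break on member properties is left out, and what
Eduard's patches actually do may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BtfType:
    kind: int  # numeric BTF kind, e.g. 1 for INT, 4 for STRUCT
    vlen: int  # number of members (0 for the empty structs in the sample)
    name: str


def sort_key(t):
    # Cheap integer comparisons first, the string comparison last.
    return (t.kind, t.vlen, t.name)


types = [
    BtfType(kind=4, vlen=0, name="bpf_run_ctx"),
    BtfType(kind=1, vlen=0, name="int"),
    BtfType(kind=4, vlen=0, name="arch_elf_state"),
    BtfType(kind=1, vlen=0, name="char"),
]

ordered = sorted(types, key=sort_key)
# IDs follow the sorted position, starting at 1.
ids = {t.name: i for i, t in enumerate(ordered, start=1)}
print(ids)
```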

