From: Nick Alcock <nick.alcock@oracle.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Nick Alcock <nick.alcock@oracle.com>,
	Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>,
	Eduard Zingerman <eddyz87@gmail.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Alan Maguire <alan.maguire@oracle.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Clark Williams <williams@redhat.com>,
	Kate Carcia <kcarcia@redhat.com>,
	dwarves <dwarves@vger.kernel.org>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Yonghong Song <yonghong.song@linux.dev>,
	"Jose E. Marchesi" <jose.marchesi@oracle.com>,
	Namhyung Kim <namhyung@kernel.org>, bpf <bpf@vger.kernel.org>
Subject: Re: [RFC 0/4] BTF archive with unmodified pahole+toolchain
Date: Mon, 15 Sep 2025 11:11:29 +0100
Message-ID: <877by0vxda.fsf@esperi.org.uk>
In-Reply-To: <CAADnVQKRuMzZWq5k3Z-QVCyLiR4Vin0zjPR36Om0fQbZ3RGYNg@mail.gmail.com> (Alexei Starovoitov's message of "Tue, 26 Aug 2025 17:14:15 -0700")

On 27 Aug 2025, Alexei Starovoitov stated:

> On Thu, Aug 21, 2025 at 2:35 PM Nick Alcock <nick.alcock@oracle.com> wrote:
>>
>> >>I'd like to second Alexei's question.
>> >>In the cover letter Arnaldo points out that un-deduplicated BTF
>> >>amounts for 325Mb, while total DWARF size is 365Mb.
>>
>> That very much depends on the kernels you build. In my tests of
>> enterprise kernels (including modules) with the GCC+btfarchive toolchain
>> (not feeding it to pahole yet), I found total DWARF of 11.2GiB,
>> undeduplicated BTF of 550MiB (counting raw .o compiler output alone),
>> and a final deduplicated BTF size (including all modules) of about 38MiB
>> (which I'm sure I can reduce).
>
> 11.2G doesn't match Arnaldo's 365Mb.
> Frankly I've never seen such huge dwarf objects.

I have, but... it was a while back. I shouldn't have worked from memory.

Regenerating with a more recent toolchain, summing up all written
section sizes (so, undeduplicated .BTF compiler output *and* all the
deduplicated module intermediate links) I usually see DWARF sizes about
two to three times that of the .BTF (e.g. the BTF selftest is about
800MiB versus about 400MiB of BTF: the final BTF size from both
btfarchive and pahole is on the order of 2MiB).
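
For anyone wanting to reproduce this sort of tally, here is a minimal
sketch. It assumes the per-object section sizes come from GNU
`size -A` (SysV-format) output; the function name and the sample text are
hypothetical, and this is not necessarily how the figures above were
collected. It counts only `.debug_*` and `.BTF` sections, and ignores
relocation sections and `.BTF.ext`.

```python
def sum_debug_and_btf(size_output: str) -> tuple[int, int]:
    """Return (dwarf_bytes, btf_bytes) from `size -A` style text."""
    dwarf = btf = 0
    for line in size_output.splitlines():
        fields = line.split()
        # Section rows look like: <name> <size> <addr>.
        # File-name, header, and "Total" rows do not match and are skipped.
        if len(fields) != 3 or not fields[1].isdigit():
            continue
        name, size = fields[0], int(fields[1])
        if name.startswith(".debug_"):
            dwarf += size
        elif name == ".BTF":
            btf += size
    return dwarf, btf


# Hypothetical sample, shaped like `size -A foo.o` output.
SAMPLE = """\
foo.o  :
section          size   addr
.text             100      0
.debug_info      3000      0
.debug_line       500      0
.BTF             1200      0
Total            4800
"""

dwarf_total, btf_total = sum_debug_and_btf(SAMPLE)
print(dwarf_total, btf_total)
```

Feeding it the concatenated output for every object in a build tree would
give totals comparable in spirit to the ones quoted here.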

Using a random enterprise kernel config (so 2900+ modules, etc), I see
4072236343 bytes of DWARF, 2199803264 bytes of undeduplicated .BTF
sections: so, again, close to a 50% reduction (about 46%).
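
The arithmetic behind that comparison, as a small sketch using only the
two byte counts quoted above (the variable names are just for
illustration):

```python
dwarf_bytes = 4072236343  # DWARF total quoted above
btf_bytes = 2199803264    # undeduplicated .BTF total quoted above

GIB = 1024 ** 3

# How many times larger the DWARF is than the BTF.
ratio = dwarf_bytes / btf_bytes
# Fraction of bytes saved by emitting BTF instead of DWARF.
reduction = 1 - btf_bytes / dwarf_bytes

print(f"DWARF: {dwarf_bytes / GIB:.2f} GiB, BTF: {btf_bytes / GIB:.2f} GiB")
print(f"DWARF/BTF ratio: {ratio:.2f}")
print(f"reduction: {reduction:.1%}")
```

This puts the DWARF at a little under twice the size of the BTF for this
config.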

(toolchain-level dedup on this one takes two minutes and peaks at 5GiB
memory usage, producing a 40MiB BTF archive: I know this output can be
greatly reduced by a fix I'm planning shortly. :) )

> I'm guessing you're using some ultra verbose dwarf compilation
> mode. If so, it's not a realistic comparison, since typical
> kernel build is what Arnaldo reported.
> That's what I observe as well.
>
>> >>The size of DWARF sections in the final vmlinux is comparable to yours: 307Mb.
>> >>The total size of the generated binaries is 905Mb.

Ditto, now. I don't know what weirdo config I was using before (I suspect
it was just an older GCC with a different default DWARF version, and
this is simply DWARF 2/3 versus 5). It's still a nontrivial saving.

>> GNU ld), despite being single-threaded and doing things like ambiguous
>> type detection as well, used 12GiB and took 19 minutes. (Multithreading
>> it is in progress, too). allyesconfig is faster. Anything sane is faster
>> yet. Enterprise kernels take about four minutes, which is not too
>> different from pahole.
>>
>> I was shocked by this: I thought libctf would be slower than pahole, and
>> instead it turned out to be faster, sometimes much faster. I suspect
>> much of this frankly ridiculous difference was DWARF conversion, and so
>> would be improved by doing it in parallel (as here), but... still. Not
>> having to generate and consume all that DWARF is bound to help! It's
>> like 95% less work...
>
> Something doesn't add up here.
> Everyone is using pahole and lots of people doing allmodconfig builds
> with pahole. Noone reported that pahole consumes 70G and runs for hours.
> Something is really not right in your setup.

Well... yeah, that would be the make allmodconfig / allyesconfig
configuration options. pahole takes more reasonable times with more
reasonable configurations, but still ten minutes or more is fairly
routine for me.

> Pls use typical kernel build configs then we can have apple to apple
> comparison and reason about libctf pros/cons.

I'm not sure there is such a thing as typical, really. I hope random
enterprise configs will do, but they probably have more modules than
"normal" and God knows the BTF test configs have fewer :)

-- 
NULL && (void)
