bpf.vger.kernel.org archive mirror
From: Eduard Zingerman <eddyz87@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Thierry Treyer <ttreyer@meta.com>
Cc: Alan Maguire <alan.maguire@oracle.com>,
	"dwarves@vger.kernel.org"	 <dwarves@vger.kernel.org>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	 "acme@kernel.org"	 <acme@kernel.org>,
	"ast@kernel.org" <ast@kernel.org>, Yonghong Song <yhs@meta.com>,
	 "andrii@kernel.org"	 <andrii@kernel.org>,
	"ihor.solodrai@linux.dev" <ihor.solodrai@linux.dev>,
	 Song Liu <songliubraving@meta.com>,
	Mykola Lysenko <mykolal@meta.com>, Daniel Xu <dlxu@meta.com>
Subject: Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
Date: Thu, 22 May 2025 13:16:38 -0700
Message-ID: <bfb120452de9d9ce0868485bc41fa8cf56edf4cf.camel@gmail.com>
In-Reply-To: <CAEf4BzZxccvWcGJ06hSnrVh6jJO-gdCLUitc7qNE-2oO8iK+og@mail.gmail.com>

On Thu, 2025-05-22 at 13:03 -0700, Andrii Nakryiko wrote:
> On Thu, May 22, 2025 at 10:56 AM Thierry Treyer <ttreyer@meta.com> wrote:
> > 
> > Hello everyone,
> > 
> > Here are the estimates for the different encoding schemes we discussed:
> > - parameters' location takes ~1MB without de-duplication,
> > - parameters' location shrinks to ~14kB when de-duplicated,
> > - instead of de-duplicating the individual locations,
> >   de-duplicating functions' parameter lists yields 187kB of locations data.
> > 
> > We also need to take into account the size of the corresponding funcsec
> > table, which starts at 3.6MB. The full details follow:
> > 
> >   1) // params_offset points to the first parameter's location
> >      struct fn_info { u32 type_id, offset, params_offset; };
> >   2) // param_offsets point to each parameter's location
> >      struct fn_info { u32 type_id, offset; u16 param_offsets[proto.arglen]; };
> >   3) // locations are stored inline, in the funcsec table
> >      struct fn_info { u32 type_id, offset; loc inline_locs[proto.arglen]; };
> > 
> >   Params encoding             Locations Size   Funcsec Size   Total Size
> >   ======================================================================
> >   (1) param list, no dedup         1,017,654      5,467,824    6,485,478
> >   (1) param list, w/ dedup           187,379      5,467,824    5,655,203
> >   (2) param offsets, w/ dedup         14,526      4,808,838    4,823,364
> 
> This one is almost as good as (3) below, but fits better into the
> existing kind+vlen model where there is a variable number of fixed
> sized elements (but locations can still be variable-sized and keep
> evolving much more easily). I'd go with this one, unless I'm missing
> some important benefit of other representations.

Thierry, could you please provide some details for the representation
of both fn_info and parameters for this case?
I'm curious how far this version is from exhausting the u16 limit.
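
A rough sketch of the headroom question, under assumptions that the
thread does not confirm: param_offsets[] are taken to be byte offsets
into the de-duplicated locations blob, and the flexible array member
plus the helper names (u16_headroom, fn_param_loc) are hypothetical.
Only the 14,526 figure comes from the table above.

```c
#include <assert.h>
#include <stdint.h>

/* Encoding (2) as quoted in the thread. The flexible array member is an
 * assumption of this sketch; the real layout may differ. */
struct fn_info {
	uint32_t type_id;
	uint32_t offset;
	uint16_t param_offsets[]; /* one entry per parameter */
};

/* Bytes of locations data a u16 byte offset could still address,
 * given the current size of the de-duplicated locations blob. */
static inline uint32_t u16_headroom(uint32_t locs_size)
{
	uint32_t limit = (uint32_t)UINT16_MAX + 1; /* 65536 addressable bytes */

	return locs_size >= limit ? 0 : limit - locs_size;
}

/* Resolve parameter idx of fn to a pointer inside the locations blob.
 * nr_params would come from the function prototype (proto.arglen). */
static inline const uint8_t *fn_param_loc(const struct fn_info *fn,
					  uint16_t nr_params, uint16_t idx,
					  const uint8_t *locs)
{
	assert(idx < nr_params);
	return locs + fn->param_offsets[idx];
}
```

With the 14,526 bytes reported for "param offsets, w/ dedup",
u16_headroom() gives 51,010 bytes, so under the byte-offset assumption
the blob would use a bit under a quarter of the addressable range.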

> 
> >   (3) param list inline            1,017,654      3,645,216    4,662,870
> > 
> >   Estimated size in bytes of the new .BTF.func_aux section, from a
> >   production kernel v6.9. It includes both partially and fully inlined
> >   functions in the funcsec tables, with all their parameters, either inline
> >   or in their own sub-section. It does not include type information that
> >   would be required to handle fully inlined functions, functions with
> >   conflicting names, and functions with conflicting prototypes.
> > 
> >   The deduplicated locations in 2) are small enough to be indexed by a u16.
> > 
> > Storing the locations inline uses the least amount of space, followed by
> > storing inline a list of offsets to the locations. Neither of these
> > approaches has fixed-size records in funcsec. "param list, w/ dedup" is
> > ~1MB larger than inlined locations, but has fixed-size records.
> > 
> > In all cases, the funcsec table uses the most space, compared to the
> > locations. The size of the `type` sub-section will also grow when we add
> > the missing type information for fully inlined functions, functions with
> > conflicting names, and functions with conflicting prototypes.
> > 
> > With fixed size records in the funcsec table, we'd get faster lookup by
> > sorting by `type_id` or `offset`.  bpftrace could efficiently search the
> > lower bound of a `type_id` to instrument all its inline instances.
> > Symbolication tools could efficiently search for inline functions at a
> > given offset.
> > 
> > However, it would rule out the most efficient encoding.
> > How do we want to approach this tradeoff?
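
The lookup described above for fixed-size records could be sketched
roughly as follows. This assumes encoding (1) and a funcsec table
already sorted by type_id; the helper names (fn_info_lower_bound,
fn_info_count) are hypothetical and not part of the patch set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-size record from encoding (1), as quoted in the thread. */
struct fn_info {
	uint32_t type_id;
	uint32_t offset;
	uint32_t params_offset;
};

/* Index of the first record whose type_id is >= key, or n if none.
 * recs must be sorted by type_id in ascending order. */
static inline size_t fn_info_lower_bound(const struct fn_info *recs,
					 size_t n, uint32_t key)
{
	size_t lo = 0, hi = n;

	assert(recs || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (recs[mid].type_id < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Number of inline instances recorded for one function type. */
static inline size_t fn_info_count(const struct fn_info *recs, size_t n,
				   uint32_t type_id)
{
	size_t i = fn_info_lower_bound(recs, n, type_id);
	size_t count = 0;

	while (i < n && recs[i].type_id == type_id) {
		count++;
		i++;
	}
	return count;
}
```

A tracer would walk the run of records starting at the lower bound to
instrument every inline instance of one type_id. With variable-size
records, as in (2) and (3), the same search would need a separate index
or a linear scan.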

[...]


