From: Eduard Zingerman <eddyz87@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Thierry Treyer <ttreyer@meta.com>
Cc: Alan Maguire <alan.maguire@oracle.com>,
	"dwarves@vger.kernel.org"	 <dwarves@vger.kernel.org>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	 "acme@kernel.org"	 <acme@kernel.org>,
	"ast@kernel.org" <ast@kernel.org>, Yonghong Song <yhs@meta.com>,
	 "andrii@kernel.org"	 <andrii@kernel.org>,
	"ihor.solodrai@linux.dev" <ihor.solodrai@linux.dev>,
	 Song Liu <songliubraving@meta.com>,
	Mykola Lysenko <mykolal@meta.com>, Daniel Xu <dlxu@meta.com>
Subject: Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
Date: Thu, 22 May 2025 13:16:38 -0700
Message-ID: <bfb120452de9d9ce0868485bc41fa8cf56edf4cf.camel@gmail.com>
In-Reply-To: <CAEf4BzZxccvWcGJ06hSnrVh6jJO-gdCLUitc7qNE-2oO8iK+og@mail.gmail.com>

On Thu, 2025-05-22 at 13:03 -0700, Andrii Nakryiko wrote:
> On Thu, May 22, 2025 at 10:56 AM Thierry Treyer <ttreyer@meta.com> wrote:
> > 
> > Hello everyone,
> > 
> > Here are the estimates for the different encoding schemes we discussed:
> > - parameters' location takes ~1MB without de-duplication,
> > - parameters' location shrinks to ~14kB when de-duplicated,
> > - instead of de-duplicating the individual locations,
> >   de-duplicating functions' parameter lists yields 187kB of locations data.
> > 
> > We also need to take into account the size of the corresponding funcsec
> > table, which starts at 3.6MB. The full details follow:
> > 
> >   1) // params_offset points to the first parameter's location
> >      struct fn_info { u32 type_id, offset, params_offset; };
> >   2) // param_offsets point to each parameter's location
> >      struct fn_info { u32 type_id, offset; u16 param_offsets[proto.arglen]; };
> >   3) // locations are stored inline, in the funcsec table
> >      struct fn_info { u32 type_id, offset; loc inline_locs[proto.arglen]; };
> > 
> >   Params encoding             Locations Size   Funcsec Size   Total Size
> >   ======================================================================
> >   (1) param list, no dedup         1,017,654      5,467,824    6,485,478
> >   (1) param list, w/ dedup           187,379      5,467,824    5,655,203
> >   (2) param offsets, w/ dedup         14,526      4,808,838    4,823,364
> 
> This one is almost as good as (3) below, but fits better into the
> existing kind+vlen model where there is a variable number of fixed
> sized elements (but locations can still be variable-sized and keep
> evolving much more easily). I'd go with this one, unless I'm missing
> some important benefit of other representations.

Thierry, could you please provide some details for the representation
of both fn_info and parameters for this case?
I'm curious how far this version is from exhausting the u16 limit.
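
To make the question concrete, here is a rough sketch of what encoding (2)
might look like and how much of the u16 range it would use. The struct name
fn_info_hdr and both helpers are hypothetical, and the sketch assumes that
param_offsets holds byte offsets into the deduplicated locations blob, which
the thread does not state. The 14,526 figure is the deduplicated locations
size from the quoted table.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed header of an encoding (2) record. Field names follow
 * the quoted sketch; the split into header plus trailing array is an
 * assumption made for illustration. */
struct fn_info_hdr {
	uint32_t type_id;	/* BTF type id of the function */
	uint32_t offset;	/* offset of the inlined instance */
	/* followed by uint16_t param_offsets[nr_params] */
};

/* Size in bytes of one variable-length record with nr_params parameters. */
static inline uint32_t fn_info_size(uint16_t nr_params)
{
	return (uint32_t)sizeof(struct fn_info_hdr) + 2u * nr_params;
}

/* Percentage of the u16 offset space consumed by a locations blob of
 * locs_bytes bytes, assuming offsets are plain byte offsets. */
static inline unsigned int u16_space_used_pct(uint32_t locs_bytes)
{
	assert(locs_bytes <= UINT16_MAX + 1u);
	return (locs_bytes * 100u) / (UINT16_MAX + 1u);
}
```

Under that assumption, 14,526 bytes is about 22% of the 65,536 addressable
bytes, so the locations data could grow roughly 4.5x before a u16 byte
offset overflows. If the offsets were instead indices of aligned units, the
headroom would be larger.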

> 
> >   (3) param list inline            1,017,654      3,645,216    4,662,870
> > 
> >   Estimated size in bytes of the new .BTF.func_aux section, from a
> >   production kernel v6.9. It includes both partially and fully inlined
> >   functions in the funcsec tables, with all their parameters, either inline
> >   or in their own sub-section. It does not include type information that
> >   would be required to handle fully inlined functions, functions with
> >   conflicting names, and functions with conflicting prototypes.
> > 
> >   The deduplicated locations in 2) are small enough to be indexed by a u16.
> > 
> > Storing the locations inline uses the least amount of space, followed by
> > storing inline a list of offsets to the locations. Neither of these
> > approaches has fixed size records in funcsec. "param list, w/ dedup" is
> > ~1MB larger than inlined locations, but has fixed size records.
> > 
> > In all cases, the funcsec table uses the most space, compared to the
> > locations. The size of the `type` sub-section will also grow when we add
> > the missing type information for fully inlined functions, functions with
> > conflicting names, and functions with conflicting prototypes.
> > 
> > With fixed size records in the funcsec table, we'd get faster lookup by
> > sorting by `type_id` or `offset`.  bpftrace could efficiently search the
> > lower bound of a `type_id` to instrument all its inline instances.
> > Symbolication tools could efficiently search for inline functions at a
> > given offset.
> > 
> > However, it would rule out the most efficient encoding.
> > How do we want to approach this tradeoff?

[...]
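
For reference, the lookup that fixed size records enable, as described in
the quoted text, could be sketched as a lower-bound binary search over a
funcsec table sorted by type_id. The struct mirrors encoding (1) from the
quote; the function name fn_info_lower_bound and the assumption that the
table is an in-memory array sorted by type_id are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed size record, as in encoding (1) of the quoted message. */
struct fn_info {
	uint32_t type_id;
	uint32_t offset;
	uint32_t params_offset;	/* points to the first parameter's location */
};

/* Return the index of the first record whose type_id is >= the requested
 * one, or n if there is none. recs must be sorted by type_id. */
static size_t fn_info_lower_bound(const struct fn_info *recs, size_t n,
				  uint32_t type_id)
{
	size_t lo = 0, hi = n;

	assert(recs != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (recs[mid].type_id < type_id)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}
```

A caller would start at the returned index and walk forward while type_id
still matches, visiting every inline instance of that function. The same
search keyed on offset would serve the symbolication case. With variable
sized records, as in (2) and (3), this would need a separate index or a
linear scan.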


