[PATCH RFC 0/3] list inline expansions in .BTF.inline

bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH RFC 0/3] list inline expansions in .BTF.inline
@ 2025-04-16 19:20 Thierry Treyer via B4 Relay
  2025-04-16 19:20 ` [PATCH RFC 1/3] dwarf_loader: Add parameters list to inlined expansion Thierry Treyer via B4 Relay
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Thierry Treyer via B4 Relay @ 2025-04-16 19:20 UTC (permalink / raw)
  To: dwarves, bpf
  Cc: Thierry Treyer, acme, ast, yhs, andrii, ihor.solodrai,
	songliubraving, alan.maguire, mykolal

This proposal extends BTF to list the locations of inlined functions and
their arguments in a new '.BTF.inline` section.

== Background ==

Inline functions are often a blind spot for profiling and tracing tools:
* They cannot probe fully inlined functions.
  The BTF contains no data about them.
* They miss calls to partially inlined functions,
  where a function has a symbol, but is also inlined in some callers.
* They cannot account for time spent in inlined calls.
  Instead, they report the time to the caller.
* They don't provide a way to access the arguments of an inlined call.

The issue is exacerbated by Link-Time Optimization, which enables more
inlining across Object files. One workaround is to disable inlining for
the profiled functions, but that requires a whole kernel compilation and
doesn't allow for iterative exploration.

The information required to solve the above problems is not easily
accessible. It requires parsing most of the DWARF's '.debug_info` section,
which is time consuming and not trivial.
Instead, this proposal leverages and extends the existing information
contained in '.BTF` (for typing) and '.BTF.ext` (for caller location),
with information from a new section called '.BTF.inline`,
listing inlined instances.

== .BTF.inline Section ==

The new '.BTF.inline` section has a layout similar to '.BTF`.

 off |0-bit      |16-bits  |24-bits  |32-bits                           |
-----+-----------+---------+---------+----------------------------------+
0x00 |   magic   | version |  flags  |          header length           |
0x08 |      inline info offset       |        inline info length        |
0x10 |        location offset        |          location length         |
-----+------------------------------------------------------------------+
     ~                        inline info section                       ~
-----+------------------------------------------------------------------+
     ~                          location section                        ~
-----+------------------------------------------------------------------+

It starts with a header (see 'struct btf_inline_header`),
followed by two subsections:
1. The 'Inline Info' section contains an entry for each inlined function.
   Each entry describes the instance's location in its caller and is
   followed by the offsets in the 'Location' section of the parameters
   location expressions. See 'struct btf_inline_instance`.
2. The 'Location' section contains location expressions describing how
   to retrieve the value of a parameter. The expressions are NULL-
   terminated and are adressed similarly to '.BTF`'s string table.

struct btf_inline_header {
  uint16_t magic;
  uint8_t version, flags;
  uint32_t header_length;
  uint32_t inline_info_offset, inline_info_length;
  uint32_t location_offset, location_length;
};

struct btf_inline_instance {
  type_id_t callee_id;     // BTF id of the inlined function
  type_id_t caller_id;     // BTF id of the caller
  uint32_t caller_offset;  // offset of the callee within the caller
  uint16_t nr_parms;       // number of parameters
//uint32_t parm_location[nr_parms];  // offset of the location expression
};                                   // in 'Location' for each parameter

== Location Expressions ==

We looked at the DWARF location expressions for the arguments of inlined
instances having <= 100 instances, on a production kernel v6.9.0. This
yielded 176,800 instances with 269,327 arguments. We learned that most
expressions are simple register access, perhaps with an offset. We would
get access to 87% of the arguments by implementing literal and register.

Op. Category      Expr. Count    Expr. %
----------------------------------------
literal                 10714      3.98%
register+above         234698     87.14%
arithmetic+above       239444     88.90%
composite+above        240394     89.26%
stack+above            242075     89.88%
empty                   27252     10.12%

We propose to re-encode DWARF location expressions into a custom BTF
location expression format. It operates on a stack of values, similar to
DWARF's location expressions, but is stripped of unused operators,
while allowing future expansions.

A location expression is composed of a series of operations, terminated
by a NULL-byte/LOC_END_OF_EXPR operator. The very first expression in the
'Location' subsection must be an empty expression constisting only of
LOC_END_OF_EXPR.

An operator is a tagged union: the tag describes the operation to carry
out and the union contains the operands.

 ID | Operator Name        | Operands[...]
----+----------------------+-------------------------------------------
  0 | LOC_END_OF_EXPR      | _none_
  1 | LOC_SIGNED_CONST_1   |  s8: constant's value
  2 | LOC_SIGNED_CONST_2   | s16: constant's value
  3 | LOC_SIGNED_CONST_4   | s32: constant's value
  4 | LOC_SIGNED_CONST_8   | s64: constant's value
  5 | LOC_UNSIGNED_CONST_1 |  u8: constant's value
  6 | LOC_UNSIGNED_CONST_2 | u16: constant's value
  7 | LOC_UNSIGNED_CONST_4 | u32: constant's value
  8 | LOC_UNSIGNED_CONST_8 | u64: constant's value
  9 | LOC_REGISTER         |  u8: DWARF register number from the ABI
 10 | LOC_REGISTER_OFFSET  |  u8: DWARF register number from the ABI
                           | s64: offset added to the register's value
 11 | LOC_DEREF            |  u8: size of the deref'd type

This list should be further expanded to include arithmetic operations.

Example: accessing a field at offset 12B from a struct whose adresse is
         in the '%rdi` register, on amd64, has the following encoding:

[0x0a 0x05 0x000000000000000c] [0x0b 0x04] [0x00]
 |    |    ` Offset Added       |    |      ` LOC_END_OF_EXPR
 |    ` Register Number         |    ` Size of Deref.
 ` LOC_REGISTER_OFFSET          ` LOC_DEREF

== Summary ==

Combining the new information from '.BTF.inline` with the existing data
from '.BTF` and '.BTF.ext`, tools will be able to locate inline functions
and their arguments. Symbolizer can also use the data to display the
functions inlined at a given address.

Fully inlined functions are not part of the BTF and thus are not covered
by this proposal. Adding them to the BTF would enable their coverage and
should be considered.

Signed-off-by: Thierry Treyer <ttreyer@meta.com>
---
Thierry Treyer (3):
      dwarf_loader: Add parameters list to inlined expansion
      dwarf_loader: Add name to inline expansion
      inline_encoder: Introduce inline encoder to emit BTF.inline

 CMakeLists.txt   |   3 +-
 btf_encoder.c    |   5 +
 btf_encoder.h    |   2 +
 btf_inline.pk    |  55 ++++++
 dwarf_loader.c   | 176 ++++++++++++--------
 dwarves.c        |  26 +++
 dwarves.h        |   7 +
 inline_encoder.c | 496 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 inline_encoder.h |  25 +++
 pahole.c         |  40 ++++-
 10 files changed, 765 insertions(+), 70 deletions(-)
---
base-commit: 4ef47f84324e925051a55de10f9a4f44ef1da844
change-id: 20250416-btf_inline-e5047eea9b6f

Best regards,
-- 
Thierry Treyer <ttreyer@meta.com>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH RFC 1/3] dwarf_loader: Add parameters list to inlined expansion
  2025-04-16 19:20 [PATCH RFC 0/3] list inline expansions in .BTF.inline Thierry Treyer via B4 Relay
@ 2025-04-16 19:20 ` Thierry Treyer via B4 Relay
  2025-04-16 19:20 ` [PATCH RFC 2/3] dwarf_loader: Add name to inline expansion Thierry Treyer via B4 Relay
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Thierry Treyer via B4 Relay @ 2025-04-16 19:20 UTC (permalink / raw)
  To: dwarves, bpf
  Cc: Thierry Treyer, acme, ast, yhs, andrii, ihor.solodrai,
	songliubraving, alan.maguire, mykolal

From: Thierry Treyer <ttreyer@meta.com>

The parameters of inlined expansions are stored in the lexblock tag
list, making it difficult to iterate over the parameters of a given
inlined expansion.

Add a list of parameters to 'struct inline_expansion', so the parameters
are stored in their expansion instead of the lexblock.

Add location to 'struct parameter', so their location expression can be
referenced when consuming the DWARF.
Parameters with a constant value have their value stored in the location
too: 'location.expr' is set to NULL and 'location.exprlen' is set to the
constant's value.

Signed-off-by: Thierry Treyer <ttreyer@meta.com>
---
 dwarf_loader.c | 167 ++++++++++++++++++++++++++++++++++-----------------------
 dwarves.c      |  26 +++++++++
 dwarves.h      |   6 +++
 3 files changed, 132 insertions(+), 67 deletions(-)

diff --git a/dwarf_loader.c b/dwarf_loader.c
index 84122d04e42cab53c5598ae62b750a43e187e11d..24ac9afceb3793c165d3e92cfdfaf27ab67fd4d6 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -1297,6 +1297,8 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 		parm->has_loc = dwarf_attr(die, DW_AT_location, &attr) != NULL;
 
 		if (parm->has_loc) {
+			attr_location(die, &parm->location.expr, &parm->location.exprlen);
+
 			int expected_reg = cu->register_params[param_idx];
 			int actual_reg = parameter__reg(&attr, expected_reg);
 
@@ -1312,6 +1314,8 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 				parm->unexpected_reg = 1;
 		} else if (has_const_value) {
 			parm->optimized = 1;
+			parm->location.expr = NULL;
+			parm->location.exprlen = attr_numeric(die, DW_AT_const_value);
 		}
 	}
 
@@ -1374,6 +1378,8 @@ static struct inline_expansion *inline_expansion__new(Dwarf_Die *die, struct cu
 		dwarf_tag__set_attr_type(dtag, type, die, DW_AT_abstract_origin);
 		exp->ip.addr = 0;
 		exp->high_pc = 0;
+		exp->nr_parms = 0;
+		INIT_LIST_HEAD(&exp->parms);
 
 		if (!cu->has_addr_info)
 			goto out;
@@ -1794,6 +1800,7 @@ static struct tag *die__create_new_string_type(Dwarf_Die *die, struct cu *cu)
 static struct tag *die__create_new_parameter(Dwarf_Die *die,
 					     struct ftype *ftype,
 					     struct lexblock *lexblock,
+					     struct inline_expansion *exp,
 					     struct cu *cu, struct conf_load *conf,
 					     int param_idx)
 {
@@ -1808,15 +1815,16 @@ static struct tag *die__create_new_parameter(Dwarf_Die *die,
 			if (add_child_llvm_annotations(die, param_idx, conf, &(tag__function(&ftype->tag)->annots)))
 				return NULL;
 		}
+	} else if (exp != NULL) {
+		/*
+		 * Inline expansion stores the parameters in a list to emit
+		 * .BTF_inline parameter location.
+		 */
+		inline_expansion__add_parameter(exp, parm);
 	} else {
 		/*
-		 * DW_TAG_formal_parameters on a non DW_TAG_subprogram nor
-		 * DW_TAG_subroutine_type tag happens sometimes, likely due to
-		 * compiler optimizing away a inline expansion (at least this
-		 * was observed in some cases, such as in the Linux kernel
-		 * current_kernel_time function circa 2.6.20-rc5), keep it in
-		 * the lexblock tag list because it can be referenced as an
-		 * DW_AT_abstract_origin in another DW_TAG_formal_parameter.
+		 * Keep it in the lexblock tag list because it can be referenced
+		 * as an DW_AT_abstract_origin in another DW_TAG_formal_parameter.
 		*/
 		lexblock__add_tag(lexblock, &parm->tag);
 	}
@@ -1884,7 +1892,7 @@ static struct tag *die__create_new_subroutine_type(Dwarf_Die *die,
 			tag__print_not_supported(die);
 			continue;
 		case DW_TAG_formal_parameter:
-			tag = die__create_new_parameter(die, ftype, NULL, cu, conf, -1);
+			tag = die__create_new_parameter(die, ftype, NULL, NULL, cu, conf, -1);
 			break;
 		case DW_TAG_unspecified_parameters:
 			ftype->unspec_parms = 1;
@@ -2136,10 +2144,13 @@ static struct tag *die__create_new_inline_expansion(Dwarf_Die *die,
 						    struct lexblock *lexblock,
 						    struct cu *cu, struct conf_load *conf);
 
-static int die__process_inline_expansion(Dwarf_Die *die, struct lexblock *lexblock, struct cu *cu, struct conf_load *conf)
+static int die__process_inline_expansion(Dwarf_Die *die,
+					  struct inline_expansion *exp, struct lexblock *lexblock,
+					  struct cu *cu, struct conf_load *conf)
 {
 	Dwarf_Die child;
 	struct tag *tag;
+	int parm_idx = 0;
 
 	if (!dwarf_haschildren(die) || dwarf_child(die, &child) != 0)
 		return 0;
@@ -2179,6 +2190,7 @@ static int die__process_inline_expansion(Dwarf_Die *die, struct lexblock *lexblo
 			 *
 			 * cu__tag_not_handled(cu, die);
 			 */
+			tag = die__create_new_parameter(die, NULL, lexblock, exp, cu, conf, parm_idx++);
 			continue;
 		case DW_TAG_inlined_subroutine:
 			tag = die__create_new_inline_expansion(die, lexblock, cu, conf);
@@ -2230,7 +2242,7 @@ static struct tag *die__create_new_inline_expansion(Dwarf_Die *die,
 	if (exp == NULL)
 		return NULL;
 
-	if (die__process_inline_expansion(die, lexblock, cu, conf) != 0) {
+	if (die__process_inline_expansion(die, exp, lexblock, cu, conf) != 0) {
 		tag__free(&exp->ip.tag, cu);
 		return NULL;
 	}
@@ -2315,7 +2327,7 @@ static int die__process_function(Dwarf_Die *die, struct ftype *ftype,
 			continue;
 		}
 		case DW_TAG_formal_parameter:
-			tag = die__create_new_parameter(die, ftype, lexblock, cu, conf, param_idx++);
+			tag = die__create_new_parameter(die, ftype, lexblock, NULL, cu, conf, param_idx++);
 			break;
 		case DW_TAG_variable:
 			tag = die__create_new_variable(die, cu, conf, 0);
@@ -2607,54 +2619,87 @@ static void __tag__print_abstract_origin_not_found(struct tag *tag,
 #define tag__print_abstract_origin_not_found(tag) \
 	__tag__print_abstract_origin_not_found(tag, __func__, __LINE__)
 
+static void parameter__recode_dwarf_type(struct parameter *parm, struct cu *cu)
+{
+	struct dwarf_cu *dcu = cu->priv;
+	struct dwarf_tag *dparm = tag__dwarf(&parm->tag);
+
+	if (dparm->type == 0) {
+		if (dparm->abstract_origin == 0) {
+			/* Function without parameters */
+			parm->tag.type = 0;
+			return;
+		}
+		// FIXME: Should this use find_type_by_ref instead?
+		struct dwarf_tag *dtype = dwarf_cu__find_tag_by_ref(dcu, dparm, abstract_origin);
+		if (dtype == NULL) {
+			tag__print_abstract_origin_not_found(&parm->tag);
+			return;
+		}
+		struct parameter *oparm = tag__parameter(dtag__tag(dtype));
+		parm->name = oparm->name;
+		parm->tag.type = dtag__tag(dtype)->type;
+		/* Share location information between parameter and abstract origin;
+		 * if neither have location, we will mark the parameter as optimized out.
+		 * Also share info regarding unexpected register use for parameters.
+		 */
+		if (parm->has_loc)
+			oparm->has_loc = parm->has_loc;
+
+		if (parm->optimized)
+			oparm->optimized = parm->optimized;
+		if (parm->unexpected_reg)
+			oparm->unexpected_reg = parm->unexpected_reg;
+		return;
+	}
+
+	struct dwarf_tag *dtype = dwarf_cu__find_type_by_ref(dcu, dparm, type);
+	if (dtype == NULL) {
+		tag__print_type_not_found(&parm->tag);
+		return;
+	}
+	parm->tag.type = dtype->small_id;
+}
+
 static void ftype__recode_dwarf_types(struct tag *tag, struct cu *cu)
 {
 	struct parameter *pos;
-	struct dwarf_cu *dcu = cu->priv;
 	struct ftype *type = tag__ftype(tag);
 
-	ftype__for_each_parameter(type, pos) {
-		struct dwarf_tag *dpos = tag__dwarf(&pos->tag);
-		struct parameter *opos;
-		struct dwarf_tag *dtype;
-
-		if (dpos->type == 0) {
-			if (dpos->abstract_origin == 0) {
-				/* Function without parameters */
-				pos->tag.type = 0;
-				continue;
-			}
-			dtype = dwarf_cu__find_tag_by_ref(dcu, dpos, abstract_origin);
-			if (dtype == NULL) {
-				tag__print_abstract_origin_not_found(&pos->tag);
-				continue;
-			}
-			opos = tag__parameter(dtag__tag(dtype));
-			pos->name = opos->name;
-			pos->tag.type = dtag__tag(dtype)->type;
-			/* share location information between parameter and
-			 * abstract origin; if neither have location, we will
-			 * mark the parameter as optimized out.  Also share
-			 * info regarding unexpected register use for
-			 * parameters.
-			 */
-			if (pos->has_loc)
-				opos->has_loc = pos->has_loc;
+	ftype__for_each_parameter(type, pos)
+		parameter__recode_dwarf_type(pos, cu);
+}
 
-			if (pos->optimized)
-				opos->optimized = pos->optimized;
-			if (pos->unexpected_reg)
-				opos->unexpected_reg = pos->unexpected_reg;
-			continue;
-		}
+static void inline_expansion__recode_dwarf_types(struct tag *tag, struct cu *cu)
+{
+	struct dwarf_cu *dcu = cu->priv;
+	struct dwarf_tag *dtag = tag__dwarf(tag);
 
-		dtype = dwarf_cu__find_type_by_ref(dcu, dpos, type);
-		if (dtype == NULL) {
-			tag__print_type_not_found(&pos->tag);
-			continue;
-		}
-		pos->tag.type = dtype->small_id;
+	/* DW_TAG_inlined_subroutine is an special case as dwarf_tag->id is
+	 * in fact an abtract origin, i.e. must be looked up in the tags_table,
+	 * not in the types_table.
+	 */
+	struct dwarf_tag *ftype = NULL;
+	if (dtag->type != 0)
+		ftype = dwarf_cu__find_tag_by_ref(dcu, dtag, type);
+	else
+		ftype = dwarf_cu__find_tag_by_ref(dcu, dtag, abstract_origin);
+	if (ftype == NULL) {
+		if (dtag->type != 0)
+			tag__print_type_not_found(tag);
+		else
+			tag__print_abstract_origin_not_found(tag);
+		return;
 	}
+
+	// TODO: Is ftype a prototype or a function?
+	ftype__recode_dwarf_types(dtag__tag(ftype), cu);
+
+	struct tag *pos;
+	struct inline_expansion *exp = tag__inline_expansion(tag);
+	list_for_each_entry(pos, &exp->parms, node)
+		parameter__recode_dwarf_type(tag__parameter(pos), cu);
+	exp->ip.tag.type = ftype->small_id;
 }
 
 static void lexblock__recode_dwarf_types(struct lexblock *tag, struct cu *cu)
@@ -2671,18 +2716,7 @@ static void lexblock__recode_dwarf_types(struct lexblock *tag, struct cu *cu)
 			lexblock__recode_dwarf_types(tag__lexblock(pos), cu);
 			continue;
 		case DW_TAG_inlined_subroutine:
-			if (dpos->type != 0)
-				dtype = dwarf_cu__find_tag_by_ref(dcu, dpos, type);
-			else
-				dtype = dwarf_cu__find_tag_by_ref(dcu, dpos, abstract_origin);
-			if (dtype == NULL) {
-				if (dpos->type != 0)
-					tag__print_type_not_found(pos);
-				else
-					tag__print_abstract_origin_not_found(pos);
-				continue;
-			}
-			ftype__recode_dwarf_types(dtag__tag(dtype), cu);
+			inline_expansion__recode_dwarf_types(pos, cu);
 			continue;
 
 		case DW_TAG_formal_parameter:
@@ -2862,12 +2896,11 @@ static int tag__recode_dwarf_type(struct tag *tag, struct cu *cu)
 
 	case DW_TAG_namespace:
 		return namespace__recode_dwarf_types(tag, cu);
-	/* Damn, DW_TAG_inlined_subroutine is an special case
-           as dwarf_tag->id is in fact an abtract origin, i.e. must be
-	   looked up in the tags_table, not in the types_table.
-	   The others also point to routines, so are in tags_table */
 	case DW_TAG_inlined_subroutine:
+		inline_expansion__recode_dwarf_types(tag, cu);
+		return 0;
 	case DW_TAG_imported_module:
+	  /* Modules point to routines, so are in tags_table */
 		dtype = dwarf_cu__find_tag_by_ref(cu->priv, dtag, type);
 		goto check_type;
 	/* Can be for both types and non types */
diff --git a/dwarves.c b/dwarves.c
index ef93239d26827711e23b405dd113986867d18507..2232ee713d57a92a700be56be7b7e1128741e24a 100644
--- a/dwarves.c
+++ b/dwarves.c
@@ -203,6 +203,20 @@ void formal_parameter_pack__delete(struct formal_parameter_pack *pack, struct cu
 	cu__tag_free(cu, &pack->tag);
 }
 
+static void parameter__delete(struct parameter *parm, struct cu *cu);
+
+static void inline_expansion__delete(struct inline_expansion *exp, struct cu *cu)
+{
+	if (exp == NULL)
+		return;
+
+	struct tag *parm, *n;
+	list_for_each_entry_safe_reverse(parm, n, &exp->parms, node) {
+		list_del_init(&parm->node);
+		parameter__delete(tag__parameter(parm), cu);
+	}
+}
+
 void tag__delete(struct tag *tag, struct cu *cu)
 {
 	if (tag == NULL)
@@ -225,6 +239,8 @@ void tag__delete(struct tag *tag, struct cu *cu)
 		ftype__delete(tag__ftype(tag), cu);		break;
 	case DW_TAG_subprogram:
 		function__delete(tag__function(tag), cu); break;
+	case DW_TAG_inlined_subroutine:
+		inline_expansion__delete(tag__inline_expansion(tag), cu); break;
 	case DW_TAG_lexical_block:
 		lexblock__delete(tag__lexblock(tag), cu); break;
 	case DW_TAG_GNU_template_parameter_pack:
@@ -1514,6 +1530,12 @@ void formal_parameter_pack__add(struct formal_parameter_pack *pack, struct param
 	list_add_tail(&param->tag.node, &pack->params);
 }
 
+void inline_expansion__add_parameter(struct inline_expansion *exp, struct parameter *parm)
+{
+	++exp->nr_parms;
+	list_add_tail(&parm->tag.node, &exp->parms);
+}
+
 void lexblock__add_tag(struct lexblock *block, struct tag *tag)
 {
 	list_add_tail(&tag->node, &block->tags);
@@ -2116,6 +2138,10 @@ static int list__for_all_tags(struct list_head *list, struct cu *cu,
 			if (list__for_all_tags(&tag__lexblock(pos)->tags,
 					       cu, iterator, cookie))
 				return 1;
+		} else if (pos->tag == DW_TAG_inlined_subroutine) {
+			if (list__for_all_tags(&tag__inline_expansion(pos)->parms,
+					       cu, iterator, cookie))
+				return 1;
 		}
 
 		if (iterator(pos, cu, cookie))
diff --git a/dwarves.h b/dwarves.h
index 8234e1a09128c667a2a516dccb6d7c6a194f8c5b..38efd6a6e2b0b0e5a571a8d12bbec33502509d8b 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -811,6 +811,8 @@ struct inline_expansion {
 	struct ip_tag	 ip;
 	size_t		 size;
 	uint64_t	 high_pc;
+	struct list_head parms;
+	uint16_t   nr_parms;
 };
 
 static inline struct inline_expansion *
@@ -819,6 +821,9 @@ static inline struct inline_expansion *
 	return (struct inline_expansion *)tag;
 }
 
+struct parameter;
+void inline_expansion__add_parameter(struct inline_expansion *exp, struct parameter *parm);
+
 struct label {
 	struct ip_tag	 ip;
 	const char	 *name;
@@ -929,6 +934,7 @@ size_t lexblock__fprintf(const struct lexblock *lexblock, const struct cu *cu,
 struct parameter {
 	struct tag tag;
 	const char *name;
+	struct location location;
 	uint8_t optimized:1;
 	uint8_t unexpected_reg:1;
 	uint8_t has_loc:1;

-- 
2.47.1



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 2/3] dwarf_loader: Add name to inline expansion
  2025-04-16 19:20 [PATCH RFC 0/3] list inline expansions in .BTF.inline Thierry Treyer via B4 Relay
  2025-04-16 19:20 ` [PATCH RFC 1/3] dwarf_loader: Add parameters list to inlined expansion Thierry Treyer via B4 Relay
@ 2025-04-16 19:20 ` Thierry Treyer via B4 Relay
  2025-04-16 19:20 ` [PATCH RFC 3/3] inline_encoder: Introduce inline encoder to emit BTF.inline Thierry Treyer via B4 Relay
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Thierry Treyer via B4 Relay @ 2025-04-16 19:20 UTC (permalink / raw)
  To: dwarves, bpf
  Cc: Thierry Treyer, acme, ast, yhs, andrii, ihor.solodrai,
	songliubraving, alan.maguire, mykolal

From: Thierry Treyer <ttreyer@meta.com>

Add the name field to inline expansion. It contains the name of the
abstract origin. The name can be used to locate the origin function in
the BTF.

Surely there's a better way to find the btf_id of the original function?

Signed-off-by: Thierry Treyer <ttreyer@meta.com>
---
 dwarf_loader.c | 9 +++++++++
 dwarves.h      | 1 +
 2 files changed, 10 insertions(+)

diff --git a/dwarf_loader.c b/dwarf_loader.c
index 24ac9afceb3793c165d3e92cfdfaf27ab67fd4d6..d7a130d138a65bad71a563cf3d629bdd7b9e6c6f 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -1376,6 +1376,15 @@ static struct inline_expansion *inline_expansion__new(Dwarf_Die *die, struct cu
 		dtag->decl_file = attr_string(die, DW_AT_call_file, conf);
 		dtag->decl_line = attr_numeric(die, DW_AT_call_line);
 		dwarf_tag__set_attr_type(dtag, type, die, DW_AT_abstract_origin);
+
+		Dwarf_Attribute attr_orig;
+		if (dwarf_attr(die, DW_AT_abstract_origin, &attr_orig)) {
+			Dwarf_Die die_orig;
+			if (dwarf_formref_die(&attr_orig, &die_orig)) {
+				exp->name = attr_string(&die_orig, DW_AT_name, conf);
+			}
+		}
+
 		exp->ip.addr = 0;
 		exp->high_pc = 0;
 		exp->nr_parms = 0;
diff --git a/dwarves.h b/dwarves.h
index 38efd6a6e2b0b0e5a571a8d12bbec33502509d8b..13b19433fe4aa7cd54ec0160c7008a33cbc17e24 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -809,6 +809,7 @@ struct ip_tag {
 
 struct inline_expansion {
 	struct ip_tag	 ip;
+	const char *name;
 	size_t		 size;
 	uint64_t	 high_pc;
 	struct list_head parms;

-- 
2.47.1



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 3/3] inline_encoder: Introduce inline encoder to emit BTF.inline
  2025-04-16 19:20 [PATCH RFC 0/3] list inline expansions in .BTF.inline Thierry Treyer via B4 Relay
  2025-04-16 19:20 ` [PATCH RFC 1/3] dwarf_loader: Add parameters list to inlined expansion Thierry Treyer via B4 Relay
  2025-04-16 19:20 ` [PATCH RFC 2/3] dwarf_loader: Add name to inline expansion Thierry Treyer via B4 Relay
@ 2025-04-16 19:20 ` Thierry Treyer via B4 Relay
  2025-04-25 18:40 ` [PATCH RFC 0/3] list inline expansions in .BTF.inline Daniel Xu
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Thierry Treyer via B4 Relay @ 2025-04-16 19:20 UTC (permalink / raw)
  To: dwarves, bpf
  Cc: Thierry Treyer, acme, ast, yhs, andrii, ihor.solodrai,
	songliubraving, alan.maguire, mykolal

From: Thierry Treyer <ttreyer@meta.com>

Signed-off-by: Thierry Treyer <ttreyer@meta.com>
---
 CMakeLists.txt   |   3 +-
 btf_encoder.c    |   5 +
 btf_encoder.h    |   2 +
 btf_inline.pk    |  55 ++++++
 inline_encoder.c | 496 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 inline_encoder.h |  25 +++
 pahole.c         |  40 ++++-
 7 files changed, 623 insertions(+), 3 deletions(-)

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 7ca17a6789a9e71512ffb6fe5bf2683d643209e5..38302d5b6a6ded5094ebecd83fec9eca16230be2 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -109,7 +109,8 @@ endif()
 
 set(dwarves_LIB_SRCS dwarves.c dwarves_fprintf.c gobuffer.c
 		     ctf_loader.c libctf.c btf_encoder.c btf_loader.c
-		     dwarf_loader.c dutil.c elf_symtab.c rbtree.c)
+		     inline_encoder.c dwarf_loader.c dutil.c
+		     elf_symtab.c rbtree.c)
 if (NOT LIBBPF_FOUND)
 	list(APPEND dwarves_LIB_SRCS $<TARGET_OBJECTS:bpf>)
 endif()
diff --git a/btf_encoder.c b/btf_encoder.c
index 511c1ea5ee6bee67dbaec89c3d8e415af5c8e674..ead62e9034140c257e7703f58270aba19f422582 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -2682,3 +2682,8 @@ out:
 	encoder->cu = NULL;
 	return err;
 }
+
+struct btf *btf_encoder__btf(struct btf_encoder *encoder)
+{
+	return encoder->btf;
+}
diff --git a/btf_encoder.h b/btf_encoder.h
index 0f345e20c2d3065511d46d7377403a4a5fbdfdef..42d21b1cbfe71fd6ee54f41a8ae153197b3021c6 100644
--- a/btf_encoder.h
+++ b/btf_encoder.h
@@ -29,4 +29,6 @@ void btf_encoder__delete(struct btf_encoder *encoder);
 int btf_encoder__encode(struct btf_encoder *encoder, struct conf_load *conf);
 int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct conf_load *conf_load);
 
+struct btf *btf_encoder__btf(struct btf_encoder *encoder);
+
 #endif /* _BTF_ENCODER_H_ */
diff --git a/btf_inline.pk b/btf_inline.pk
new file mode 100755
index 0000000000000000000000000000000000000000..b7b984f189c9d0e6f82dc993005164e0d3609333
--- /dev/null
+++ b/btf_inline.pk
@@ -0,0 +1,55 @@
+#!/usr/bin/poke -L
+!#
+
+type II_location_op = struct {
+  uint<8> ty;
+  offset<uint<8>, B> size;
+  union {
+    byte[0] nil: ty == 0;
+    int<8> signed_const_1: ty == 1;
+    int<16> signed_const_2: ty == 2;
+    int<32> signed_const_4: ty == 3;
+    int<64> signed_const_8: ty == 4;
+    uint<8> unsigned_const_1: ty == 5;
+    uint<16> unsigned_const_2: ty == 6;
+    uint<32> unsigned_const_4: ty == 7;
+    uint<64> unsigned_const_8: ty == 8;
+    struct {
+      uint<8> register;
+      uint<64> offset;
+    } register: ty == 9;
+  } value;
+};
+
+type II_inline_instance = struct {
+  uint<64> insn_offset;
+  uint<32> type_id;
+  uint<16> param_count;
+  offset<uint<32>, B>[param_count] param_offsets;
+};
+
+type II_Header = struct {
+  uint<16> magic == 0xeb9f;
+  uint<8> version;
+  uint<8> flags;
+  uint<32> header_size;
+  offset<uint<32>, B> inline_info_offset;
+  offset<uint<32>, B> inline_info_size;
+  offset<uint<32>, B> location_offset;
+  offset<uint<32>, B> location_size;
+};
+
+type II = struct {
+  II_Header hdr;
+  II_inline_instance[hdr.inline_info_size] inline_instances @ hdr.inline_info_offset;
+  II_location_op[hdr.location_size] locations @ hdr.location_offset;
+};
+
+if (!poke_interactive_p) {
+  var inline_file = open("/tmp/inline_expansions.btf", IOS_M_RDONLY);
+  var inline_info = II @ inline_file : 0#B;
+
+  for (exp in inline_info.inline_instances) {
+    printf ("{low_pc => '0x%u64x',type_id => %u32d}\n", exp.insn_offset, exp.type_id);
+  }
+}
diff --git a/inline_encoder.c b/inline_encoder.c
new file mode 100644
index 0000000000000000000000000000000000000000..b4a681bd996088109ddf0a8167690ce83026324b
--- /dev/null
+++ b/inline_encoder.c
@@ -0,0 +1,496 @@
+/*
+  SPDX-License-Identifier: GPL-2.0-only
+
+  Copyright (C) 2025 Facebook
+
+  Derived from ctf_encoder.c, which is:
+
+  Copyright (C) Arnaldo Carvalho de Melo <acme@redhat.com>
+  Copyright (C) Red Hat Inc
+ */
+#include <bpf/btf.h>
+
+#include "dutil.h"
+#include "dwarves.h"
+#include "inline_encoder.h"
+#include "list.h"
+
+#include <assert.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <linux/limits.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <pthread.h>
+
+enum loc_type {
+	LOC_END_OF_EXPR,
+	LOC_SIGNED_CONST_1,
+	LOC_SIGNED_CONST_2,
+	LOC_SIGNED_CONST_4,
+	LOC_SIGNED_CONST_8,
+	LOC_UNSIGNED_CONST_1,
+	LOC_UNSIGNED_CONST_2,
+	LOC_UNSIGNED_CONST_4,
+	LOC_UNSIGNED_CONST_8,
+	LOC_REGISTER,
+} __attribute__((packed));
+
+struct loc {
+	enum loc_type type;
+	uint8_t size;
+};
+
+struct inline_parameter {
+	struct loc *location[16];
+};
+
+struct inline_instance {
+	struct list_head node;
+	uint64_t die_offset;
+	const char *name;
+	uint64_t insn_offset;
+	type_id_t type_id;
+	uint16_t param_count;
+	struct inline_parameter parameters[];
+};
+
+struct inline_encoder {
+	struct btf *btf;
+	struct cu *cu;
+	const char *source_filename;
+	const char *filename;
+	size_t inline_instance_cnt;
+
+	struct list_head inline_instances;
+};
+
+struct loc_node {
+	struct rb_node rb_node;
+	struct list_head node;
+	struct loc loc;
+};
+
+struct inline_encoder__header {
+	uint16_t magic;
+	uint8_t version;
+	uint8_t flags;
+	uint32_t header_size;
+	uint32_t inline_info_offset;
+	uint32_t inline_info_size;
+	uint32_t location_offset;
+	uint32_t location_size;
+};
+
+struct inline_encoder *inline_encoder__new(struct cu *cu, const char *detached_filename, struct btf *base_btf, bool verbose, struct conf_load *conf_load)
+{
+	struct inline_encoder *encoder = zalloc(sizeof(*encoder));
+
+	if (encoder) {
+		encoder->cu = cu;
+		encoder->source_filename = strdup(cu->filename);
+		encoder->filename = strdup(detached_filename ?: cu->filename);
+		encoder->btf = base_btf;
+		encoder->inline_instance_cnt = 0;
+
+		INIT_LIST_HEAD(&encoder->inline_instances);
+	}
+
+	return encoder;
+}
+
+void inline_encoder__delete(struct inline_encoder *encoder)
+{
+	if (encoder == NULL)
+		return;
+
+	zfree(&encoder->source_filename);
+	zfree(&encoder->filename);
+	encoder->btf = NULL; // Non-owning pointer to base BTF
+
+	struct inline_instance *exp, *n;
+	list_for_each_entry_safe_reverse(exp, n, &encoder->inline_instances, node) {
+		list_del_init(&exp->node);
+		for (size_t i = 0; i < exp->param_count; ++i)
+			for (size_t j = 0; j < 16; ++j)
+				free(exp->parameters[i].location[j]);
+		free(exp);
+	}
+
+	free(encoder);
+}
+
+static size_t inline_instance__sizeof(uint16_t param_count)
+{
+	return offsetof(struct inline_instance, parameters) + param_count * sizeof(struct inline_parameter);
+}
+
+static struct loc *loc__make_const(bool is_signed, size_t size, uint64_t value)
+{
+	struct loc *new_loc = zalloc(sizeof(*new_loc) + size);
+	if (new_loc == NULL)
+		return NULL;
+	new_loc->size = sizeof(struct loc) + size;
+	switch (size) {
+		case 1:
+			new_loc->type = is_signed ? LOC_SIGNED_CONST_1 : LOC_UNSIGNED_CONST_1;
+			break;
+		case 2:
+			new_loc->type = is_signed ? LOC_SIGNED_CONST_2 : LOC_UNSIGNED_CONST_2;
+			break;
+		case 4:
+			new_loc->type = is_signed ? LOC_SIGNED_CONST_4 : LOC_UNSIGNED_CONST_4;
+			break;
+		case 8:
+			new_loc->type = is_signed ? LOC_SIGNED_CONST_8 : LOC_UNSIGNED_CONST_8;
+			break;
+		default:
+			free(new_loc);
+			return NULL;
+	}
+	uint8_t *data = (uint8_t *)new_loc + sizeof(struct loc);
+	memcpy(data, &value, size);
+	return new_loc;
+}
+
+static struct loc *loc__make_register(uint8_t reg, int64_t offset)
+{
+	struct loc *new_loc = zalloc(sizeof(*new_loc) + sizeof(reg) + sizeof(offset));
+	if (new_loc == NULL)
+		return NULL;
+	new_loc->size = sizeof(struct loc) + sizeof(uint8_t) + sizeof(int64_t);
+	new_loc->type = LOC_REGISTER;
+	uint8_t *data = (uint8_t *)new_loc + sizeof(struct loc);
+	*data = reg;
+	data += sizeof(uint8_t);
+	memcpy(data, &offset, sizeof(int64_t));
+	return new_loc;
+}
+
+static struct loc *loc__make_eoe(void)
+{
+	struct loc *new_loc = zalloc(sizeof(*new_loc));
+	if (new_loc == NULL)
+		return NULL;
+	new_loc->type = LOC_END_OF_EXPR;
+	new_loc->size = sizeof(*new_loc);
+	return new_loc;
+}
+
+static uint32_t expr__size(struct loc *expr[16])
+{
+	uint32_t size = 0;
+	for (size_t i = 0; i < 16; ++i) {
+		if (expr[i] == NULL)
+			break;
+		size += expr[i]->size;
+	}
+	return size;
+}
+
+static int inline_encoder__encode_location(struct inline_encoder *encoder, struct location *loc, struct loc *expr[16])
+{
+	if (loc->expr == NULL && loc->exprlen == 0)
+		return 0;
+
+	if (loc->expr == NULL) {
+		expr[0] = loc__make_const(false, 8, loc->exprlen);
+		return 0;
+	}
+
+	size_t expr_i = 0;
+	for (size_t i = 0; i < loc->exprlen; ++i) {
+		Dwarf_Op op = loc->expr[i];
+		switch (op.atom) {
+			case DW_OP_const1u:
+			case DW_OP_const1s:
+				expr[expr_i++] = loc__make_const(op.atom == DW_OP_const1s, 1, op.number);
+				break;
+			case DW_OP_const2u:
+			case DW_OP_const2s:
+				expr[expr_i++] = loc__make_const(op.atom == DW_OP_const1s, 2, op.number);
+				break;
+			case DW_OP_const4u:
+			case DW_OP_const4s:
+				expr[expr_i++] = loc__make_const(op.atom == DW_OP_const1s, 4, op.number);
+				break;
+			case DW_OP_const8u:
+			case DW_OP_const8s:
+				expr[expr_i++] = loc__make_const(op.atom == DW_OP_const1s, 8, op.number);
+				break;
+			case DW_OP_constu:
+			case DW_OP_consts:
+				expr[expr_i++] = loc__make_const(op.atom == DW_OP_consts, 8, op.number);
+				break;
+			case DW_OP_lit0 ... DW_OP_lit31:
+				expr[expr_i++] = loc__make_const(false, 1, op.atom - DW_OP_lit0);
+				break;
+			case DW_OP_reg0 ... DW_OP_reg31:
+				expr[expr_i++] = loc__make_register(op.atom - DW_OP_reg0, 0);
+				break;
+			case DW_OP_breg0 ... DW_OP_breg31: {
+				expr[expr_i++] = loc__make_register(op.atom - DW_OP_breg0, op.number);
+				break;
+			}
+			case DW_OP_stack_value: {
+				// no-op
+				break;
+			}
+			default:
+				goto out_err;
+		}
+		if (expr_i >= 16)
+			goto out_err;
+	}
+
+	expr[expr_i++] = loc__make_eoe();
+	return 0;
+
+out_err:
+	for (size_t i = 0; i < expr_i; ++i) {
+		free(expr[i]);
+		expr[i] = NULL;
+	}
+
+	return -1;
+}
+
+static inline struct dwarf_tag *tag__dwarf(const struct tag *tag)
+{
+	uint8_t *data = (uint8_t *)tag;
+	return (struct dwarf_tag *)data;
+}
+
+static inline uint64_t die_offset(const struct tag *tag)
+{
+	uint8_t *data = (uint8_t *)tag;
+	data -= 64;
+	data += 24;
+	return *(uint64_t *)data;
+}
+
+static int inline_encoder__save_inline_expansion(struct inline_encoder *encoder, struct inline_expansion *exp)
+{
+	if (exp->name == NULL)
+		return 0;
+
+	struct inline_instance *instance = zalloc(inline_instance__sizeof(exp->nr_parms));
+	if (instance == NULL)
+		return -ENOMEM;
+
+	instance->die_offset = die_offset((struct tag *)exp);
+	instance->name = exp->name;
+	instance->insn_offset = exp->ip.addr;
+	instance->type_id = -1;
+	instance->param_count = exp->nr_parms;
+	// printf("inline instance for %u (%s): %lx %lx\n", instance->type_id, exp->name, instance->die_offset, instance->insn_offset);
+
+	uint32_t param_index = 0;
+	struct parameter *param = NULL;
+	list_for_each_entry(param, &exp->parms, tag.node) {
+		inline_encoder__encode_location(encoder, &param->location, instance->parameters[param_index++].location);
+	}
+
+	INIT_LIST_HEAD(&instance->node);
+	list_add_tail(&instance->node, &encoder->inline_instances);
+	encoder->inline_instance_cnt++;
+
+	return 0;
+}
+
+static int inline_encoder__encode_lexblock(struct inline_encoder *encoder, struct lexblock *lexblock, struct conf_load *conf_load)
+{
+	int err = 0;
+
+	struct tag *tag = NULL;
+	list_for_each_entry(tag, &lexblock->tags, node) {
+		if (tag->tag == DW_TAG_lexical_block) {
+			err = inline_encoder__encode_lexblock(encoder, tag__lexblock(tag), conf_load);
+			if (err)
+				goto out;
+			continue;
+		} else if (tag->tag != DW_TAG_inlined_subroutine) {
+			continue;
+		}
+
+		struct inline_expansion *exp = tag__inline_expansion(tag);
+		err = inline_encoder__save_inline_expansion(encoder, exp);
+		if (err)
+			goto out;
+	}
+
+	return err;
+
+out:
+	return err;
+}
+
+int inline_encoder__encode_cu(struct inline_encoder *encoder, struct cu *cu, struct conf_load *conf_load)
+{
+	int err = 0;
+
+	uint32_t fn_id;
+	struct function *fn;
+	cu__for_each_function(cu, fn_id, fn) {
+		if (fn->declaration)
+			continue;
+		if (function__addr(fn) == 0)
+			continue;
+
+		err = inline_encoder__encode_lexblock(encoder, &fn->lexblock, conf_load);
+		if (err)
+			goto out;
+	}
+	return err;
+out:
+	return err;
+}
+
+struct node_type_id {
+	const char *name;
+	type_id_t type_id;
+};
+
+static int cmpstrp(const void *left, const void *right)
+{
+	struct node_type_id *left_node = (struct node_type_id*)left;
+	struct node_type_id *right_node = (struct node_type_id*)right;
+	return strcmp(left_node->name, right_node->name);
+}
+
+static struct node_type_id *inline_encoder__build_type_id_cache(struct inline_encoder *encoder)
+{
+	struct node_type_id *exps = (struct node_type_id*)malloc(encoder->inline_instance_cnt * sizeof(struct node_type_id));
+
+	size_t exp_id = 0;
+	struct inline_instance *exp;
+	list_for_each_entry(exp, &encoder->inline_instances, node) {
+		exps[exp_id++] = (struct node_type_id){exp->name, 0};
+	}
+
+	qsort(exps, encoder->inline_instance_cnt, sizeof(struct node_type_id), cmpstrp);
+
+	return exps;
+}
+
+static int inline_encoder__write_raw_file(struct inline_encoder *encoder, const char *filename)
+{
+	int err = -1;
+
+	int fd = creat(filename, S_IRUSR | S_IWUSR);
+	if (fd == -1) {
+		fprintf(stderr, "%s open(%s) failed!\n", __func__, filename);
+		goto out;
+	}
+
+	struct inline_encoder__header header = {
+		.magic = 0xeb9f,
+		.version = 1,
+		.flags = 0,
+		.header_size = sizeof(header),
+		.inline_info_offset = sizeof(struct inline_encoder__header),
+		.inline_info_size = 0,
+		.location_offset = 0,
+		.location_size = 2,
+	};
+	write(fd, &header, header.header_size);
+
+	struct node_type_id *type_id_cache = inline_encoder__build_type_id_cache(encoder);
+
+	struct inline_instance *exp;
+	list_for_each_entry(exp, &encoder->inline_instances, node) {
+		struct node_type_id lookup_key = { exp->name, 0 };
+		struct node_type_id *found = bsearch(&lookup_key, type_id_cache, encoder->inline_instance_cnt, sizeof(lookup_key), cmpstrp);
+		assert(found != NULL);
+		if (found->type_id == 0) {
+			int type_id = btf__find_by_name_kind(encoder->btf, exp->name, BTF_KIND_FUNC);
+			found->type_id = (type_id < 0) ? -1 : type_id;
+		}
+		exp->type_id = found->type_id;
+		if (exp->type_id == -1) continue;
+
+		const void *data = exp;
+		header.inline_info_size += write(
+			fd,
+			data + offsetof(struct inline_instance, insn_offset),
+			inline_instance__sizeof(0)
+				- offsetof(struct inline_instance, insn_offset)
+				- 2); // Skip padding at the end of the struct
+		for (uint16_t i = 0; i < exp->param_count; ++i) {
+			struct inline_parameter *param = &exp->parameters[i];
+			if (param->location[0] == NULL) {
+				uint32_t zero = 0;
+				header.inline_info_size += write(fd, &zero, sizeof(zero));
+			} else {
+				header.inline_info_size += write(fd, &header.location_size, sizeof(header.location_size));
+				header.location_size += expr__size(param->location);
+			}
+		}
+	}
+
+	free(type_id_cache);
+
+	struct loc end_of_expr = {
+		.type = LOC_END_OF_EXPR,
+		.size = sizeof(end_of_expr),
+	};
+	write(fd, &end_of_expr, sizeof(end_of_expr));
+	list_for_each_entry(exp, &encoder->inline_instances, node) {
+		if (exp->type_id == -1)
+			continue;
+		for (uint16_t i = 0; i < exp->param_count; ++i) {
+			struct inline_parameter *param = &exp->parameters[i];
+			for (size_t j = 0; j < 16; ++j) {
+				struct loc *op = param->location[j];
+				if (op == NULL)
+					break;
+				write(fd, op, op->size);
+			}
+		}
+	}
+	header.location_offset = header.inline_info_offset + header.inline_info_size;
+	lseek(fd, 0, SEEK_SET);
+	write(fd, &header, header.header_size);
+
+	close(fd);
+	return 0;
+
+out:
+	if (fd != - 1)
+		close(fd);
+	unlink(filename);
+	return err;
+}
+
+int inline_encoder__encode(struct inline_encoder *encoder, struct conf_load *conf_load)
+{
+	char tmp_fn[PATH_MAX];
+	snprintf(tmp_fn, sizeof(tmp_fn), "%s.btf_inline", encoder->filename);
+
+	int err = inline_encoder__write_raw_file(encoder, tmp_fn);
+	if (err) return err;
+
+	const char *llvm_objcopy = getenv("LLVM_OBJCOPY");
+	if (!llvm_objcopy)
+		llvm_objcopy = "llvm-objcopy";
+
+	char cmd[PATH_MAX * 2];
+	snprintf(cmd, sizeof(cmd), "%s --add-section .BTF_inline=%s %s",
+		 llvm_objcopy, tmp_fn, encoder->filename);
+	if (system(cmd)) {
+		fprintf(stderr, "%s: failed to add .BTF_inline section '%s': %d!\n",
+				__func__, tmp_fn, errno);
+		err = -1;
+		goto unlink;
+	}
+
+	err = 0;
+unlink:
+	unlink(tmp_fn);
+	return err;
+}
+
+void inline_encoder__set_btf(struct inline_encoder *encoder, struct btf *btf)
+{
+	encoder->btf = btf;
+}
diff --git a/inline_encoder.h b/inline_encoder.h
new file mode 100644
index 0000000000000000000000000000000000000000..c3472d011a56e84efc77d85fcee1a6aa88204c86
--- /dev/null
+++ b/inline_encoder.h
@@ -0,0 +1,25 @@
+#ifndef _INLINE_ENCODER_H_
+#define _INLINE_ENCODER_H_ 1
+/*
+  SPDX-License-Identifier: GPL-2.0-only
+
+  Copyright (C) 2025 Facebook
+
+  Derived from btf_encoder.h, which is:
+  Copyright (C) Arnaldo Carvalho de Melo <acme@redhat.com>
+ */
+#include <stdbool.h>
+
+struct inline_encoder;
+struct conf_load;
+struct btf;
+struct cu;
+
+struct inline_encoder *inline_encoder__new(struct cu *cu, const char *detached_filename, struct btf *base_btf, bool verbose, struct conf_load *conf_load);
+void inline_encoder__delete(struct inline_encoder *encoder);
+int inline_encoder__encode_cu(struct inline_encoder *encoder, struct cu *cu, struct conf_load *conf_load);
+int inline_encoder__encode(struct inline_encoder *encoder, struct conf_load *conf_load);
+
+void inline_encoder__set_btf(struct inline_encoder *encoder, struct btf *btf);
+
+#endif /* _INLINE_ENCODER_H_ */
diff --git a/pahole.c b/pahole.c
index af3e1cfe224dd14a5be2c072461505cf3cc387c5..093d76478dbd5a687c44b0f19fce73cfe6aae161 100644
--- a/pahole.c
+++ b/pahole.c
@@ -28,12 +28,15 @@
 #include "dutil.h"
 //#include "ctf_encoder.h" FIXME: disabled, probably its better to move to Oracle's libctf
 #include "btf_encoder.h"
+#include "inline_encoder.h"
 
 static struct btf_encoder *btf_encoder;
+static struct inline_encoder *inline_encoder;
 static char *detached_btf_filename;
 struct cus *cus;
 static bool btf_encode;
 static bool ctf_encode;
+static bool inline_encode;
 static bool sort_output;
 static bool need_resort;
 static bool first_obj_only;
@@ -1638,6 +1641,11 @@ static const struct argp_option pahole__options[] = {
 		.key  = 'J',
 		.doc  = "Encode as BTF",
 	},
+	{
+		.name = "inline_encode",
+		.key  = 'L',
+		.doc  = "Encode inline expansions into .BTF.inline section",
+	},
 	{
 		.name = "btf_encode_detached",
 		.key  = ARGP_btf_encode_detached,
@@ -1832,6 +1840,7 @@ static error_t pahole__options_parser(int key, char *arg,
 							break;
 	case ARGP_btf_encode_detached:
 		  detached_btf_filename = arg; // fallthru
+	case 'L': inline_encode = inline_encode || key == 'L'; // fallthru
 	case 'J': btf_encode = 1;
 		  conf_load.get_addr_info = true;
 		  conf_load.ignore_alignment_attr = true;
@@ -3140,6 +3149,22 @@ static enum load_steal_kind pahole_stealer__btf_encode(struct cu *cu, struct con
 		return LSK__STOP_LOADING;
 	}
 
+	if (inline_encode) {
+		if (!inline_encoder)
+			inline_encoder = inline_encoder__new(cu, detached_btf_filename, conf_load->base_btf, global_verbose, conf_load);
+
+		if (!inline_encoder) {
+			fprintf(stderr, "Error creating inline encoder.\n");
+			return LSK__STOP_LOADING;
+		}
+
+		err = inline_encoder__encode_cu(inline_encoder, cu, conf_load);
+		if (err < 0) {
+			fprintf(stderr, "Error while encoding BTF.inline.\n");
+			return LSK__STOP_LOADING;
+		}
+	}
+
 	return LSK__DELETE;
 }
 
@@ -3672,10 +3697,19 @@ try_sole_arg_as_class_names:
 
 	if (btf_encode && btf_encoder) { // maybe all CUs were filtered out and thus we don't have an encoder?
 		err = btf_encoder__encode(btf_encoder, &conf_load);
-		btf_encoder__delete(btf_encoder);
 		if (err) {
 			fputs("Failed to encode BTF\n", stderr);
-			goto out_cus_delete;
+			goto out_btf_encoder_delete;
+		}
+
+		if (inline_encode && inline_encoder) {
+			inline_encoder__set_btf(inline_encoder, btf_encoder__btf(btf_encoder));
+			err = inline_encoder__encode(inline_encoder, &conf_load);
+			inline_encoder__delete(inline_encoder);
+			if (err) {
+				fputs("Failed to encode BTF.inline\n", stderr);
+				goto out_btf_encoder_delete;
+			}
 		}
 	}
 out_ok:
@@ -3683,6 +3717,8 @@ out_ok:
 		print_stats();
 
 	rc = EXIT_SUCCESS;
+out_btf_encoder_delete:
+	btf_encoder__delete(btf_encoder);
 out_cus_delete:
 #ifdef DEBUG_CHECK_LEAKS
 	cus__delete(cus);

-- 
2.47.1



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
  2025-04-16 19:20 [PATCH RFC 0/3] list inline expansions in .BTF.inline Thierry Treyer via B4 Relay
                   ` (2 preceding siblings ...)
  2025-04-16 19:20 ` [PATCH RFC 3/3] inline_encoder: Introduce inline encoder to emit BTF.inline Thierry Treyer via B4 Relay
@ 2025-04-25 18:40 ` Daniel Xu
  2025-04-28 20:51   ` Alexei Starovoitov
  2025-04-30 15:25 ` Alan Maguire
  2025-05-19 12:02 ` Alan Maguire
  5 siblings, 1 reply; 21+ messages in thread
From: Daniel Xu @ 2025-04-25 18:40 UTC (permalink / raw)
  To: ttreyer
  Cc: dwarves, bpf, acme, ast, yhs, andrii, ihor.solodrai,
	songliubraving, alan.maguire, mykolal

Hi Thierry,

Great progress!

Some high level notes below.

On Wed, Apr 16, 2025 at 07:20:34PM +0000, Thierry Treyer via B4 Relay wrote:
> This proposal extends BTF to list the locations of inlined functions and
> their arguments in a new '.BTF.inline` section.
> 
> == Background ==
> 
> Inline functions are often a blind spot for profiling and tracing tools:
> * They cannot probe fully inlined functions.
>   The BTF contains no data about them.
> * They miss calls to partially inlined functions,
>   where a function has a symbol, but is also inlined in some callers.
> * They cannot account for time spent in inlined calls.
>   Instead, they report the time to the caller.
> * They don't provide a way to access the arguments of an inlined call.
> 
> The issue is exacerbated by Link-Time Optimization, which enables more
> inlining across Object files. One workaround is to disable inlining for
> the profiled functions, but that requires a whole kernel compilation and
> doesn't allow for iterative exploration.
> 
> The information required to solve the above problems is not easily
> accessible. It requires parsing most of the DWARF's '.debug_info` section,
> which is time consuming and not trivial.
> Instead, this proposal leverages and extends the existing information
> contained in '.BTF` (for typing) and '.BTF.ext` (for caller location),
> with information from a new section called '.BTF.inline`,
> listing inlined instances.
> 
> == .BTF.inline Section ==
> 
> The new '.BTF.inline` section has a layout similar to '.BTF`.
> 
>  off |0-bit      |16-bits  |24-bits  |32-bits                           |
> -----+-----------+---------+---------+----------------------------------+
> 0x00 |   magic   | version |  flags  |          header length           |
> 0x08 |      inline info offset       |        inline info length        |
> 0x10 |        location offset        |          location length         |
> -----+------------------------------------------------------------------+
>      ~                        inline info section                       ~
> -----+------------------------------------------------------------------+
>      ~                          location section                        ~
> -----+------------------------------------------------------------------+
> 
> It starts with a header (see 'struct btf_inline_header`),
> followed by two subsections:
> 1. The 'Inline Info' section contains an entry for each inlined function.
>    Each entry describes the instance's location in its caller and is
>    followed by the offsets in the 'Location' section of the parameters
>    location expressions. See 'struct btf_inline_instance`.
> 2. The 'Location' section contains location expressions describing how
>    to retrieve the value of a parameter. The expressions are NULL-
>    terminated and are adressed similarly to '.BTF`'s string table.
> 
> struct btf_inline_header {
>   uint16_t magic;
>   uint8_t version, flags;
>   uint32_t header_length;
>   uint32_t inline_info_offset, inline_info_length;
>   uint32_t location_offset, location_length;
> };
> 
> struct btf_inline_instance {
>   type_id_t callee_id;     // BTF id of the inlined function
>   type_id_t caller_id;     // BTF id of the caller
>   uint32_t caller_offset;  // offset of the callee within the caller
>   uint16_t nr_parms;       // number of parameters
> //uint32_t parm_location[nr_parms];  // offset of the location expression
> };                                   // in 'Location' for each parameter
> 
> == Location Expressions ==
> 
> We looked at the DWARF location expressions for the arguments of inlined
> instances having <= 100 instances, on a production kernel v6.9.0. This
> yielded 176,800 instances with 269,327 arguments. We learned that most
> expressions are simple register access, perhaps with an offset. We would
> get access to 87% of the arguments by implementing literal and register.
> 
> Op. Category      Expr. Count    Expr. %
> ----------------------------------------
> literal                 10714      3.98%
> register+above         234698     87.14%
> arithmetic+above       239444     88.90%
> composite+above        240394     89.26%
> stack+above            242075     89.88%
> empty                   27252     10.12%
> 
> We propose to re-encode DWARF location expressions into a custom BTF
> location expression format. It operates on a stack of values, similar to
> DWARF's location expressions, but is stripped of unused operators,
> while allowing future expansions.

A stack machine seems overkill. I'm certainly not an expert on DWARF
location expressions, but I think we need to get away from arbitrarily
complex expressions, even if they are simpler than DWARF ones. I don't
think we want consumers implementing any kind of interpreter or VM.

I'd vote for something extremely prescriptive, even if it means adding a
lot of enum variants. At least this way, consumers can be sure they've
fully implemented the spec and detect when more complex support is
added.

> 
> A location expression is composed of a series of operations, terminated
> by a NULL-byte/LOC_END_OF_EXPR operator. The very first expression in the
> 'Location' subsection must be an empty expression constisting only of
> LOC_END_OF_EXPR.
> 
> An operator is a tagged union: the tag describes the operation to carry
> out and the union contains the operands.
>  
>  ID | Operator Name        | Operands[...]
> ----+----------------------+-------------------------------------------
>   0 | LOC_END_OF_EXPR      | _none_
>   1 | LOC_SIGNED_CONST_1   |  s8: constant's value
>   2 | LOC_SIGNED_CONST_2   | s16: constant's value
>   3 | LOC_SIGNED_CONST_4   | s32: constant's value
>   4 | LOC_SIGNED_CONST_8   | s64: constant's value
>   5 | LOC_UNSIGNED_CONST_1 |  u8: constant's value
>   6 | LOC_UNSIGNED_CONST_2 | u16: constant's value
>   7 | LOC_UNSIGNED_CONST_4 | u32: constant's value
>   8 | LOC_UNSIGNED_CONST_8 | u64: constant's value
>   9 | LOC_REGISTER         |  u8: DWARF register number from the ABI
>  10 | LOC_REGISTER_OFFSET  |  u8: DWARF register number from the ABI
>                            | s64: offset added to the register's value
>  11 | LOC_DEREF            |  u8: size of the deref'd type
> 
> This list should be further expanded to include arithmetic operations.
> 
> Example: accessing a field at offset 12B from a struct whose adresse is
>          in the '%rdi` register, on amd64, has the following encoding:
> 
> [0x0a 0x05 0x000000000000000c] [0x0b 0x04] [0x00]
>  |    |    ` Offset Added       |    |      ` LOC_END_OF_EXPR
>  |    ` Register Number         |    ` Size of Deref.
>  ` LOC_REGISTER_OFFSET          ` LOC_DEREF
> 
> == Summary ==
> 
> Combining the new information from '.BTF.inline` with the existing data
> from '.BTF` and '.BTF.ext`, tools will be able to locate inline functions
> and their arguments. Symbolizer can also use the data to display the
> functions inlined at a given address.
> 
> Fully inlined functions are not part of the BTF and thus are not covered
> by this proposal. Adding them to the BTF would enable their coverage and
> should be considered.

I think supporting fully inlined functions as part of this work would
add a lot of value to users. It doesn't necessarily have to happen in
the first patchset, but we probably want to plan on doing it.

Regarding BTF, another option is to just leave `callee_id` unset, right?
Consumers should be able to recognize BTF for the inlined function isn't
available and then act accordingly. For bpftrace, that probably means
not allowing function argument access.

[...]

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
  2025-04-25 18:40 ` [PATCH RFC 0/3] list inline expansions in .BTF.inline Daniel Xu
@ 2025-04-28 20:51   ` Alexei Starovoitov
  2025-04-29 19:14     ` Thierry Treyer
  0 siblings, 1 reply; 21+ messages in thread
From: Alexei Starovoitov @ 2025-04-28 20:51 UTC (permalink / raw)
  To: Daniel Xu
  Cc: ttreyer, dwarves, bpf, Arnaldo Carvalho de Melo,
	Alexei Starovoitov, Yonghong Song, Andrii Nakryiko, Ihor Solodrai,
	Song Liu, Alan Maguire, Mykola Lysenko

On Fri, Apr 25, 2025 at 11:41 AM Daniel Xu <dxu@dxuuu.xyz> wrote:
>
> > We propose to re-encode DWARF location expressions into a custom BTF
> > location expression format. It operates on a stack of values, similar to
> > DWARF's location expressions, but is stripped of unused operators,
> > while allowing future expansions.
>
> A stack machine seems overkill. I'm certainly not an expert on DWARF
> location expressions, but I think we need to get away from arbitrarily
> complex expressions, even if they are simpler than DWARF ones. I don't
> think we want consumers implementing any kind of interpreter or VM.

+1
This was already brought up at lsfmm.
stack machine in BTF is not an option.
I see the appeal of simple inline_encoder__encode_location() logic
that takes dwarf VM and re-encodes things pretty much as-is.
This is a wrong trade-off.
pahole side needs to do full expression analysis and emit
easy to parse location expression.

> >  ID | Operator Name        | Operands[...]
> > ----+----------------------+-------------------------------------------
> >   0 | LOC_END_OF_EXPR      | _none_
> >   1 | LOC_SIGNED_CONST_1   |  s8: constant's value
> >   2 | LOC_SIGNED_CONST_2   | s16: constant's value
> >   3 | LOC_SIGNED_CONST_4   | s32: constant's value
> >   4 | LOC_SIGNED_CONST_8   | s64: constant's value
> >   5 | LOC_UNSIGNED_CONST_1 |  u8: constant's value
> >   6 | LOC_UNSIGNED_CONST_2 | u16: constant's value
> >   7 | LOC_UNSIGNED_CONST_4 | u32: constant's value
> >   8 | LOC_UNSIGNED_CONST_8 | u64: constant's value
> >   9 | LOC_REGISTER         |  u8: DWARF register number from the ABI
> >  10 | LOC_REGISTER_OFFSET  |  u8: DWARF register number from the ABI
> >                            | s64: offset added to the register's value
> >  11 | LOC_DEREF            |  u8: size of the deref'd type

LOC_END_OF_EXPR shouldn't be there. This is just an artifact of stack VM.
I don't see patch 3 using LOC_DEREF.
register vs register_offset is another artifact of dwarf.
It looks to me we only need:
- register
- load_from_reg_plus_offset
- constant

>
> I think supporting fully inlined functions as part of this work would
> add a lot of value to users. It doesn't necessarily have to happen in
> the first patchset, but we probably want to plan on doing it.
>
> Regarding BTF, another option is to just leave `callee_id` unset, right?
> Consumers should be able to recognize BTF for the inlined function isn't
> available and then act accordingly. For bpftrace, that probably means
> not allowing function argument access.

For fully inlined we still need callee_id otherwise bpftrace won't know
argument types. It shouldn't be hard to emit btf even for fully
inlined functions, but we need to be smart not to bloat BTF to dwarf sizes.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
  2025-04-28 20:51   ` Alexei Starovoitov
@ 2025-04-29 19:14     ` Thierry Treyer
  2025-04-29 23:58       ` Alexei Starovoitov
  0 siblings, 1 reply; 21+ messages in thread
From: Thierry Treyer @ 2025-04-29 19:14 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Daniel Xu, dwarves@vger.kernel.org, bpf, Arnaldo Carvalho de Melo,
	Alexei Starovoitov, Yonghong Song, Andrii Nakryiko, Ihor Solodrai,
	Song Liu, Alan Maguire, Mykola Lysenko

> It looks to me we only need:
> - register
> - load_from_reg_plus_offset
> - constant

In that case, we can shorten the list of operators:

### | Operator Name          | Operands[...]
----+------------------------+-------------------------------------------
  1 | LOC_SIGNED_CONST_1     |  s8: constant's value
  2 | LOC_SIGNED_CONST_2     | s16: constant's value
  3 | LOC_SIGNED_CONST_4     | s32: constant's value
  4 | LOC_SIGNED_CONST_8     | s64: constant's value
  5 | LOC_UNSIGNED_CONST_1   |  u8: constant's value
  6 | LOC_UNSIGNED_CONST_2   | u16: constant's value
  7 | LOC_UNSIGNED_CONST_4   | u32: constant's value
  8 | LOC_UNSIGNED_CONST_8   | u64: constant's value
  9 | LOC_REGISTER           |  u8: DWARF register number from the ABI
 10 | LOC_REGISTER_OFFSET    |  u8: DWARF register number from the ABI
                             | s64: offset added to the register's value

> register vs register_offset is another artifact of dwarf.
> ...
> - load_from_reg_plus_offset

What is the difference between LOC_REGISTER_OFFSET
and load_from_reg_plus_offset?

> For fully inlined we still need callee_id otherwise bpftrace won't know
> argument types. It shouldn't be hard to emit btf even for fully
> inlined functions, but we need to be smart not to bloat BTF to dwarf sizes.

I'll come back with an estimate for the size increase of the BTF.

Have a great day,
Thierry

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
  2025-04-29 19:14     ` Thierry Treyer
@ 2025-04-29 23:58       ` Alexei Starovoitov
  0 siblings, 0 replies; 21+ messages in thread
From: Alexei Starovoitov @ 2025-04-29 23:58 UTC (permalink / raw)
  To: Thierry Treyer
  Cc: Daniel Xu, dwarves@vger.kernel.org, bpf, Arnaldo Carvalho de Melo,
	Alexei Starovoitov, Yonghong Song, Andrii Nakryiko, Ihor Solodrai,
	Song Liu, Alan Maguire, Mykola Lysenko

On Tue, Apr 29, 2025 at 12:14 PM Thierry Treyer <ttreyer@meta.com> wrote:
>
> > It looks to me we only need:
> > - register
> > - load_from_reg_plus_offset
> > - constant
>
> In that case, we can shorten the list of operators:
>
> ### | Operator Name          | Operands[...]
> ----+------------------------+-------------------------------------------
>   1 | LOC_SIGNED_CONST_1     |  s8: constant's value
>   2 | LOC_SIGNED_CONST_2     | s16: constant's value
>   3 | LOC_SIGNED_CONST_4     | s32: constant's value
>   4 | LOC_SIGNED_CONST_8     | s64: constant's value
>   5 | LOC_UNSIGNED_CONST_1   |  u8: constant's value
>   6 | LOC_UNSIGNED_CONST_2   | u16: constant's value
>   7 | LOC_UNSIGNED_CONST_4   | u32: constant's value
>   8 | LOC_UNSIGNED_CONST_8   | u64: constant's value
>   9 | LOC_REGISTER           |  u8: DWARF register number from the ABI
>  10 | LOC_REGISTER_OFFSET    |  u8: DWARF register number from the ABI
>                              | s64: offset added to the register's value

Since we want to make the format compat let's use s32 for offset.
We can even consider s16.
Since that's typically stack and kernel stack is rarely above 2k.

>
> > register vs register_offset is another artifact of dwarf.
> > ...
> > - load_from_reg_plus_offset
>
> What is the difference between LOC_REGISTER_OFFSET
> and load_from_reg_plus_offset?

Just a different name, because LOC_REGISTER_OFFSET is confusing.
LOC_REGISTER means that the value of the argument is in that register.
LOC_REGISTER_OFFSET can be interpreted that the argument value
is in the register plus constant.
But that's not the case. reg+const is an address in memory when
value is stored.
So we need a different name.
Probably load_from_reg_plus_offset is confusing too.
How about
LOC_ADDR_REGISTER_OFFSET

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
  2025-04-16 19:20 [PATCH RFC 0/3] list inline expansions in .BTF.inline Thierry Treyer via B4 Relay
                   ` (3 preceding siblings ...)
  2025-04-25 18:40 ` [PATCH RFC 0/3] list inline expansions in .BTF.inline Daniel Xu
@ 2025-04-30 15:25 ` Alan Maguire
  2025-05-01 19:38   ` Thierry Treyer
  2025-05-19 12:02 ` Alan Maguire
  5 siblings, 1 reply; 21+ messages in thread
From: Alan Maguire @ 2025-04-30 15:25 UTC (permalink / raw)
  To: ttreyer, dwarves, bpf
  Cc: acme, ast, yhs, andrii, ihor.solodrai, songliubraving, mykolal

On 16/04/2025 20:20, Thierry Treyer via B4 Relay wrote:
> This proposal extends BTF to list the locations of inlined functions and
> their arguments in a new '.BTF.inline` section.
> 
> == Background ==
> 
> Inline functions are often a blind spot for profiling and tracing tools:
> * They cannot probe fully inlined functions.
>   The BTF contains no data about them.
> * They miss calls to partially inlined functions,
>   where a function has a symbol, but is also inlined in some callers.
> * They cannot account for time spent in inlined calls.
>   Instead, they report the time to the caller.
> * They don't provide a way to access the arguments of an inlined call.
> 
> The issue is exacerbated by Link-Time Optimization, which enables more
> inlining across Object files. One workaround is to disable inlining for
> the profiled functions, but that requires a whole kernel compilation and
> doesn't allow for iterative exploration.
> 
> The information required to solve the above problems is not easily
> accessible. It requires parsing most of the DWARF's '.debug_info` section,
> which is time consuming and not trivial.
> Instead, this proposal leverages and extends the existing information
> contained in '.BTF` (for typing) and '.BTF.ext` (for caller location),
> with information from a new section called '.BTF.inline`,
> listing inlined instances.
> 
> == .BTF.inline Section ==
> 
> The new '.BTF.inline` section has a layout similar to '.BTF`.
> 
>  off |0-bit      |16-bits  |24-bits  |32-bits                           |
> -----+-----------+---------+---------+----------------------------------+
> 0x00 |   magic   | version |  flags  |          header length           |
> 0x08 |      inline info offset       |        inline info length        |
> 0x10 |        location offset        |          location length         |
> -----+------------------------------------------------------------------+
>      ~                        inline info section                       ~
> -----+------------------------------------------------------------------+
>      ~                          location section                        ~
> -----+------------------------------------------------------------------+
> 
> It starts with a header (see 'struct btf_inline_header`),
> followed by two subsections:
> 1. The 'Inline Info' section contains an entry for each inlined function.
>    Each entry describes the instance's location in its caller and is
>    followed by the offsets in the 'Location' section of the parameters
>    location expressions. See 'struct btf_inline_instance`.
> 2. The 'Location' section contains location expressions describing how
>    to retrieve the value of a parameter. The expressions are NULL-
>    terminated and are adressed similarly to '.BTF`'s string table.
> 
> struct btf_inline_header {
>   uint16_t magic;
>   uint8_t version, flags;
>   uint32_t header_length;
>   uint32_t inline_info_offset, inline_info_length;
>   uint32_t location_offset, location_length;
> };
> 
> struct btf_inline_instance {
>   type_id_t callee_id;     // BTF id of the inlined function
>   type_id_t caller_id;     // BTF id of the caller
>   uint32_t caller_offset;  // offset of the callee within the caller
>   uint16_t nr_parms;       // number of parameters
> //uint32_t parm_location[nr_parms];  // offset of the location expression
> };                                   // in 'Location' for each parameter
> 
> == Location Expressions ==
> 
> We looked at the DWARF location expressions for the arguments of inlined
> instances having <= 100 instances, on a production kernel v6.9.0. This
> yielded 176,800 instances with 269,327 arguments. We learned that most
> expressions are simple register access, perhaps with an offset. We would
> get access to 87% of the arguments by implementing literal and register.
> 
> Op. Category      Expr. Count    Expr. %
> ----------------------------------------
> literal                 10714      3.98%
> register+above         234698     87.14%
> arithmetic+above       239444     88.90%
> composite+above        240394     89.26%
> stack+above            242075     89.88%
> empty                   27252     10.12%
> 
> We propose to re-encode DWARF location expressions into a custom BTF
> location expression format. It operates on a stack of values, similar to
> DWARF's location expressions, but is stripped of unused operators,
> while allowing future expansions.
> 
> A location expression is composed of a series of operations, terminated
> by a NULL-byte/LOC_END_OF_EXPR operator. The very first expression in the
> 'Location' subsection must be an empty expression constisting only of
> LOC_END_OF_EXPR.
> 
> An operator is a tagged union: the tag describes the operation to carry
> out and the union contains the operands.
>  
>  ID | Operator Name        | Operands[...]
> ----+----------------------+-------------------------------------------
>   0 | LOC_END_OF_EXPR      | _none_
>   1 | LOC_SIGNED_CONST_1   |  s8: constant's value
>   2 | LOC_SIGNED_CONST_2   | s16: constant's value
>   3 | LOC_SIGNED_CONST_4   | s32: constant's value
>   4 | LOC_SIGNED_CONST_8   | s64: constant's value
>   5 | LOC_UNSIGNED_CONST_1 |  u8: constant's value
>   6 | LOC_UNSIGNED_CONST_2 | u16: constant's value
>   7 | LOC_UNSIGNED_CONST_4 | u32: constant's value
>   8 | LOC_UNSIGNED_CONST_8 | u64: constant's value
>   9 | LOC_REGISTER         |  u8: DWARF register number from the ABI
>  10 | LOC_REGISTER_OFFSET  |  u8: DWARF register number from the ABI
>                            | s64: offset added to the register's value
>  11 | LOC_DEREF            |  u8: size of the deref'd type
> 
> This list should be further expanded to include arithmetic operations.
> 
> Example: accessing a field at offset 12B from a struct whose adresse is
>          in the '%rdi` register, on amd64, has the following encoding:
> 
> [0x0a 0x05 0x000000000000000c] [0x0b 0x04] [0x00]
>  |    |    ` Offset Added       |    |      ` LOC_END_OF_EXPR
>  |    ` Register Number         |    ` Size of Deref.
>  ` LOC_REGISTER_OFFSET          ` LOC_DEREF
> 
> == Summary ==
> 
> Combining the new information from '.BTF.inline` with the existing data
> from '.BTF` and '.BTF.ext`, tools will be able to locate inline functions
> and their arguments. Symbolizer can also use the data to display the
> functions inlined at a given address.
> 
> Fully inlined functions are not part of the BTF and thus are not covered
> by this proposal. Adding them to the BTF would enable their coverage and
> should be considered.
> 
> Signed-off-by: Thierry Treyer <ttreyer@meta.com>

This is fantastic work; a huge step forward having a practical
implementation of inline handling! So while I have some suggestions
about how we might be able to somewhat rework things to tackle some
associated problems and integrate more completely with existing BTF
representations I want to make sure they do not detract from the huge
progress you've made here.

There are some existing problems we have in tracing functions that could
benefit from an approach like this.  In the goal to maximize the
traceable surface of the system by representing functions in BTF, we
currently have to make some compromises. These are:

1. multiple functions with the same name and different function
signatures. Since we have no mechanism currently to associate function
and site we simply refuse to encode them in BTF today
2. functions with inconsistent representations. If a function does not
use the expected registers for its function signature due to
optimizations we leave it out of BTF representation; and of course
3. inline functions are not currently represented at all.

I think we can do a better job with 1 and 2 while solving 3 as well.
Here's my suggestion.

First, we separate functions which have complicated relationships with
their parameters (cases 1, 2 and 3) into a separate .BTF.func_aux
section or similar. That can be delivered in vmlinux or via a
special-purpose module; for modules it would be just a separate ELF
section as it would likely be small. We can control access to ensure
unprivileged users cannot get address information, hence the separation
from vmlinux BTF. But it is just (split) BTF, so no new format required.

The advantage of this is that tracers today can do the straightforward
tracing of functions from /sys/kernel/btf/vmlinux, and if a function is
not there and the tracer supports handling more complex cases, it can
know to look in /sys/kernel/btf/vmlinux.func_aux.

In that section we have a split BTF representation in which function
signatures for cases 1, 2, and 3 are represented in the usual way (FUNC
pointing at FUNC_PROTO). However since we know that the relationship
between function and its site(s) is complex, we need some extra info.
I'd propose we add a DATASEC containing functions and their addresses. A
FUNC datasec it could be laid out as follows

struct btf_func_secinfo {
	__u32 type;
	__u32 offset;
	__u32 loc;
};

In the case of 1 (multiple signatures for a function name) the DATASEC
will have entries for each site which tie it to its associated FUNC.
This allows us to clarify which function is associated with which
address. So the type is the BTF_KIND_FUNC, the offset the address and
the loc is 0 since we don't need it for this case since the functions
have parameters in expected locations.

In the case of 2 (functions with inconsistent representations) we use
the type to point at the FUNC, the offset the address of the function
and the loc to represent location info for that site. By leaving out
caller/callee info from location data we could potentially exploit the
fact that a lot of locations have similar layouts in terms of where
parameters are available, making dedup of location info possible.
Caller/callee relationship can still be inferred via the site address.

Finally in case 3 we have inlines which would be represented similarly
to case 2; i.e. we marry a representation of the function (the type) to
the associated inline site via location data in the loc field.

Does this approach sound workable? I think from your side it gives you
what you need:

- signals to tracers that functions have a standard form (FUNC is
present in vmlinux BTF)
- signals that functions have a more complex, site-specific form (FUNC
is present in vmlinux.func_aux)
- privileged-only access to address info by controlling access to the
/sys/kernel/btf representation of the func aux info
- a way to trace an inlined function via secinfo mapping function
signature to location at a specific address

If so, the question becomes what are we missing today? As far as I can
see we need

- support for new kinds BTF_KIND_FUNC_DATASEC, or simply use the kind
flag for existing BTF datasec to indicate function info
- support for new location kind
- pahole support to generate address-based datasec and location separately
- for modules, we would eventually need multi-split BTF that would allow
the func aux section to be split BTF on top of existing module BTF, i.e.
a 3-level split BTF

As I think some of the challenges you ran into implementing this
indicate, the current approach of matching ELF and DWARF info via name
only is creaking at the seams, and needs to be reworked (in fact it is
the source of a bug Alexei ran into around missing kfuncs). So I'm
hoping to get a patch out this week that uses address info to aid the
matching between ELF/DWARF, and from there it's a short jump to using it
in DATASEC representations.

Anyway let me know what you think. If it sounds workable we could
perhaps try prototyping the pieces and see if we can get them working
with location info.

Thanks!

Alan

> ---
> Thierry Treyer (3):
>       dwarf_loader: Add parameters list to inlined expansion
>       dwarf_loader: Add name to inline expansion
>       inline_encoder: Introduce inline encoder to emit BTF.inline
> 
>  CMakeLists.txt   |   3 +-
>  btf_encoder.c    |   5 +
>  btf_encoder.h    |   2 +
>  btf_inline.pk    |  55 ++++++
>  dwarf_loader.c   | 176 ++++++++++++--------
>  dwarves.c        |  26 +++
>  dwarves.h        |   7 +
>  inline_encoder.c | 496 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  inline_encoder.h |  25 +++
>  pahole.c         |  40 ++++-
>  10 files changed, 765 insertions(+), 70 deletions(-)
> ---
> base-commit: 4ef47f84324e925051a55de10f9a4f44ef1da844
> change-id: 20250416-btf_inline-e5047eea9b6f
> 
> Best regards,

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
  2025-04-30 15:25 ` Alan Maguire
@ 2025-05-01 19:38   ` Thierry Treyer
  2025-05-02  8:31     ` Alan Maguire
  0 siblings, 1 reply; 21+ messages in thread
From: Thierry Treyer @ 2025-05-01 19:38 UTC (permalink / raw)
  To: Alan Maguire
  Cc: dwarves@vger.kernel.org, bpf@vger.kernel.org, acme@kernel.org,
	ast@kernel.org, Yonghong Song, andrii@kernel.org,
	ihor.solodrai@linux.dev, Song Liu, Mykola Lysenko

> 1. multiple functions with the same name and different function
> signatures. Since we have no mechanism currently to associate function
> and site we simply refuse to encode them in BTF today
> 2. functions with inconsistent representations. If a function does not
> use the expected registers for its function signature due to
> optimizations we leave it out of BTF representation; and of course
> 3. inline functions are not currently represented at all.
> 
> I think we can do a better job with 1 and 2 while solving 3 as well.
> Here's my suggestion.

I see how your approach covers all these problems!
I would also add the following issue, which is a variant of 2 and 3:

4. Partially inlined functions: functions having a symbol, but is also
        inlined at some call sites. Currently not represented either.

> First, we separate functions which have complicated relationships with
> their parameters (cases 1, 2 and 3) into a separate .BTF.func_aux
> section or similar. That can be delivered in vmlinux or via a
> special-purpose module; for modules it would be just a separate ELF
> section as it would likely be small. We can control access to ensure
> unprivileged users cannot get address information, hence the separation
> from vmlinux BTF. But it is just (split) BTF, so no new format required.
> 
> The advantage of this is that tracers today can do the straightforward
> tracing of functions from /sys/kernel/btf/vmlinux, and if a function is
> not there and the tracer supports handling more complex cases, it can
> know to look in /sys/kernel/btf/vmlinux.func_aux.

Sounds good to me!
Laying out the format of this new .BTF.func_aux section:

+---------------------+
| BTF.func_aux header |
+---------------------+
~  type info section  ~
+---------------------+
~   string section    ~
+---------------------+
~  location section   ~
+---------------------+

> In that section we have a split BTF representation in which function
> signatures for cases 1, 2, and 3 are represented in the usual way (FUNC
> pointing at FUNC_PROTO). However since we know that the relationship
> between function and its site(s) is complex, we need some extra info.

We have the same base as the BTF section, so we can encode FUNC and
FUNC_PROTO in the 'type info section'. The strings for the new functions'
names get deduplicated and stored in the 'string section'.

The 'location section' lists location expressions to locate the arguments.
As discussed with Alexei, _one_ LOC_* operation will describe the location
of _one_ argument; there is no series of operations to carry out in order
to retrieve the argument's value. This also makes re-using location
expressions across multiple arguments/functions through de-duplication.

> I'd propose we add a DATASEC containing functions and their addresses. A
> FUNC datasec it could be laid out as follows
> 
> struct btf_func_secinfo {
> __u32 type;
> __u32 offset;
> __u32 loc;
> };

We'd have a new BTF_KIND_FUNCSEC type followed by 'vlen' btf_func_secinfo.
I see how 'type' and 'offset' can be used to disambiguate between functions
sharing the same name, but I'm confused by 'loc'. Functions with multiple
arguments will need a location expression for each of them.
How about having another 'vlen', followed by the offsets into the location
section?

struct btf_func_secinfo {
__u32 type;
__u32 offset;
__u32 vlen;
// Followed by: __u32 locs[vlen];
}

Or did you have something else in mind?

> In the case of 1 (multiple signatures for a function name) the DATASEC
> will have entries for each site which tie it to its associated FUNC.
> This allows us to clarify which function is associated with which
> address. So the type is the BTF_KIND_FUNC, the offset the address and
> the loc is 0 since we don't need it for this case since the functions
> have parameters in expected locations.
> 
> In the case of 2 (functions with inconsistent representations) we use
> the type to point at the FUNC, the offset the address of the function
> and the loc to represent location info for that site. By leaving out
> caller/callee info from location data we could potentially exploit the
> fact that a lot of locations have similar layouts in terms of where
> parameters are available, making dedup of location info possible.
> Caller/callee relationship can still be inferred via the site address.
> 
> Finally in case 3 we have inlines which would be represented similarly
> to case 2; i.e. we marry a representation of the function (the type) to
> the associated inline site via location data in the loc field.

Here's how it could look like:

[1] FUNC_PROTO ...
      ...args
[2] FUNC 'foo' type_id=1   # 1. name collision with [4]
[3] FUNC_PROTO ...
      ...args
[4] FUNC 'foo' type_id=3   # 1. name collision with [2]
[5] FUNC_PROTO ...
      ...args
[6] FUNC 'bar' type_id=5   # 2. non-standard arguments location
[7] FUNC_PROTO ...
      ...args
[8] FUNC 'baz' type_id=7   # 3-4. partially/fully inlined function
[9] FUNCSEC '.text', vlen=5
  - type_id=2, offset=0x1000, loc=0 # 1. share the same name, but
  - type_id=4, offset=0x2000, loc=0 #    differentiate with the offset
  - type_id=6, offset=0x3000, loc=??? # 2. non-standard args location
    * offset of arg0 locexpr: 0x1234  #    each arg gets a loc offset
    * offset of arg1 locexpr: 0x5678  #    or some other encoding?
  - type_id=8, offset=0x4000, loc=0   # 4. non-inlined instance
  - type_id=8, offset=0x1050, loc=??? # 3. inlined instance
    * # ...args loc offsets

> If so, the question becomes what are we missing today? As far as I can
> see we need
> 
> - support for new kinds BTF_KIND_FUNC_DATASEC, or simply use the kind
> flag for existing BTF datasec to indicate function info
> - support for new location kind
> - pahole support to generate address-based datasec and location separately
> - for modules, we would eventually need multi-split BTF that would allow
> the func aux section to be split BTF on top of existing module BTF, i.e.
> a 3-level split BTF

Do you think locations should be part of the 'type info section'?
Or should they have their own 'location section'?

For modules, I'm less familiar with them.
Would you have some guidance about their requirements?

> As I think some of the challenges you ran into implementing this
> indicate, the current approach of matching ELF and DWARF info via name
> only is creaking at the seams, and needs to be reworked (in fact it is
> the source of a bug Alexei ran into around missing kfuncs). So I'm
> hoping to get a patch out this week that uses address info to aid the
> matching between ELF/DWARF, and from there it's a short jump to using it
> in DATASEC representations.
> 
> Anyway let me know what you think. If it sounds workable we could
> perhaps try prototyping the pieces and see if we can get them working
> with location info.

I'll look into emitting functions that are currently not represented,
because they fall in the pitfalls 1-4. That will give us the base for
the new .BTF.func_aux section.
I'm looking forward to use your patch to simplify the linking between
DWARF and BTF.

Thanks for your time and have a great day,
Thierry

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
  2025-05-01 19:38   ` Thierry Treyer
@ 2025-05-02  8:31     ` Alan Maguire
  0 siblings, 0 replies; 21+ messages in thread
From: Alan Maguire @ 2025-05-02  8:31 UTC (permalink / raw)
  To: Thierry Treyer
  Cc: dwarves@vger.kernel.org, bpf@vger.kernel.org, acme@kernel.org,
	ast@kernel.org, Yonghong Song, andrii@kernel.org,
	ihor.solodrai@linux.dev, Song Liu, Mykola Lysenko

On 01/05/2025 20:38, Thierry Treyer wrote:
>> 1. multiple functions with the same name and different function
>> signatures. Since we have no mechanism currently to associate function
>> and site we simply refuse to encode them in BTF today
>> 2. functions with inconsistent representations. If a function does not
>> use the expected registers for its function signature due to
>> optimizations we leave it out of BTF representation; and of course
>> 3. inline functions are not currently represented at all.
>>
>> I think we can do a better job with 1 and 2 while solving 3 as well.
>> Here's my suggestion.
> 
> I see how your approach covers all these problems!
> I would also add the following issue, which is a variant of 2 and 3:
> 
> 4. Partially inlined functions: functions having a symbol, but is also
>         inlined at some call sites. Currently not represented either.
> 

exactly, should have mentioned that too as this is a real pain point for
tracing; we think we are seeing every time a function is hit, but since
it is partially inlined we do not.

>> First, we separate functions which have complicated relationships with
>> their parameters (cases 1, 2 and 3) into a separate .BTF.func_aux
>> section or similar. That can be delivered in vmlinux or via a
>> special-purpose module; for modules it would be just a separate ELF
>> section as it would likely be small. We can control access to ensure
>> unprivileged users cannot get address information, hence the separation
>> from vmlinux BTF. But it is just (split) BTF, so no new format required.
>>
>> The advantage of this is that tracers today can do the straightforward
>> tracing of functions from /sys/kernel/btf/vmlinux, and if a function is
>> not there and the tracer supports handling more complex cases, it can
>> know to look in /sys/kernel/btf/vmlinux.func_aux.
> 
> Sounds good to me!
> Laying out the format of this new .BTF.func_aux section:
> 
> +---------------------+
> | BTF.func_aux header |
> +---------------------+
> ~  type info section  ~
> +---------------------+
> ~   string section    ~
> +---------------------+
> ~  location section   ~
> +---------------------+
>

I'd say we could keep location info in the type section; that way we
benefit from existing dedup mechanisms (though we'd have to extend them
to cover locations).

>> In that section we have a split BTF representation in which function
>> signatures for cases 1, 2, and 3 are represented in the usual way (FUNC
>> pointing at FUNC_PROTO). However since we know that the relationship
>> between function and its site(s) is complex, we need some extra info.
> 
> We have the same base as the BTF section, so we can encode FUNC and
> FUNC_PROTO in the 'type info section'. The strings for the new functions'
> names get deduplicated and stored in the 'string section'.
> 
> The 'location section' lists location expressions to locate the arguments.
> As discussed with Alexei, _one_ LOC_* operation will describe the location
> of _one_ argument; there is no series of operations to carry out in order
> to retrieve the argument's value. This also makes re-using location
> expressions across multiple arguments/functions through de-duplication.
> 
>> I'd propose we add a DATASEC containing functions and their addresses. A
>> FUNC datasec it could be laid out as follows
>>
>> struct btf_func_secinfo {
>> __u32 type;
>> __u32 offset;
>> __u32 loc;
>> };
> 
> We'd have a new BTF_KIND_FUNCSEC type followed by 'vlen' btf_func_secinfo.
> I see how 'type' and 'offset' can be used to disambiguate between functions
> sharing the same name, but I'm confused by 'loc'. Functions with multiple
> arguments will need a location expression for each of them.
> How about having another 'vlen', followed by the offsets into the location
> section?
> 
> struct btf_func_secinfo {
> __u32 type;
> __u32 offset;
> __u32 vlen;
> // Followed by: __u32 locs[vlen];
> }
> 
> Or did you have something else in mind?
> 

Good point, didn't think this through. I guess we should probably
experiment here to find the most compact representation but could the
location be a BTF type BTF_KIND_LOC with a vlen specifying the number of
parameter location representations? Then we'd follow it with a location
representation for each (like what we do for struct members using a
vlen-length series of "struct btf_member"s in a BTF_KIND_STRUCT.
That would be easier I think than having variable length sec info entries.

The interesting question will be with deduplication of identical
location info, which representation is more compact? I'd say we should
try a few variations and see which one wins in practice.

>> In the case of 1 (multiple signatures for a function name) the DATASEC
>> will have entries for each site which tie it to its associated FUNC.
>> This allows us to clarify which function is associated with which
>> address. So the type is the BTF_KIND_FUNC, the offset the address and
>> the loc is 0 since we don't need it for this case since the functions
>> have parameters in expected locations.
>>
>> In the case of 2 (functions with inconsistent representations) we use
>> the type to point at the FUNC, the offset the address of the function
>> and the loc to represent location info for that site. By leaving out
>> caller/callee info from location data we could potentially exploit the
>> fact that a lot of locations have similar layouts in terms of where
>> parameters are available, making dedup of location info possible.
>> Caller/callee relationship can still be inferred via the site address.
>>
>> Finally in case 3 we have inlines which would be represented similarly
>> to case 2; i.e. we marry a representation of the function (the type) to
>> the associated inline site via location data in the loc field.
> 
> Here's how it could look like:
> 
> [1] FUNC_PROTO ...
>       ...args
> [2] FUNC 'foo' type_id=1   # 1. name collision with [4]
> [3] FUNC_PROTO ...
>       ...args
> [4] FUNC 'foo' type_id=3   # 1. name collision with [2]
> [5] FUNC_PROTO ...
>       ...args
> [6] FUNC 'bar' type_id=5   # 2. non-standard arguments location
> [7] FUNC_PROTO ...
>       ...args
> [8] FUNC 'baz' type_id=7   # 3-4. partially/fully inlined function
> [9] FUNCSEC '.text', vlen=5
>   - type_id=2, offset=0x1000, loc=0 # 1. share the same name, but
>   - type_id=4, offset=0x2000, loc=0 #    differentiate with the offset
>   - type_id=6, offset=0x3000, loc=??? # 2. non-standard args location
>     * offset of arg0 locexpr: 0x1234  #    each arg gets a loc offset
>     * offset of arg1 locexpr: 0x5678  #    or some other encoding?
>   - type_id=8, offset=0x4000, loc=0   # 4. non-inlined instance
>   - type_id=8, offset=0x1050, loc=??? # 3. inlined instance
>     * # ...args loc offsets
> 
>> If so, the question becomes what are we missing today? As far as I can
>> see we need
>>
>> - support for new kinds BTF_KIND_FUNC_DATASEC, or simply use the kind
>> flag for existing BTF datasec to indicate function info
>> - support for new location kind
>> - pahole support to generate address-based datasec and location separately
>> - for modules, we would eventually need multi-split BTF that would allow
>> the func aux section to be split BTF on top of existing module BTF, i.e.
>> a 3-level split BTF
> 
> Do you think locations should be part of the 'type info section'?
> Or should they have their own 'location section'?
>

I'd say part of the type info section as we can use existing dedup with
tweaks to support location info.

> For modules, I'm less familiar with them.
> Would you have some guidance about their requirements?
> 

For modules we use split BTF to represent the type/function info etc. So
we deduplicate the module type representation, removing redundant info
already in the vmlinux BTF. Then the split BTF that covers type info
only in the kernel starts at last_vmlinux_BTF_id + 1.

So if we are going to separate location/inline info, we'd probably use a
similar split BTF approach, but the challenge with modules is that we'd
be using split BTF on top of existing module BTF, which is itself split
BTF. So we'd need to support a 3-level split where the location-related
BTF can refer to types in the module BTF (split) or the base BTF
(vmlinux BTF). We haven't got any cases of such multi-split BTF but it's
something we've discussed in the past and would probably be quite
doable; would just need kernel and libbpf support.

>> As I think some of the challenges you ran into implementing this
>> indicate, the current approach of matching ELF and DWARF info via name
>> only is creaking at the seams, and needs to be reworked (in fact it is
>> the source of a bug Alexei ran into around missing kfuncs). So I'm
>> hoping to get a patch out this week that uses address info to aid the
>> matching between ELF/DWARF, and from there it's a short jump to using it
>> in DATASEC representations.
>>
>> Anyway let me know what you think. If it sounds workable we could
>> perhaps try prototyping the pieces and see if we can get them working
>> with location info.
> 
> I'll look into emitting functions that are currently not represented,
> because they fall in the pitfalls 1-4. That will give us the base for
> the new .BTF.func_aux section.

Sounds great!

> I'm looking forward to use your patch to simplify the linking between
> DWARF and BTF.
> 

I've sent a series to dwarves [1] that starts down that road storing
function addresses internally in pahole but not emitting them yet. Needs
more work but could be a basis for us to start working to emit function
datasec sections.

Thanks!

Alan

[1]
https://lore.kernel.org/dwarves/20250501145645.3317264-1-alan.maguire@oracle.com/

> Thanks for your time and have a great day,
> Thierry


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
  2025-04-16 19:20 [PATCH RFC 0/3] list inline expansions in .BTF.inline Thierry Treyer via B4 Relay
                   ` (4 preceding siblings ...)
  2025-04-30 15:25 ` Alan Maguire
@ 2025-05-19 12:02 ` Alan Maguire
  2025-05-22 17:56   ` Thierry Treyer
  5 siblings, 1 reply; 21+ messages in thread
From: Alan Maguire @ 2025-05-19 12:02 UTC (permalink / raw)
  To: ttreyer, dwarves, bpf
  Cc: acme, ast, yhs, andrii, ihor.solodrai, songliubraving, mykolal

hi folks

I just wanted to try and capture some of the discussion from last week's
BPF office hours where we talked about this and hopefully we can
together plot a path forward that supports inline representation and
helps us fix some other long-standing issues with more complex function
representation. If I've missed anything important or if anything looks
wrong, please do chime in!

In discussing this, we concluded that

- separating the complex function representations into a separate .BTF
section (.BTF.func_aux or something like it) would be valuable since it
means tracers can continue to interact with existing function
representations that have a straightforward relationship between their
parameters and calling conventions stored in the .BTF section, and can
optionally also utilize the auxiliary function information in .BTF.func_aux

- this gives us a bit more freedom to add new kinds etc to that
auxiliary function info, and also to control unauthorized access that
might be able to retrieve a function address or other potentially
sensitive info from the aux function data

- it also means that the only kernel support we would likely initially
need to add would be to allow reading of
/sys/kernel/btf/vmlinux.func_aux , likely via a dummy module supporting
sysfs read.

- for modules, we would need to support multi-split BTF, i.e split BTF
in .BTF.func_aux in the module that sits atop the .BTF section of the
module which in turn sits atop the vmlinux BTF.  Again only userspace
and tooling support would likely be needed as a first step. I'm looking
at this now and it may require no or minimal code changes to libbpf,
just testing of the feature.  bpftool and pahole would need to support a
means of specifying multiple base BTFs in order, but that seems doable too.

We were less conclusive on the final form of the representation, but it
would ideally help support fully and partially inlined representations
and other situations we have today where the calling
convention-specified registers and the function parameters do not
cleanly line up. Today we leave such representations out of BTF but a
location representation would allow us to add them back in. Similarly
for functions with the same name but different signatures, having a
function address to clarify which signature goes with which site will help.

Again we don't have to solve all these problems at once but having them
in mind as we figure out the right form of the representation will help.

Something along the lines of the variable section where we have triples
of <function type id, site address, location BTF id> for each function
site will play a role. Again the exact form of the location data is TBD,
but we can experiment here to maximize compactness. Andrii pointed out a
BTF kind representation may waste bytes; for example a location will
likely not require a name offset string representation. Could be an
index into an array of location descriptions perhaps. Would be nice to
make use of dedup for locations too, likely within pahole rather than
BTF dedup proper. An empirical question is how much dedup will help,
likely we will just have to try and see.

So based on this I think our next steps are:

1. add address info to pahole; I'm working on a proof-of-concept on this
hope to have a newer version out this week. Address info would be needed
for functions that we wish to represent in the aux section as a way of
associating a function site with a location representation.
2. refine the representation of inline info, exploring adding new
kind(s) to UAPI btf.h if needed. This would likely mean new APIs in
libbpf to add locations and function site info.
3. explore multi-split BTF, adding libbpf-related tests for
creation/manipulation of split BTF where the base is another split BTF.
Multi-split BTF would be needed for module function aux info

I'm hoping we can remove any blocks to further progress; task 3 above
can be tackled in parallel while we explore vmlinux inline
representation (multi-split is only needed for the module case where the
aux info is created atop the module split BTF). I'm hoping to have a bit
more done on task 1 later this week. So hopefully there's nothing here
that impedes making progress on the inline problem.

Again if there's anything I've missed above or that seems unclear,
please do follow up. It's really positive that we're tackling this issue
so I want to make sure that nothing gets in the way of progressing this.
Thanks again!

Alan


On 16/04/2025 20:20, Thierry Treyer via B4 Relay wrote:
> This proposal extends BTF to list the locations of inlined functions and
> their arguments in a new '.BTF.inline` section.
> 
> == Background ==
> 
> Inline functions are often a blind spot for profiling and tracing tools:
> * They cannot probe fully inlined functions.
>   The BTF contains no data about them.
> * They miss calls to partially inlined functions,
>   where a function has a symbol, but is also inlined in some callers.
> * They cannot account for time spent in inlined calls.
>   Instead, they report the time to the caller.
> * They don't provide a way to access the arguments of an inlined call.
> 
> The issue is exacerbated by Link-Time Optimization, which enables more
> inlining across Object files. One workaround is to disable inlining for
> the profiled functions, but that requires a whole kernel compilation and
> doesn't allow for iterative exploration.
> 
> The information required to solve the above problems is not easily
> accessible. It requires parsing most of the DWARF's '.debug_info` section,
> which is time consuming and not trivial.
> Instead, this proposal leverages and extends the existing information
> contained in '.BTF` (for typing) and '.BTF.ext` (for caller location),
> with information from a new section called '.BTF.inline`,
> listing inlined instances.
> 
> == .BTF.inline Section ==
> 
> The new '.BTF.inline` section has a layout similar to '.BTF`.
> 
>  off |0-bit      |16-bits  |24-bits  |32-bits                           |
> -----+-----------+---------+---------+----------------------------------+
> 0x00 |   magic   | version |  flags  |          header length           |
> 0x08 |      inline info offset       |        inline info length        |
> 0x10 |        location offset        |          location length         |
> -----+------------------------------------------------------------------+
>      ~                        inline info section                       ~
> -----+------------------------------------------------------------------+
>      ~                          location section                        ~
> -----+------------------------------------------------------------------+
> 
> It starts with a header (see 'struct btf_inline_header`),
> followed by two subsections:
> 1. The 'Inline Info' section contains an entry for each inlined function.
>    Each entry describes the instance's location in its caller and is
>    followed by the offsets in the 'Location' section of the parameters
>    location expressions. See 'struct btf_inline_instance`.
> 2. The 'Location' section contains location expressions describing how
>    to retrieve the value of a parameter. The expressions are NULL-
>    terminated and are adressed similarly to '.BTF`'s string table.
> 
> struct btf_inline_header {
>   uint16_t magic;
>   uint8_t version, flags;
>   uint32_t header_length;
>   uint32_t inline_info_offset, inline_info_length;
>   uint32_t location_offset, location_length;
> };
> 
> struct btf_inline_instance {
>   type_id_t callee_id;     // BTF id of the inlined function
>   type_id_t caller_id;     // BTF id of the caller
>   uint32_t caller_offset;  // offset of the callee within the caller
>   uint16_t nr_parms;       // number of parameters
> //uint32_t parm_location[nr_parms];  // offset of the location expression
> };                                   // in 'Location' for each parameter
> 
> == Location Expressions ==
> 
> We looked at the DWARF location expressions for the arguments of inlined
> instances having <= 100 instances, on a production kernel v6.9.0. This
> yielded 176,800 instances with 269,327 arguments. We learned that most
> expressions are simple register access, perhaps with an offset. We would
> get access to 87% of the arguments by implementing literal and register.
> 
> Op. Category      Expr. Count    Expr. %
> ----------------------------------------
> literal                 10714      3.98%
> register+above         234698     87.14%
> arithmetic+above       239444     88.90%
> composite+above        240394     89.26%
> stack+above            242075     89.88%
> empty                   27252     10.12%
> 
> We propose to re-encode DWARF location expressions into a custom BTF
> location expression format. It operates on a stack of values, similar to
> DWARF's location expressions, but is stripped of unused operators,
> while allowing future expansions.
> 
> A location expression is composed of a series of operations, terminated
> by a NULL-byte/LOC_END_OF_EXPR operator. The very first expression in the
> 'Location' subsection must be an empty expression constisting only of
> LOC_END_OF_EXPR.
> 
> An operator is a tagged union: the tag describes the operation to carry
> out and the union contains the operands.
>  
>  ID | Operator Name        | Operands[...]
> ----+----------------------+-------------------------------------------
>   0 | LOC_END_OF_EXPR      | _none_
>   1 | LOC_SIGNED_CONST_1   |  s8: constant's value
>   2 | LOC_SIGNED_CONST_2   | s16: constant's value
>   3 | LOC_SIGNED_CONST_4   | s32: constant's value
>   4 | LOC_SIGNED_CONST_8   | s64: constant's value
>   5 | LOC_UNSIGNED_CONST_1 |  u8: constant's value
>   6 | LOC_UNSIGNED_CONST_2 | u16: constant's value
>   7 | LOC_UNSIGNED_CONST_4 | u32: constant's value
>   8 | LOC_UNSIGNED_CONST_8 | u64: constant's value
>   9 | LOC_REGISTER         |  u8: DWARF register number from the ABI
>  10 | LOC_REGISTER_OFFSET  |  u8: DWARF register number from the ABI
>                            | s64: offset added to the register's value
>  11 | LOC_DEREF            |  u8: size of the deref'd type
> 
> This list should be further expanded to include arithmetic operations.
> 
> Example: accessing a field at offset 12B from a struct whose adresse is
>          in the '%rdi` register, on amd64, has the following encoding:
> 
> [0x0a 0x05 0x000000000000000c] [0x0b 0x04] [0x00]
>  |    |    ` Offset Added       |    |      ` LOC_END_OF_EXPR
>  |    ` Register Number         |    ` Size of Deref.
>  ` LOC_REGISTER_OFFSET          ` LOC_DEREF
> 
> == Summary ==
> 
> Combining the new information from '.BTF.inline` with the existing data
> from '.BTF` and '.BTF.ext`, tools will be able to locate inline functions
> and their arguments. Symbolizer can also use the data to display the
> functions inlined at a given address.
> 
> Fully inlined functions are not part of the BTF and thus are not covered
> by this proposal. Adding them to the BTF would enable their coverage and
> should be considered.
> 
> Signed-off-by: Thierry Treyer <ttreyer@meta.com>
> ---
> Thierry Treyer (3):
>       dwarf_loader: Add parameters list to inlined expansion
>       dwarf_loader: Add name to inline expansion
>       inline_encoder: Introduce inline encoder to emit BTF.inline
> 
>  CMakeLists.txt   |   3 +-
>  btf_encoder.c    |   5 +
>  btf_encoder.h    |   2 +
>  btf_inline.pk    |  55 ++++++
>  dwarf_loader.c   | 176 ++++++++++++--------
>  dwarves.c        |  26 +++
>  dwarves.h        |   7 +
>  inline_encoder.c | 496 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  inline_encoder.h |  25 +++
>  pahole.c         |  40 ++++-
>  10 files changed, 765 insertions(+), 70 deletions(-)
> ---
> base-commit: 4ef47f84324e925051a55de10f9a4f44ef1da844
> change-id: 20250416-btf_inline-e5047eea9b6f
> 
> Best regards,


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
  2025-05-19 12:02 ` Alan Maguire
@ 2025-05-22 17:56   ` Thierry Treyer
  2025-05-22 20:03     ` Andrii Nakryiko
  2025-05-23 17:33     ` Alexei Starovoitov
  0 siblings, 2 replies; 21+ messages in thread
From: Thierry Treyer @ 2025-05-22 17:56 UTC (permalink / raw)
  To: Alan Maguire, dwarves@vger.kernel.org, bpf@vger.kernel.org
  Cc: acme@kernel.org, ast@kernel.org, Yonghong Song, andrii@kernel.org,
	ihor.solodrai@linux.dev, Song Liu, Mykola Lysenko, Daniel Xu

Hello everyone,

Here are the estimates for the different encoding schemes we discussed:
- parameters' location takes ~1MB without de-duplication,
- parameters' location shrinks to ~14kB when de-duplicated,
- instead of de-duplicating the individual locations,
  de-duplicating functions' parameter lists yields 187kB of locations data.

We also need to take into account the size of the corresponding funcsec
table, which starts at 3.6MB. The full details follows:

  1) // params_offset points to the first parameter's location
     struct fn_info { u32 type_id, offset, params_offset; };
  2) // param_offsets point to each parameters' location
     struct fn_info { u32 type_id, offset; u16 param_offsets[proto.arglen]; };
  3) // locations are stored inline, in the funcsec table
     struct fn_info { u32 type_id, offset; loc inline_locs[proto.arglen]; };

  Params encoding             Locations Size   Funcsec Size   Total Size
  ======================================================================
  (1) param list, no dedup         1,017,654      5,467,824    6,485,478
  (1) param list, w/ dedup           187,379      5,467,824    5,655,203
  (2) param offsets, w/ dedup         14,526      4,808,838    4,823,364
  (3) param list inline            1,017,654      3,645,216    4,662,870

  Estimated size in bytes of the new .BTF.func_aux section, from a
  production kernel v6.9. It includes both partially and fully inlined
  functions in the funcsec tables, with all their parameters, either inline
  or in their own sub-section. It does not include type information that
  would be required to handle fully inlined functions, functions with
  conflicting name, and functions with conflicting prototypes.

  The deduplicated locations in 2) are small enough to be indexed by a u16.

Storing the locations inline uses the least amount of space. Followed by
storing inline a list of offsets to the locations. Neither of these
approaches have fixed size records in funcsec. "param list, w/ dedup" is
~1MB larger than inlined locations, but has fixed size records.

In all cases, the funcsec table uses the most space, compared to the
locations. The size of the `type` sub-section will also grow when we add
the missing type information for fully inlined functions, functions with
conflicting name, and functions with conflicting prototypes.

With fixed size records in the funcsec table, we'd get faster lookup by
sorting by `type_id` or `offset`.  bpftrace could efficiently search the
lower bound of a `type_id` to instrument all its inline instances.
Symbolication tools could efficiently search for inline functions at a
given offset.

However, it would rule out the most efficient encoding.
How do we want to approach this tradeoff?

> 2. refine the representation of inline info, exploring adding new
> kind(s) to UAPI btf.h if needed. This would likely mean new APIs in
> libbpf to add locations and function site info.


I currently have a pahole prototype to emit "param list, no dedup" and am
close to a patch adding FUNCSEC to libbpf. I was wondering if it would make
sense for FUNCSEC to be a DATASEC with its 'kind_flag` set?

Let me know if you have any questions or have new ideas for the encoding!

Have a great day,
Thierry


> On 19 May 2025, at 13:02, Alan Maguire <alan.maguire@oracle.com> wrote:
> 
> hi folks
> 
> I just wanted to try and capture some of the discussion from last week's
> BPF office hours where we talked about this and hopefully we can
> together plot a path forward that supports inline representation and
> helps us fix some other long-standing issues with more complex function
> representation. If I've missed anything important or if anything looks
> wrong, please do chime in!
> 
> In discussing this, we concluded that
> 
> - separating the complex function representations into a separate .BTF
> section (.BTF.func_aux or something like it) would be valuable since it
> means tracers can continue to interact with existing function
> representations that have a straightforward relationship between their
> parameters and calling conventions stored in the .BTF section, and can
> optionally also utilize the auxiliary function information in .BTF.func_aux
> 
> - this gives us a bit more freedom to add new kinds etc to that
> auxiliary function info, and also to control unauthorized access that
> might be able to retrieve a function address or other potentially
> sensitive info from the aux function data
> 
> - it also means that the only kernel support we would likely initially
> need to add would be to allow reading of
> /sys/kernel/btf/vmlinux.func_aux , likely via a dummy module supporting
> sysfs read.
> 
> - for modules, we would need to support multi-split BTF, i.e split BTF
> in .BTF.func_aux in the module that sits atop the .BTF section of the
> module which in turn sits atop the vmlinux BTF.  Again only userspace
> and tooling support would likely be needed as a first step. I'm looking
> at this now and it may require no or minimal code changes to libbpf,
> just testing of the feature.  bpftool and pahole would need to support a
> means of specifying multiple base BTFs in order, but that seems doable too.
> 
> We were less conclusive on the final form of the representation, but it
> would ideally help support fully and partially inlined representations
> and other situations we have today where the calling
> convention-specified registers and the function parameters do not
> cleanly line up. Today we leave such representations out of BTF but a
> location representation would allow us to add them back in. Similarly
> for functions with the same name but different signatures, having a
> function address to clarify which signature goes with which site will help.
> 
> Again we don't have to solve all these problems at once but having them
> in mind as we figure out the right form of the representation will help.
> 
> Something along the lines of the variable section where we have triples
> of <function type id, site address, location BTF id> for each function
> site will play a role. Again the exact form of the location data is TBD,
> but we can experiment here to maximize compactness. Andrii pointed out a
> BTF kind representation may waste bytes; for example a location will
> likely not require a name offset string representation. Could be an
> index into an array of location descriptions perhaps. Would be nice to
> make use of dedup for locations too, likely within pahole rather than
> BTF dedup proper. An empirical question is how much dedup will help,
> likely we will just have to try and see.
> 
> So based on this I think our next steps are:
> 
> 1. add address info to pahole; I'm working on a proof-of-concept on this
> hope to have a newer version out this week. Address info would be needed
> for functions that we wish to represent in the aux section as a way of
> associating a function site with a location representation.
> 2. refine the representation of inline info, exploring adding new
> kind(s) to UAPI btf.h if needed. This would likely mean new APIs in
> libbpf to add locations and function site info.
> 3. explore multi-split BTF, adding libbpf-related tests for
> creation/manipulation of split BTF where the base is another split BTF.
> Multi-split BTF would be needed for module function aux info
> 
> I'm hoping we can remove any blocks to further progress; task 3 above
> can be tackled in parallel while we explore vmlinux inline
> representation (multi-split is only needed for the module case where the
> aux info is created atop the module split BTF). I'm hoping to have a bit
> more done on task 1 later this week. So hopefully there's nothing here
> that impedes making progress on the inline problem.
> 
> Again if there's anything I've missed above or that seems unclear,
> please do follow up. It's really positive that we're tackling this issue
> so I want to make sure that nothing gets in the way of progressing this.
> Thanks again!
> 
> Alan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
  2025-05-22 17:56   ` Thierry Treyer
@ 2025-05-22 20:03     ` Andrii Nakryiko
  2025-05-22 20:16       ` Eduard Zingerman
  2025-05-23 17:33     ` Alexei Starovoitov
  1 sibling, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2025-05-22 20:03 UTC (permalink / raw)
  To: Thierry Treyer
  Cc: Alan Maguire, dwarves@vger.kernel.org, bpf@vger.kernel.org,
	acme@kernel.org, ast@kernel.org, Yonghong Song, andrii@kernel.org,
	ihor.solodrai@linux.dev, Song Liu, Mykola Lysenko, Daniel Xu

On Thu, May 22, 2025 at 10:56 AM Thierry Treyer <ttreyer@meta.com> wrote:
>
> Hello everyone,
>
> Here are the estimates for the different encoding schemes we discussed:
> - parameters' location takes ~1MB without de-duplication,
> - parameters' location shrinks to ~14kB when de-duplicated,
> - instead of de-duplicating the individual locations,
>   de-duplicating functions' parameter lists yields 187kB of locations data.
>
> We also need to take into account the size of the corresponding funcsec
> table, which starts at 3.6MB. The full details follows:
>
>   1) // params_offset points to the first parameter's location
>      struct fn_info { u32 type_id, offset, params_offset; };
>   2) // param_offsets point to each parameters' location
>      struct fn_info { u32 type_id, offset; u16 param_offsets[proto.arglen]; };
>   3) // locations are stored inline, in the funcsec table
>      struct fn_info { u32 type_id, offset; loc inline_locs[proto.arglen]; };
>
>   Params encoding             Locations Size   Funcsec Size   Total Size
>   ======================================================================
>   (1) param list, no dedup         1,017,654      5,467,824    6,485,478
>   (1) param list, w/ dedup           187,379      5,467,824    5,655,203
>   (2) param offsets, w/ dedup         14,526      4,808,838    4,823,364

This one is almost as good as (3) below, but fits better into the
existing kind+vlen model where there is a variable number of fixed
sized elements (but locations can still be variable-sized and keep
evolving much more easily). I'd go with this one, unless I'm missing
some important benefit of other representations.

>   (3) param list inline            1,017,654      3,645,216    4,662,870
>
>   Estimated size in bytes of the new .BTF.func_aux section, from a
>   production kernel v6.9. It includes both partially and fully inlined
>   functions in the funcsec tables, with all their parameters, either inline
>   or in their own sub-section. It does not include type information that
>   would be required to handle fully inlined functions, functions with
>   conflicting name, and functions with conflicting prototypes.
>
>   The deduplicated locations in 2) are small enough to be indexed by a u16.
>
> Storing the locations inline uses the least amount of space. Followed by
> storing inline a list of offsets to the locations. Neither of these
> approaches have fixed size records in funcsec. "param list, w/ dedup" is
> ~1MB larger than inlined locations, but has fixed size records.
>
> In all cases, the funcsec table uses the most space, compared to the
> locations. The size of the `type` sub-section will also grow when we add
> the missing type information for fully inlined functions, functions with
> conflicting name, and functions with conflicting prototypes.
>
> With fixed size records in the funcsec table, we'd get faster lookup by
> sorting by `type_id` or `offset`.  bpftrace could efficiently search the
> lower bound of a `type_id` to instrument all its inline instances.
> Symbolication tools could efficiently search for inline functions at a
> given offset.
>
> However, it would rule out the most efficient encoding.
> How do we want to approach this tradeoff?
>
> > 2. refine the representation of inline info, exploring adding new
> > kind(s) to UAPI btf.h if needed. This would likely mean new APIs in
> > libbpf to add locations and function site info.
>
>
> I currently have a pahole prototype to emit "param list, no dedup" and am
> close to a patch adding FUNCSEC to libbpf. I was wondering if it would make
> sense for FUNCSEC to be a DATASEC with its 'kind_flag` set?

Why abuse DATASEC if we are extending BTF with new types anyways? I'd
go with a dedicated FUNCSEC (or FUNCSET, maybe?..)

BTW, Alan, you've been working on self-describing BTF (size per fixed
part of kind + size per vlen items). Any update on that one? Did you
get blocked on it somewhere?

>
> Let me know if you have any questions or have new ideas for the encoding!
>
> Have a great day,
> Thierry
>
>
> > On 19 May 2025, at 13:02, Alan Maguire <alan.maguire@oracle.com> wrote:
> >
> > hi folks
> >
> > I just wanted to try and capture some of the discussion from last week's
> > BPF office hours where we talked about this and hopefully we can
> > together plot a path forward that supports inline representation and
> > helps us fix some other long-standing issues with more complex function
> > representation. If I've missed anything important or if anything looks
> > wrong, please do chime in!
> >
> > In discussing this, we concluded that
> >
> > - separating the complex function representations into a separate .BTF
> > section (.BTF.func_aux or something like it) would be valuable since it
> > means tracers can continue to interact with existing function
> > representations that have a straightforward relationship between their
> > parameters and calling conventions stored in the .BTF section, and can
> > optionally also utilize the auxiliary function information in .BTF.func_aux
> >
> > - this gives us a bit more freedom to add new kinds etc to that
> > auxiliary function info, and also to control unauthorized access that
> > might be able to retrieve a function address or other potentially
> > sensitive info from the aux function data
> >
> > - it also means that the only kernel support we would likely initially
> > need to add would be to allow reading of
> > /sys/kernel/btf/vmlinux.func_aux , likely via a dummy module supporting
> > sysfs read.
> >
> > - for modules, we would need to support multi-split BTF, i.e split BTF
> > in .BTF.func_aux in the module that sits atop the .BTF section of the
> > module which in turn sits atop the vmlinux BTF.  Again only userspace
> > and tooling support would likely be needed as a first step. I'm looking
> > at this now and it may require no or minimal code changes to libbpf,
> > just testing of the feature.  bpftool and pahole would need to support a
> > means of specifying multiple base BTFs in order, but that seems doable too.
> >
> > We were less conclusive on the final form of the representation, but it
> > would ideally help support fully and partially inlined representations
> > and other situations we have today where the calling
> > convention-specified registers and the function parameters do not
> > cleanly line up. Today we leave such representations out of BTF but a
> > location representation would allow us to add them back in. Similarly
> > for functions with the same name but different signatures, having a
> > function address to clarify which signature goes with which site will help.
> >
> > Again we don't have to solve all these problems at once but having them
> > in mind as we figure out the right form of the representation will help.
> >
> > Something along the lines of the variable section where we have triples
> > of <function type id, site address, location BTF id> for each function
> > site will play a role. Again the exact form of the location data is TBD,
> > but we can experiment here to maximize compactness. Andrii pointed out a
> > BTF kind representation may waste bytes; for example a location will
> > likely not require a name offset string representation. Could be an
> > index into an array of location descriptions perhaps. Would be nice to
> > make use of dedup for locations too, likely within pahole rather than
> > BTF dedup proper. An empirical question is how much dedup will help,
> > likely we will just have to try and see.
> >
> > So based on this I think our next steps are:
> >
> > 1. add address info to pahole; I'm working on a proof-of-concept on this
> > hope to have a newer version out this week. Address info would be needed
> > for functions that we wish to represent in the aux section as a way of
> > associating a function site with a location representation.
> > 2. refine the representation of inline info, exploring adding new
> > kind(s) to UAPI btf.h if needed. This would likely mean new APIs in
> > libbpf to add locations and function site info.
> > 3. explore multi-split BTF, adding libbpf-related tests for
> > creation/manipulation of split BTF where the base is another split BTF.
> > Multi-split BTF would be needed for module function aux info
> >
> > I'm hoping we can remove any blocks to further progress; task 3 above
> > can be tackled in parallel while we explore vmlinux inline
> > representation (multi-split is only needed for the module case where the
> > aux info is created atop the module split BTF). I'm hoping to have a bit
> > more done on task 1 later this week. So hopefully there's nothing here
> > that impedes making progress on the inline problem.
> >
> > Again if there's anything I've missed above or that seems unclear,
> > please do follow up. It's really positive that we're tackling this issue
> > so I want to make sure that nothing gets in the way of progressing this.
> > Thanks again!
> >
> > Alan
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
  2025-05-22 20:03     ` Andrii Nakryiko
@ 2025-05-22 20:16       ` Eduard Zingerman
  2025-05-23 18:57         ` Thierry Treyer
  0 siblings, 1 reply; 21+ messages in thread
From: Eduard Zingerman @ 2025-05-22 20:16 UTC (permalink / raw)
  To: Andrii Nakryiko, Thierry Treyer
  Cc: Alan Maguire, dwarves@vger.kernel.org, bpf@vger.kernel.org,
	acme@kernel.org, ast@kernel.org, Yonghong Song, andrii@kernel.org,
	ihor.solodrai@linux.dev, Song Liu, Mykola Lysenko, Daniel Xu

On Thu, 2025-05-22 at 13:03 -0700, Andrii Nakryiko wrote:
> On Thu, May 22, 2025 at 10:56 AM Thierry Treyer <ttreyer@meta.com> wrote:
> > 
> > Hello everyone,
> > 
> > Here are the estimates for the different encoding schemes we discussed:
> > - parameters' location takes ~1MB without de-duplication,
> > - parameters' location shrinks to ~14kB when de-duplicated,
> > - instead of de-duplicating the individual locations,
> >   de-duplicating functions' parameter lists yields 187kB of locations data.
> > 
> > We also need to take into account the size of the corresponding funcsec
> > table, which starts at 3.6MB. The full details follows:
> > 
> >   1) // params_offset points to the first parameter's location
> >      struct fn_info { u32 type_id, offset, params_offset; };
> >   2) // param_offsets point to each parameters' location
> >      struct fn_info { u32 type_id, offset; u16 param_offsets[proto.arglen]; };
> >   3) // locations are stored inline, in the funcsec table
> >      struct fn_info { u32 type_id, offset; loc inline_locs[proto.arglen]; };
> > 
> >   Params encoding             Locations Size   Funcsec Size   Total Size
> >   ======================================================================
> >   (1) param list, no dedup         1,017,654      5,467,824    6,485,478
> >   (1) param list, w/ dedup           187,379      5,467,824    5,655,203
> >   (2) param offsets, w/ dedup         14,526      4,808,838    4,823,364
> 
> This one is almost as good as (3) below, but fits better into the
> existing kind+vlen model where there is a variable number of fixed
> sized elements (but locations can still be variable-sized and keep
> evolving much more easily). I'd go with this one, unless I'm missing
> some important benefit of other representations.

Thierry, could you please provide some details for the representation
of both fn_info and parameters for this case?
I'm curious how far this version is from exhausting u16 limit.

> 
> >   (3) param list inline            1,017,654      3,645,216    4,662,870
> > 
> >   Estimated size in bytes of the new .BTF.func_aux section, from a
> >   production kernel v6.9. It includes both partially and fully inlined
> >   functions in the funcsec tables, with all their parameters, either inline
> >   or in their own sub-section. It does not include type information that
> >   would be required to handle fully inlined functions, functions with
> >   conflicting name, and functions with conflicting prototypes.
> > 
> >   The deduplicated locations in 2) are small enough to be indexed by a u16.
> > 
> > Storing the locations inline uses the least amount of space. Followed by
> > storing inline a list of offsets to the locations. Neither of these
> > approaches have fixed size records in funcsec. "param list, w/ dedup" is
> > ~1MB larger than inlined locations, but has fixed size records.
> > 
> > In all cases, the funcsec table uses the most space, compared to the
> > locations. The size of the `type` sub-section will also grow when we add
> > the missing type information for fully inlined functions, functions with
> > conflicting name, and functions with conflicting prototypes.
> > 
> > With fixed size records in the funcsec table, we'd get faster lookup by
> > sorting by `type_id` or `offset`.  bpftrace could efficiently search the
> > lower bound of a `type_id` to instrument all its inline instances.
> > Symbolication tools could efficiently search for inline functions at a
> > given offset.
> > 
> > However, it would rule out the most efficient encoding.
> > How do we want to approach this tradeoff?

[...]


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
  2025-05-22 17:56   ` Thierry Treyer
  2025-05-22 20:03     ` Andrii Nakryiko
@ 2025-05-23 17:33     ` Alexei Starovoitov
  2025-05-23 18:35       ` Thierry Treyer
  1 sibling, 1 reply; 21+ messages in thread
From: Alexei Starovoitov @ 2025-05-23 17:33 UTC (permalink / raw)
  To: Thierry Treyer
  Cc: Alan Maguire, dwarves@vger.kernel.org, bpf@vger.kernel.org,
	acme@kernel.org, ast@kernel.org, Yonghong Song, andrii@kernel.org,
	ihor.solodrai@linux.dev, Song Liu, Mykola Lysenko, Daniel Xu

On Thu, May 22, 2025 at 10:56 AM Thierry Treyer <ttreyer@meta.com> wrote:
>
> Hello everyone,
>
> Here are the estimates for the different encoding schemes we discussed:
> - parameters' location takes ~1MB without de-duplication,
> - parameters' location shrinks to ~14kB when de-duplicated,
> - instead of de-duplicating the individual locations,
>   de-duplicating functions' parameter lists yields 187kB of locations data.
>
> We also need to take into account the size of the corresponding funcsec
> table, which starts at 3.6MB. The full details follows:
>
>   1) // params_offset points to the first parameter's location
>      struct fn_info { u32 type_id, offset, params_offset; };
>   2) // param_offsets point to each parameters' location
>      struct fn_info { u32 type_id, offset; u16 param_offsets[proto.arglen]; };
>   3) // locations are stored inline, in the funcsec table
>      struct fn_info { u32 type_id, offset; loc inline_locs[proto.arglen]; };
>
>   Params encoding             Locations Size   Funcsec Size   Total Size
>   ======================================================================
>   (1) param list, no dedup         1,017,654      5,467,824    6,485,478
>   (1) param list, w/ dedup           187,379      5,467,824    5,655,203
>   (2) param offsets, w/ dedup         14,526      4,808,838    4,823,364
>   (3) param list inline            1,017,654      3,645,216    4,662,870

I feel u16 offset isn't really viable. Sooner or later we'd need to bump it,
and then we will have a mix of u32 and u16 offsets.

The main question I have is why funcsec size is bigger for (1) ?
struct fn_info { u32 type_id, offset, params_offset; };

this is fixed size record and the number of them should be the same
as in (2) and (3), so single u32 params_offset should be smaller
than u16[arg_cnt], assuming that on average arg_cnt >= 2.

Or you meant that average arg_cnt <= 1,
but then the math is suspicious, since struct fn_info should
be 4-byte aligned as everything in BTF.

Also for (3), if locs are inlined, why "Locations Size" is not zero ?
Or the math for (3) is actually:
struct fn_info { u32 type_id, offset } * num_of_funcs ?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
  2025-05-23 17:33     ` Alexei Starovoitov
@ 2025-05-23 18:35       ` Thierry Treyer
  0 siblings, 0 replies; 21+ messages in thread
From: Thierry Treyer @ 2025-05-23 18:35 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alan Maguire, dwarves@vger.kernel.org, bpf@vger.kernel.org,
	acme@kernel.org, ast@kernel.org, Yonghong Song, andrii@kernel.org,
	ihor.solodrai@linux.dev, Song Liu, Mykola Lysenko, Daniel Xu

> I feel u16 offset isn't really viable. Sooner or later we'd need to bump it,
> and then we will have a mix of u32 and u16 offsets.

We’re already using 22% of the addressable capacity of a u16.
It won’t get any better when introducing more locations.

> The main question I have is why funcsec size is bigger for (1) ?
> struct fn_info { u32 type_id, offset, params_offset; };
> 
> this is fixed size record and the number of them should be the same
> as in (2) and (3), so single u32 params_offset should be smaller
> than u16[arg_cnt], assuming that on average arg_cnt >= 2.
> 
> Or you meant that average arg_cnt <= 1,
> but then the math is suspicious, since struct fn_info should
> be 4-byte aligned as everything in BTF.

Average arg_cnt is 1.277, which is why (2) with u16 is smaller.
I did not know about the required 4-byte alignment of the BTF!

> Also for (3), if locs are inlined, why "Locations Size" is not zero ?
> Or the math for (3) is actually:
> struct fn_info { u32 type_id, offset } * num_of_funcs ?

Sorry for the confusion, I kept the numbers separated, but the locations are
indeed inside the funcsec table.



Updated table with (2-u16) and (3) aligned to 4-bytes, and added (2-u32):

  Params encoding                 Locations Size   Funcsec Size        Total
  ==========================================================================
  (1) param list, no dedup             1,017,654      5,467,824    6,485,478
  (1) param list, w/ dedup               187,379      5,467,824    5,655,203
  (2) param offsets u16, w/ dedup         14,526      5,324,888    5,339,414
  (2) param offsets u32, w/ dedup         14,526      5,972,460    5,986,986
  (3) param list inline, w/ align              0      5,172,268    5,172,268

(3)’s advantage is now much smaller.
(2)’s size also increased, either when using u32 for the offsets,
      or using u16, but keeping in mind the BTF 4-bytes alignment.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
  2025-05-22 20:16       ` Eduard Zingerman
@ 2025-05-23 18:57         ` Thierry Treyer
  2025-05-26 14:30           ` Alan Maguire
  0 siblings, 1 reply; 21+ messages in thread
From: Thierry Treyer @ 2025-05-23 18:57 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, Alan Maguire, dwarves@vger.kernel.org,
	bpf@vger.kernel.org, acme@kernel.org, ast@kernel.org,
	Yonghong Song, andrii@kernel.org, ihor.solodrai@linux.dev,
	Song Liu, Mykola Lysenko, Daniel Xu

>>>  2) // param_offsets point to each parameters' location
>>>     struct fn_info { u32 type_id, offset; u16 param_offsets[proto.arglen]; };
>>>  [...]
>>>  (2) param offsets, w/ dedup         14,526      4,808,838    4,823,364
>> 
>> This one is almost as good as (3) below, but fits better into the
>> existing kind+vlen model where there is a variable number of fixed
>> sized elements (but locations can still be variable-sized and keep
>> evolving much more easily). I'd go with this one, unless I'm missing
>> some important benefit of other representations.
> 
> Thierry, could you please provide some details for the representation
> of both fn_info and parameters for this case?

The locations are stored in their own sub-section, like strings, using the
encoding described previously. A location is a tagged union of an operation
and its operands describing how to find to parameter’s value.

The locations for nil, ’%rdi’ and ’*(%rdi + 32)’ are encoded as follow:

  [0x00] [0x09 0x05] [0x0a 0x05 0x00000020]
#  `NIL   `REG   #5   |    `Reg#5        `Offset added to Reg’s value
#                     `ADDR_REG_OFF

The funcsec table starts with a `struct btf_type` of type FUNCSEC, followed by
vlen `struct btf_func_secinfo` (referred previously as fn_info):

  .align(4)
  struct btf_func_secinfo {
    __u32 type_id;                       // Type ID of FUNC
    __u32 offset;                        // Offset in section
    __u16 parameter_offsets[proto.vlen]; // Offsets to params’ location
  };

To know how many parameters a function has, you’d use its type_id to retrieve
its FUNC, then its FUNC_PROTO to finally get the FUNC_PROTO vlen.
Optimized out parameters won’t have a location, so we need a NIL to skip them.


Given a function with arg0 optimized out, arg1 at *(%rdi + 32) and arg2 in %rdi.
You’d get the following encoding:

  [1] FUNC_PROTO, vlen=3
      ...args
  [2] FUNC 'foo' type_id=1
  [3] FUNCSEC '.text', vlen=1           # ,NIL   ,*(%rdi + 32)
      - type_id=n, offset=0x1234, params=[0x0, 0x3, 0x1]
                                        #             `%rdi

# Regular BTF encoding for 1 and 2
  ...
# ,FUNCSEC ’.text’, vlen=1
  [0x000001 0x14000001 0x00000000]
# ,btf_func_secinfo      ,params=[0x0, 0x3, 0x1] + extra nil for alignment
  [0x00000002 0x00001234 0x0000 0x0003 0x0001 0x0000]

Note: I didn’t take into account the 4-bytes padding requirement of BTF.
      I’ve sent the correct numbers when responding to Alexei.

> I'm curious how far this version is from exhausting u16 limit.


We’re already using 22% of the 64 kiB addressable by u16.

> Why abuse DATASEC if we are extending BTF with new types anyways? I'd
> go with a dedicated FUNCSEC (or FUNCSET, maybe?..)

I'm not sure that a 'set' describes the table best, since a function
can have multiple entries in the table.
FUNCSEC is ugly, but it conveys that the offsets are from a section’s base.

Have a nice weekend,
Thierry

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
  2025-05-23 18:57         ` Thierry Treyer
@ 2025-05-26 14:30           ` Alan Maguire
  2025-05-27 21:41             ` Andrii Nakryiko
  0 siblings, 1 reply; 21+ messages in thread
From: Alan Maguire @ 2025-05-26 14:30 UTC (permalink / raw)
  To: Thierry Treyer, Eduard Zingerman
  Cc: Andrii Nakryiko, dwarves@vger.kernel.org, bpf@vger.kernel.org,
	acme@kernel.org, ast@kernel.org, Yonghong Song, andrii@kernel.org,
	ihor.solodrai@linux.dev, Song Liu, Mykola Lysenko, Daniel Xu

On 23/05/2025 19:57, Thierry Treyer wrote:
>>>>  2) // param_offsets point to each parameters' location
>>>>     struct fn_info { u32 type_id, offset; u16 param_offsets[proto.arglen]; };
>>>>  [...]
>>>>  (2) param offsets, w/ dedup         14,526      4,808,838    4,823,364
>>>
>>> This one is almost as good as (3) below, but fits better into the
>>> existing kind+vlen model where there is a variable number of fixed
>>> sized elements (but locations can still be variable-sized and keep
>>> evolving much more easily). I'd go with this one, unless I'm missing
>>> some important benefit of other representations.
>>
>> Thierry, could you please provide some details for the representation
>> of both fn_info and parameters for this case?
> 
> The locations are stored in their own sub-section, like strings, using the
> encoding described previously. A location is a tagged union of an operation
> and its operands describing how to find to parameter’s value.
> 
> The locations for nil, ’%rdi’ and ’*(%rdi + 32)’ are encoded as follow:
> 
>   [0x00] [0x09 0x05] [0x0a 0x05 0x00000020]
> #  `NIL   `REG   #5   |    `Reg#5        `Offset added to Reg’s value
> #                     `ADDR_REG_OFF
> 
> The funcsec table starts with a `struct btf_type` of type FUNCSEC, followed by
> vlen `struct btf_func_secinfo` (referred previously as fn_info):
> 
>   .align(4)
>   struct btf_func_secinfo {
>     __u32 type_id;                       // Type ID of FUNC
>     __u32 offset;                        // Offset in section
>     __u16 parameter_offsets[proto.vlen]; // Offsets to params’ location
>   };
> 
> To know how many parameters a function has, you’d use its type_id to retrieve
> its FUNC, then its FUNC_PROTO to finally get the FUNC_PROTO vlen.
> Optimized out parameters won’t have a location, so we need a NIL to skip them.
> 
> 
> Given a function with arg0 optimized out, arg1 at *(%rdi + 32) and arg2 in %rdi.
> You’d get the following encoding:
> 
>   [1] FUNC_PROTO, vlen=3
>       ...args
>   [2] FUNC 'foo' type_id=1
>   [3] FUNCSEC '.text', vlen=1           # ,NIL   ,*(%rdi + 32)
>       - type_id=n, offset=0x1234, params=[0x0, 0x3, 0x1]
>                                         #             `%rdi
> 
> # Regular BTF encoding for 1 and 2
>   ...
> # ,FUNCSEC ’.text’, vlen=1
>   [0x000001 0x14000001 0x00000000]
> # ,btf_func_secinfo      ,params=[0x0, 0x3, 0x1] + extra nil for alignment
>   [0x00000002 0x00001234 0x0000 0x0003 0x0001 0x0000]
> 
> Note: I didn’t take into account the 4-bytes padding requirement of BTF.
>       I’ve sent the correct numbers when responding to Alexei.
> 
>> I'm curious how far this version is from exhausting u16 limit.
> 
> 
> We’re already using 22% of the 64 kiB addressable by u16.
> 
>> Why abuse DATASEC if we are extending BTF with new types anyways? I'd
>> go with a dedicated FUNCSEC (or FUNCSET, maybe?..)
> 
> I'm not sure that a 'set' describes the table best, since a function
> can have multiple entries in the table.
> FUNCSEC is ugly, but it conveys that the offsets are from a section’s base.


I totally agree that we have more freedom to define new representations
here, so don't feel too constrained by existing representations like
DATASEC if they are not helpful.

One thing I hadn't really thought about before you suggested it is
having the locations in a separate section from types as we have for
strings. Do we need that? Or could we have a BTF_KIND_LOC_SEC that is
associated with the FUNC_SEC via a type id (loc sec points at the type
of the associated func sec) and contains the packed location info?

In other words

[3] FUNCSEC '.text', vlen= ...
<func_id, offset, param_location_offsets[]>
...
[4] LOCSEC '.text', type_id=3
<packed locations>
...


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
  2025-05-26 14:30           ` Alan Maguire
@ 2025-05-27 21:41             ` Andrii Nakryiko
  2025-05-28 10:14               ` Alan Maguire
  0 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2025-05-27 21:41 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Thierry Treyer, Eduard Zingerman, dwarves@vger.kernel.org,
	bpf@vger.kernel.org, acme@kernel.org, ast@kernel.org,
	Yonghong Song, andrii@kernel.org, ihor.solodrai@linux.dev,
	Song Liu, Mykola Lysenko, Daniel Xu

On Mon, May 26, 2025 at 7:30 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> On 23/05/2025 19:57, Thierry Treyer wrote:
> >>>>  2) // param_offsets point to each parameters' location
> >>>>     struct fn_info { u32 type_id, offset; u16 param_offsets[proto.arglen]; };
> >>>>  [...]
> >>>>  (2) param offsets, w/ dedup         14,526      4,808,838    4,823,364
> >>>
> >>> This one is almost as good as (3) below, but fits better into the
> >>> existing kind+vlen model where there is a variable number of fixed
> >>> sized elements (but locations can still be variable-sized and keep
> >>> evolving much more easily). I'd go with this one, unless I'm missing
> >>> some important benefit of other representations.
> >>
> >> Thierry, could you please provide some details for the representation
> >> of both fn_info and parameters for this case?
> >
> > The locations are stored in their own sub-section, like strings, using the
> > encoding described previously. A location is a tagged union of an operation
> > and its operands describing how to find to parameter’s value.
> >
> > The locations for nil, ’%rdi’ and ’*(%rdi + 32)’ are encoded as follow:
> >
> >   [0x00] [0x09 0x05] [0x0a 0x05 0x00000020]
> > #  `NIL   `REG   #5   |    `Reg#5        `Offset added to Reg’s value
> > #                     `ADDR_REG_OFF
> >
> > The funcsec table starts with a `struct btf_type` of type FUNCSEC, followed by
> > vlen `struct btf_func_secinfo` (referred previously as fn_info):
> >
> >   .align(4)
> >   struct btf_func_secinfo {
> >     __u32 type_id;                       // Type ID of FUNC
> >     __u32 offset;                        // Offset in section
> >     __u16 parameter_offsets[proto.vlen]; // Offsets to params’ location
> >   };
> >
> > To know how many parameters a function has, you’d use its type_id to retrieve
> > its FUNC, then its FUNC_PROTO to finally get the FUNC_PROTO vlen.
> > Optimized out parameters won’t have a location, so we need a NIL to skip them.
> >
> >
> > Given a function with arg0 optimized out, arg1 at *(%rdi + 32) and arg2 in %rdi.
> > You’d get the following encoding:
> >
> >   [1] FUNC_PROTO, vlen=3
> >       ...args
> >   [2] FUNC 'foo' type_id=1
> >   [3] FUNCSEC '.text', vlen=1           # ,NIL   ,*(%rdi + 32)
> >       - type_id=n, offset=0x1234, params=[0x0, 0x3, 0x1]
> >                                         #             `%rdi
> >
> > # Regular BTF encoding for 1 and 2
> >   ...
> > # ,FUNCSEC ’.text’, vlen=1
> >   [0x000001 0x14000001 0x00000000]
> > # ,btf_func_secinfo      ,params=[0x0, 0x3, 0x1] + extra nil for alignment
> >   [0x00000002 0x00001234 0x0000 0x0003 0x0001 0x0000]
> >
> > Note: I didn’t take into account the 4-bytes padding requirement of BTF.
> >       I’ve sent the correct numbers when responding to Alexei.
> >
> >> I'm curious how far this version is from exhausting u16 limit.
> >
> >
> > We’re already using 22% of the 64 kiB addressable by u16.
> >
> >> Why abuse DATASEC if we are extending BTF with new types anyways? I'd
> >> go with a dedicated FUNCSEC (or FUNCSET, maybe?..)
> >
> > I'm not sure that a 'set' describes the table best, since a function
> > can have multiple entries in the table.
> > FUNCSEC is ugly, but it conveys that the offsets are from a section’s base.
>
>
> I totally agree that we have more freedom to define new representations
> here, so don't feel too constrained by existing representations like
> DATASEC if they are not helpful.
>
> One thing I hadn't really thought about before you suggested it is
> having the locations in a separate section from types as we have for
> strings. Do we need that? Or could we have a BTF_KIND_LOC_SEC that is
> associated with the FUNC_SEC via a type id (loc sec points at the type
> of the associated func sec) and contains the packed location info?
>
> In other words
>
> [3] FUNCSEC '.text', vlen= ...
> <func_id, offset, param_location_offsets[]>
> ...
> [4] LOCSEC '.text', type_id=3
> <packed locations>

LOCSEC pointing to FUNCSEC isn't that useful, no? You'd want to go
from FUNCSEC to LOCSEC quickly, not the other way around, no? But I
also don't see the need to have a per-ELF-section set of locations,
tbh... One set ought to be enough across all FUNCSECs?

> ...
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline
  2025-05-27 21:41             ` Andrii Nakryiko
@ 2025-05-28 10:14               ` Alan Maguire
  0 siblings, 0 replies; 21+ messages in thread
From: Alan Maguire @ 2025-05-28 10:14 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Thierry Treyer, Eduard Zingerman, dwarves@vger.kernel.org,
	bpf@vger.kernel.org, acme@kernel.org, ast@kernel.org,
	Yonghong Song, andrii@kernel.org, ihor.solodrai@linux.dev,
	Song Liu, Mykola Lysenko, Daniel Xu

On 27/05/2025 22:41, Andrii Nakryiko wrote:
> On Mon, May 26, 2025 at 7:30 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>
>> On 23/05/2025 19:57, Thierry Treyer wrote:
>>>>>>  2) // param_offsets point to each parameters' location
>>>>>>     struct fn_info { u32 type_id, offset; u16 param_offsets[proto.arglen]; };
>>>>>>  [...]
>>>>>>  (2) param offsets, w/ dedup         14,526      4,808,838    4,823,364
>>>>>
>>>>> This one is almost as good as (3) below, but fits better into the
>>>>> existing kind+vlen model where there is a variable number of fixed
>>>>> sized elements (but locations can still be variable-sized and keep
>>>>> evolving much more easily). I'd go with this one, unless I'm missing
>>>>> some important benefit of other representations.
>>>>
>>>> Thierry, could you please provide some details for the representation
>>>> of both fn_info and parameters for this case?
>>>
>>> The locations are stored in their own sub-section, like strings, using the
>>> encoding described previously. A location is a tagged union of an operation
>>> and its operands describing how to find to parameter’s value.
>>>
>>> The locations for nil, ’%rdi’ and ’*(%rdi + 32)’ are encoded as follow:
>>>
>>>   [0x00] [0x09 0x05] [0x0a 0x05 0x00000020]
>>> #  `NIL   `REG   #5   |    `Reg#5        `Offset added to Reg’s value
>>> #                     `ADDR_REG_OFF
>>>
>>> The funcsec table starts with a `struct btf_type` of type FUNCSEC, followed by
>>> vlen `struct btf_func_secinfo` (referred previously as fn_info):
>>>
>>>   .align(4)
>>>   struct btf_func_secinfo {
>>>     __u32 type_id;                       // Type ID of FUNC
>>>     __u32 offset;                        // Offset in section
>>>     __u16 parameter_offsets[proto.vlen]; // Offsets to params’ location
>>>   };
>>>
>>> To know how many parameters a function has, you’d use its type_id to retrieve
>>> its FUNC, then its FUNC_PROTO to finally get the FUNC_PROTO vlen.
>>> Optimized out parameters won’t have a location, so we need a NIL to skip them.
>>>
>>>
>>> Given a function with arg0 optimized out, arg1 at *(%rdi + 32) and arg2 in %rdi.
>>> You’d get the following encoding:
>>>
>>>   [1] FUNC_PROTO, vlen=3
>>>       ...args
>>>   [2] FUNC 'foo' type_id=1
>>>   [3] FUNCSEC '.text', vlen=1           # ,NIL   ,*(%rdi + 32)
>>>       - type_id=n, offset=0x1234, params=[0x0, 0x3, 0x1]
>>>                                         #             `%rdi
>>>
>>> # Regular BTF encoding for 1 and 2
>>>   ...
>>> # ,FUNCSEC ’.text’, vlen=1
>>>   [0x000001 0x14000001 0x00000000]
>>> # ,btf_func_secinfo      ,params=[0x0, 0x3, 0x1] + extra nil for alignment
>>>   [0x00000002 0x00001234 0x0000 0x0003 0x0001 0x0000]
>>>
>>> Note: I didn’t take into account the 4-bytes padding requirement of BTF.
>>>       I’ve sent the correct numbers when responding to Alexei.
>>>
>>>> I'm curious how far this version is from exhausting u16 limit.
>>>
>>>
>>> We’re already using 22% of the 64 kiB addressable by u16.
>>>
>>>> Why abuse DATASEC if we are extending BTF with new types anyways? I'd
>>>> go with a dedicated FUNCSEC (or FUNCSET, maybe?..)
>>>
>>> I'm not sure that a 'set' describes the table best, since a function
>>> can have multiple entries in the table.
>>> FUNCSEC is ugly, but it conveys that the offsets are from a section’s base.
>>
>>
>> I totally agree that we have more freedom to define new representations
>> here, so don't feel too constrained by existing representations like
>> DATASEC if they are not helpful.
>>
>> One thing I hadn't really thought about before you suggested it is
>> having the locations in a separate section from types as we have for
>> strings. Do we need that? Or could we have a BTF_KIND_LOC_SEC that is
>> associated with the FUNC_SEC via a type id (loc sec points at the type
>> of the associated func sec) and contains the packed location info?
>>
>> In other words
>>
>> [3] FUNCSEC '.text', vlen= ...
>> <func_id, offset, param_location_offsets[]>
>> ...
>> [4] LOCSEC '.text', type_id=3
>> <packed locations>
> 
> LOCSEC pointing to FUNCSEC isn't that useful, no? You'd want to go
> from FUNCSEC to LOCSEC quickly, not the other way around, no? But I
> also don't see the need to have a per-ELF-section set of locations,
> tbh... One set ought to be enough across all FUNCSECs?
>

I can't think of scenario where we'd need more than one (aside from
below) and as you say since FUNCSECs refer to LOCSEC offsets, if we did
want to relate them it'd make more sense to have FUNCSEC point at the
LOCSEC it uses.

BTW thanks for the reminder on the kind layout stuff; I cleaned it up
and sent a v5 [1] along with pahole support [2]. I guess it's a bit
hypocritical that I'm arguing against adding an additional section here
while trying to add one for kind layouts, but the kind layout stuff does
raise how a BTF type might represent a packed set of locations.
Traditionally we've had a singular element or a vlen-specified number of
elements (or both), so for LOCSEC we have the problem that vlen is 16
bits. We could I suppose have the vlen-associated size be 4 bytes which
would mean we'd support up to 262144 bytes of location information.
However it is a bit of an abuse of the vlen * size model since each
location isn't necessarily contained in the 4 bytes. I guess that in
fact might be one possible justification for multiple LOCSECs; hitting
size limitations. Or indeed a justification for using a separate
location section since it doesn't fit well into the existing btf_type model.

Anyway these are all things we should weigh up when deciding whether to
use a separate location section versus a LOCSEC kind. Thanks!

Alan


[1]
https://lore.kernel.org/bpf/20250528095743.791722-1-alan.maguire@oracle.com/
[2]
https://lore.kernel.org/dwarves/20250528095349.788793-1-alan.maguire@oracle.com/


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2025-05-28 10:14 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-16 19:20 [PATCH RFC 0/3] list inline expansions in .BTF.inline Thierry Treyer via B4 Relay
2025-04-16 19:20 ` [PATCH RFC 1/3] dwarf_loader: Add parameters list to inlined expansion Thierry Treyer via B4 Relay
2025-04-16 19:20 ` [PATCH RFC 2/3] dwarf_loader: Add name to inline expansion Thierry Treyer via B4 Relay
2025-04-16 19:20 ` [PATCH RFC 3/3] inline_encoder: Introduce inline encoder to emit BTF.inline Thierry Treyer via B4 Relay
2025-04-25 18:40 ` [PATCH RFC 0/3] list inline expansions in .BTF.inline Daniel Xu
2025-04-28 20:51   ` Alexei Starovoitov
2025-04-29 19:14     ` Thierry Treyer
2025-04-29 23:58       ` Alexei Starovoitov
2025-04-30 15:25 ` Alan Maguire
2025-05-01 19:38   ` Thierry Treyer
2025-05-02  8:31     ` Alan Maguire
2025-05-19 12:02 ` Alan Maguire
2025-05-22 17:56   ` Thierry Treyer
2025-05-22 20:03     ` Andrii Nakryiko
2025-05-22 20:16       ` Eduard Zingerman
2025-05-23 18:57         ` Thierry Treyer
2025-05-26 14:30           ` Alan Maguire
2025-05-27 21:41             ` Andrii Nakryiko
2025-05-28 10:14               ` Alan Maguire
2025-05-23 17:33     ` Alexei Starovoitov
2025-05-23 18:35       ` Thierry Treyer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).