public inbox for dwarves@vger.kernel.org
 help / color / mirror / Atom feed
From: "Alexis Lothoré" <alexis.lothore@bootlin.com>
To: dwarves@vger.kernel.org
Cc: bpf@vger.kernel.org, "Alan Maguire" <alan.maguire@oracle.com>,
	"Arnaldo Carvalho de Melo" <acme@kernel.org>,
	"Alexei Starovoitov" <ast@fb.com>,
	"Thomas Petazzoni" <thomas.petazzoni@bootlin.com>,
	"Bastien Curutchet" <bastien.curutchet@bootlin.com>,
	"Alexis Lothoré" <alexis.lothore@bootlin.com>
Subject: [PATCH RFC] btf_encoder: skip functions consuming structs passed by value on stack
Date: Wed, 18 Jun 2025 17:02:14 +0200	[thread overview]
Message-ID: <20250618-btf_skip_structs_on_stack-v1-1-e70be639cc53@bootlin.com> (raw)

Most ABIs allow functions to receive structs passed by value, if they
fit in a register or a pair of registers, depending on the exact ABI.
However, when there is a struct passed by value but all registers are
already used for parameters passing, the struct is still passed by value
but on the stack, leading to possible mistakes about its exact location:
if the struct definition has some attributes like
__attribute__((packed)) or __attribute__((aligned(X)), its location on
the stack is altered, but this change is not reflected in dwarf
information. The corresponding BTF data generated from this can lead to
incorrect BPF trampolines generation (eg to attach bpf tracing programs
to kernel functions) in the Linux kernel.

It may not desirable to try to encode this kind of detail in BTF:
- BTF aims to remain a compact format
- this case currently does not really exist in the Linux kernel
- those attributes are not reliably encoded by compilers in DWARF info

So rather than trying to encode more details about those specific
functions, detect those and prevent the encoder from encoding the
corresponding info in BTF

Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
---
Hello,
this small RFC series is a follow-up to an attempt to prevent some
possible wrong bpf attachment cases in the kernel.
When attaching ebpf tracing programs to kernel functions, the kernel has
to generate a trampoline able to read and save the target function
arguments. There is a specific case in which we can not reliably perform
this task: most supported architectures allow passing a struct by value
if they fit in a register or a pair of register. The issue is when there
is no available register to pass such argument (eg because all registers
have been used by previous arguments): in this case the struct is passed
by value on the stack, but the exact struct location can not be reliably
deduced by the trampoline, as it may be altered by some attributes like
__attribute__((packed)) or __attribute__((aligned(X)). This detail is
not present in the BTF info used to build the trampoline.

There has been multiple attempts and discussions to handle this case.
[1] is a first attempt, trying to properly support this case, by
deducing more info about the struct passed on stack, but the comments
quickly highlighted some issues. A second attempt ([2]) handled it the
other way around, by simply preventing the trampoline creation if such
variable is consumed by the target function. However, discussions around
this series showed that despite being handled for the specific
architecture receiving this series (ARM64), other architectures still
allow to generate wrong trampolines, as they did not have the
corresponding check. So it has been followed by [3], aiming to enforce
the same constraint for all affected architectures. Alexei eventually
suggested that it may not be the correct direction, and that a solution
could be to just make sure that those specific functions are not
described in BTF info. This series then brings a RFC implementing this
option in pahole. 

This has been tested on both a vmlinux file and bpf_testmod.ko on x86
- in bpf_testmod.ko: bpf_testmod_test_struct_arg_9 is not encoded
  anymore, as it passes a struct bpf_testmod_struct_arg_5 on the stack
- on kernel side:
  - __vxlan_fdb_delete is now encoded in BTF info (I am not sure why
    yet, and I still fail to understand how it relates to this series)
  - vma_modify_flags_uffd is not encoded anymore, as it passes a
    struct vm_userfaultfd_ctx on the stack

A few points remain to be handled/answered, depending on the feedback on
this approach:
- the series currently bring no new test to ensure that those specific
  functions are not encoded
- AFAIU Arnaldo's update on pahole @ LSFMM suggests that  pahole will
  eventually be replaced by compilers as the _main_ tool to  generate
  BTF ([4]), so if the fix is relevant in pahole, it may become relevant
  as well in compilers generating btf data ?
- should this change be locked behind a pahole commandline parameter ?
- sort this __vxlan_fdb_delete issue out

[1] https://lore.kernel.org/bpf/20250411-many_args_arm64-v1-0-0a32fe72339e@bootlin.com/
[2] https://lore.kernel.org/bpf/20250527-many_args_arm64-v3-0-3faf7bb8e4a2@bootlin.com/
[3] https://lore.kernel.org/bpf/20250613-deny_trampoline_structs_on_stack-v1-0-5be9211768c3@bootlin.com/
[4] http://oldvger.kernel.org/~acme/prez/lsfmm-bpf-2025/#/32
---
 btf_encoder.c  | 14 ++++++++++----
 dwarf_loader.c | 17 ++++++++++++++++-
 dwarves.h      |  2 ++
 3 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index 0bc23349b5d740c3ddab8208b2e15cdbdd139b9d..fc2f114695e9aad790ec4074f37cf6ca51adfeec 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -87,6 +87,7 @@ struct btf_encoder_func_state {
 	uint8_t optimized_parms:1;
 	uint8_t unexpected_reg:1;
 	uint8_t inconsistent_proto:1;
+	uint8_t uncertain_loc:1;
 	int ret_type_id;
 	struct btf_encoder_func_parm *parms;
 	struct btf_encoder_func_annot *annots;
@@ -1203,6 +1204,7 @@ static int32_t btf_encoder__save_func(struct btf_encoder *encoder, struct functi
 	state->inconsistent_proto = ftype->inconsistent_proto;
 	state->unexpected_reg = ftype->unexpected_reg;
 	state->optimized_parms = ftype->optimized_parms;
+	state->uncertain_loc = ftype->uncertain_loc;
 	ftype__for_each_parameter(ftype, param) {
 		const char *name = parameter__name(param) ?: "";
 
@@ -1365,7 +1367,7 @@ static int saved_functions_cmp(const void *_a, const void *_b)
 
 static int saved_functions_combine(struct btf_encoder_func_state *a, struct btf_encoder_func_state *b)
 {
-	uint8_t optimized, unexpected, inconsistent;
+	uint8_t optimized, unexpected, inconsistent, uncertain_loc;
 	int ret;
 
 	ret = strncmp(a->elf->name, b->elf->name,
@@ -1375,11 +1377,13 @@ static int saved_functions_combine(struct btf_encoder_func_state *a, struct btf_
 	optimized = a->optimized_parms | b->optimized_parms;
 	unexpected = a->unexpected_reg | b->unexpected_reg;
 	inconsistent = a->inconsistent_proto | b->inconsistent_proto;
-	if (!unexpected && !inconsistent && !funcs__match(a, b))
+	uncertain_loc = a->uncertain_loc | b->uncertain_loc;
+	if (!unexpected && !inconsistent && !uncertain_loc && !funcs__match(a, b))
 		inconsistent = 1;
 	a->optimized_parms = b->optimized_parms = optimized;
 	a->unexpected_reg = b->unexpected_reg = unexpected;
 	a->inconsistent_proto = b->inconsistent_proto = inconsistent;
+	a->uncertain_loc = b->uncertain_loc = uncertain_loc;
 
 	return 0;
 }
@@ -1432,9 +1436,11 @@ static int btf_encoder__add_saved_funcs(struct btf_encoder *encoder, bool skip_e
 		 * just do not _use_ them.  Only exclude functions with
 		 * unexpected register use or multiple inconsistent prototypes.
 		 */
-		add_to_btf |= !state->unexpected_reg && !state->inconsistent_proto;
+                add_to_btf |= !state->unexpected_reg &&
+                              !state->inconsistent_proto &&
+                              !state->uncertain_loc;
 
-		if (add_to_btf) {
+                if (add_to_btf) {
 			err = btf_encoder__add_func(state->encoder, state);
 			if (err < 0)
 				goto out;
diff --git a/dwarf_loader.c b/dwarf_loader.c
index abf17175d830caa63834464f959e6376223eb19c..ef00cdf308166732fc0938f2eecd56d314d3354f 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -1258,8 +1258,20 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 		tag__init(&parm->tag, cu, die);
 		parm->name = attr_string(die, DW_AT_name, conf);
 
-		if (param_idx >= cu->nr_register_params || param_idx < 0)
+		if(param_idx < 0)
 			return parm;
+
+		if (param_idx >= cu->nr_register_params) {
+			if(dwarf_attr(die, DW_AT_type, &attr)){
+				Dwarf_Die type_die;
+				if (dwarf_formref_die(&attr, &type_die) &&
+						dwarf_tag(&type_die) == DW_TAG_structure_type) {
+					parm->uncertain_loc = 1;
+				}
+			}
+			return parm;
+		}
+
 		/* Parameters which use DW_AT_abstract_origin to point at
 		 * the original parameter definition (with no name in the DIE)
 		 * are the result of later DWARF generation during compilation
@@ -2953,6 +2965,9 @@ static int cu__resolve_func_ret_types_optimized(struct cu *cu)
 
 			if (pos->unexpected_reg)
 				has_unexpected_reg = true;
+
+			if (pos->uncertain_loc)
+				fn->proto.uncertain_loc = 1;
 		}
 		if (has_unexpected_reg) {
 			ftype__for_each_parameter(&fn->proto, pos) {
diff --git a/dwarves.h b/dwarves.h
index 36c689847ebf29a1ab9936f9d0f928dd46514547..883852f4b60a36d73bce4568cbacd8ef3cff97ea 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -943,6 +943,7 @@ struct parameter {
 	uint8_t optimized:1;
 	uint8_t unexpected_reg:1;
 	uint8_t has_loc:1;
+	uint8_t uncertain_loc:1;
 };
 
 static inline struct parameter *tag__parameter(const struct tag *tag)
@@ -1021,6 +1022,7 @@ struct ftype {
 	uint8_t		 unexpected_reg:1;
 	uint8_t		 processed:1;
 	uint8_t		 inconsistent_proto:1;
+	uint8_t		 uncertain_loc:1;
 	struct list_head template_type_params;
 	struct list_head template_value_params;
 	struct template_parameter_pack *template_parameter_pack;

---
base-commit: 40c2a7b9277c774a47b27f48ede7d1824587da50
change-id: 20250617-btf_skip_structs_on_stack-006adf457d50

Best regards,
-- 
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


             reply	other threads:[~2025-06-18 15:02 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-18 15:02 Alexis Lothoré [this message]
2025-06-18 16:28 ` [PATCH RFC] btf_encoder: skip functions consuming structs passed by value on stack Alexei Starovoitov
2025-06-19 13:12   ` Alexis Lothoré
2025-06-19 15:41     ` Alan Maguire
2025-06-19 16:06       ` Alexis Lothoré
2025-06-19 16:49     ` Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250618-btf_skip_structs_on_stack-v1-1-e70be639cc53@bootlin.com \
    --to=alexis.lothore@bootlin.com \
    --cc=acme@kernel.org \
    --cc=alan.maguire@oracle.com \
    --cc=ast@fb.com \
    --cc=bastien.curutchet@bootlin.com \
    --cc=bpf@vger.kernel.org \
    --cc=dwarves@vger.kernel.org \
    --cc=thomas.petazzoni@bootlin.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox