* [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF
@ 2026-05-23 16:57 Yonghong Song
  2026-05-23 16:57 ` [PATCH dwarves v5 01/11] dwarf_loader: Reduce parameter checking with clang DW_AT_calling_convention attr Yonghong Song
                   ` (11 more replies)
  0 siblings, 12 replies; 16+ messages in thread
From: Yonghong Song @ 2026-05-23 16:57 UTC (permalink / raw)
  To: Alan Maguire, Arnaldo Carvalho de Melo, dwarves
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, kernel-team

Current vmlinux BTF encoding is based on source-level signatures.
But the compiler may optimize a function and change its signature.
If users rely on the source-level signature, their initial implementation
may produce wrong results, and they then need to find out what the
problem is and work around it, e.g. through kprobe, since kprobe does not
need vmlinux BTF.

The majority of changed signatures are due to dead argument elimination.
The following is a more complex case. The original source signature:
  typedef struct {
        union {
                void            *kernel;
                void __user     *user;
        };
        bool            is_kernel : 1;
  } sockptr_t;
  typedef sockptr_t bpfptr_t;
  static int map_create(union bpf_attr *attr, bpfptr_t uattr) { ... }
After compiler optimization, the signature becomes:
  static int map_create(union bpf_attr *attr, bool uattr__is_kernel) { ... }
In the above, uattr__is_kernel corresponds to the 'is_kernel' field in sockptr_t.
This makes it easier for developers to understand what changed.
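
As a minimal sketch of the naming convention above (not the actual pahole
code), the optimized parameter name can be composed from the original
parameter name and the surviving field name. The helper name
true_sig_param_name() is hypothetical and used only for illustration:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of pahole): build "<param>__<field>",
 * e.g. "uattr" + "is_kernel" -> "uattr__is_kernel".
 * Returns 0 on success, -1 if 'buf' is too small or formatting failed.
 */
static int true_sig_param_name(char *buf, size_t len,
			       const char *param, const char *field)
{
	int n = snprintf(buf, len, "%s__%s", param, field);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Calling it with "uattr" and "is_kernel" and a large enough buffer yields
the name used in the optimized map_create() signature.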

The new signature must follow the ABI specification, as judged from the
parameter locations. Otherwise, that signature should be discarded. For example,

    0x0242f1f7:   DW_TAG_subprogram
                    DW_AT_name      ("memblock_find_in_range")
                    DW_AT_calling_convention        (DW_CC_nocall)
                    DW_AT_type      (0x0242decc "phys_addr_t")
                    ...
    0x0242f22e:     DW_TAG_formal_parameter
                      DW_AT_location        (indexed (0x14a) loclist = 0x005595bc:
                         [0xffffffff87a000f9, 0xffffffff87a00178): DW_OP_reg5 RDI
                         [0xffffffff87a00178, 0xffffffff87a001be): DW_OP_reg14 R14
                         [0xffffffff87a001be, 0xffffffff87a001c7): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
                         [0xffffffff87a001c7, 0xffffffff87a00214): DW_OP_reg14 R14)
                      DW_AT_name    ("start")
                      DW_AT_type    (0x0242decc "phys_addr_t")
                      ...
    0x0242f239:     DW_TAG_formal_parameter
                      DW_AT_location        (indexed (0x14b) loclist = 0x005595e6:
                         [0xffffffff87a000f9, 0xffffffff87a00175): DW_OP_reg4 RSI
                         [0xffffffff87a00175, 0xffffffff87a001b8): DW_OP_reg3 RBX
                         [0xffffffff87a001b8, 0xffffffff87a001c7): DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value
                         [0xffffffff87a001c7, 0xffffffff87a00214): DW_OP_reg3 RBX)
                      DW_AT_name    ("end")
                      DW_AT_type    (0x0242decc "phys_addr_t")
                      ...
    0x0242f245:     DW_TAG_formal_parameter
                      DW_AT_location        (indexed (0x14c) loclist = 0x00559610:
                         [0xffffffff87a001e3, 0xffffffff87a001ef): DW_OP_breg4 RSI+0)
                      DW_AT_name    ("size")
                      DW_AT_type    (0x0242decc "phys_addr_t")
                      ...
    0x0242f250:     DW_TAG_formal_parameter
                      DW_AT_const_value     (4096)
                      DW_AT_name    ("align")
                      DW_AT_type    (0x0242decc "phys_addr_t")
                      ...

The third argument should correspond to RDX on x86_64. But the location says that
the parameter value is stored at the address 'RSI + 0', so it is not clear whether
the parameter value is also in RDX. We therefore have to discard this function from
vmlinux BTF to avoid an incorrect true signature.
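
The check described above can be sketched roughly as follows. This is an
illustration only, not the pahole implementation: the function name
loc_matches_arg_reg() and the table are assumptions. The opcode values
follow the DWARF standard (DW_OP_reg0 = 0x50, DW_OP_breg0 = 0x70) and the
register numbers follow the x86_64 DWARF register mapping:

```c
#include <stdbool.h>
#include <stddef.h>

/* DWARF location opcodes (values from the DWARF standard). */
#define OP_REG0		0x50	/* DW_OP_reg0  */
#define OP_REG31	0x6f	/* DW_OP_reg31 */
#define OP_BREG0	0x70	/* DW_OP_breg0 */

/* DWARF register numbers of the x86_64 integer argument registers,
 * in argument order: RDI, RSI, RDX, RCX, R8, R9.
 */
static const int x86_64_arg_regs[] = { 5, 4, 1, 2, 8, 9 };

/* Hypothetical helper: true if the first location op 'atom' says the
 * argument at position 'arg_idx' lives in its ABI-expected register.
 */
static bool loc_matches_arg_reg(unsigned int atom, int arg_idx)
{
	size_t nr = sizeof(x86_64_arg_regs) / sizeof(x86_64_arg_regs[0]);

	if (arg_idx < 0 || (size_t)arg_idx >= nr)
		return false;
	/* DW_OP_bregN and others describe memory, not the register itself. */
	if (atom < OP_REG0 || atom > OP_REG31)
		return false;

	return (int)(atom - OP_REG0) == x86_64_arg_regs[arg_idx];
}
```

With this model, 'start' (DW_OP_reg5) and 'end' (DW_OP_reg4) match
positions 0 and 1, while 'size' (DW_OP_breg4) does not match position 2.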

For llvm, any function having
  DW_AT_calling_convention        (DW_CC_nocall)
in its dwarf DW_TAG_subprogram has a changed signature.
I experimented with the latest bpf-next. For x86_64, there are 69103 kernel functions,
of which 875 have a changed signature. This patch series is intended
to ensure true signatures are properly represented. In the end, only 18 functions
cannot have true signatures due to their locations.

For arm64, there are 863 kernel functions with a changed signature, and
70 functions cannot have true signatures due to their locations. I checked those
functions, and it looks like the llvm arm64 backend is more relaxed about how it
computes parameter values.

For full testing, I enabled true signature support in kernel scripts/Makefile.btf like below:
  -pahole-flags-$(call test-ge, $(pahole-ver), 130) += --btf_features=attributes
  +pahole-flags-$(call test-ge, $(pahole-ver), 130) += --btf_features=attributes --btf_features=+true_signature

In this patch set, Patch 1 introduces the use of DW_AT_calling_convention, which
can precisely identify which functions have a changed signature. This filters out
the majority of functions, whose signatures do not change. Patch 2 adds a prescan
of parameter registers to accommodate cases where an optimization could
happen but didn't. Patches 3 to 9 try to find functions with true signatures.
Patch 10 enables the btf encoder to properly generate BTF.
Patch 11 includes a few tests.

Changelog:
  v4 -> v5:
    - v4: https://lore.kernel.org/bpf/20260326013144.2901265-1-yonghong.song@linux.dev/
    - Check info.signature_changed only under clang.
    - Fix an uninitialized variable issue (var reg_idx) for gcc.
  v3 -> v4:
    - v3: https://lore.kernel.org/bpf/20260320190917.1970524-1-yonghong.song@linux.dev/
    - Add simple prescan of parameter registers in order to get true signatures
      for those functions where optimization could happen but compiler didn't do it.
    - Do not create a new name (e.g. "uattr__is_kernel") with malloc at parameter_reg()
      stage. Instead remember both "uattr" and "is_kernel" and later generate the
      name "uattr__is_kernel" in btf encoder.
    - Add comments to explain how to handle parameters which may take two registers.
    - Fix some test failures on aarch64.
  v2 -> v3:
    - v2: https://lore.kernel.org/bpf/20260309153215.1917033-1-yonghong.song@linux.dev/
    - Change tests by using newly added test_lib.sh.
    - Simplify to get bool variable producer_clang.
    - Try to avoid producer_clang appearance in dwarf_loader.c in order to avoid
      clear separation between clang and gcc.
  v1 -> v2:
    - v1: https://lore.kernel.org/bpf/20260305225455.1151066-1-yonghong.song@linux.dev/
    - Added producer_clang guarding in btf_encoder. Otherwise, gcc kernel build
      will crash pahole.
    - Fix an early return in parameter__reg() which didn't do pthread_mutex_unlock()
      which caused the deadlock for arm64.
    - Add a few more places to guard with producer_clang and conf->true_signature
      to maintain the previous behavior if not clang or conf->true_signature is false.

Yonghong Song (11):
  dwarf_loader: Reduce parameter checking with clang
    DW_AT_calling_convention attr
  dwarf_loader: Prescan all parameters with expected registers
  dwarf_loader: Handle signatures with dead arguments
  dwarf_loader: Refactor initial ret -1 to be macro PARM_DEFAULT_FAIL
  dwarf_loader: Handle locations with DW_OP_fbreg
  dwarf_loader: Change exprlen checking condition in parameter__reg()
  dwarf_loader: Detect optimized parameters with locations having
    constant values
  dwarf_loader: Check whether two-reg parameter actually use two regs or
    not
  dwarf_loader: Handle expression lists
  btf_encoder: Handle optimized parameter properly
  tests: Add a few clang true signature tests

 btf_encoder.c                       |  32 +-
 dwarf_loader.c                      | 552 ++++++++++++++++++++++++++--
 dwarves.h                           |   3 +
 tests/clang_parm_aggregate.sh       |  85 +++++
 tests/clang_parm_optimized.sh       |  63 ++++
 tests/clang_parm_optimized_stack.sh |  63 ++++
 6 files changed, 769 insertions(+), 29 deletions(-)
 create mode 100755 tests/clang_parm_aggregate.sh
 create mode 100755 tests/clang_parm_optimized.sh
 create mode 100755 tests/clang_parm_optimized_stack.sh

-- 
2.53.0-Meta



* [PATCH dwarves v5 01/11] dwarf_loader: Reduce parameter checking with clang DW_AT_calling_convention attr
  2026-05-23 16:57 [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Yonghong Song
@ 2026-05-23 16:57 ` Yonghong Song
  2026-06-11  9:15   ` Alan Maguire
  2026-05-23 16:57 ` [PATCH dwarves v5 02/11] dwarf_loader: Prescan all parameters with expected registers Yonghong Song
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 16+ messages in thread
From: Yonghong Song @ 2026-05-23 16:57 UTC (permalink / raw)
  To: Alan Maguire, Arnaldo Carvalho de Melo, dwarves
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, kernel-team

Currently the parameters of every function are checked to identify whether
the signature changed. If the signature did change, pahole may adjust
the parameters to produce the true signature.

In clang, any function with the following attribute
      DW_AT_calling_convention        (DW_CC_nocall)
has a changed signature.
pahole can take advantage of this to skip parameter checking if
DW_AT_calling_convention is not DW_CC_nocall.
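
The resulting decision can be summarized with the sketch below. It is a
condensed restatement of the checks this patch adds to parameter__new(),
written as a standalone predicate; the function name
should_inspect_location() is hypothetical and does not exist in pahole:

```c
#include <stdbool.h>

/* Hypothetical predicate: should the DW_AT_location of the parameter at
 * 'param_idx' be inspected?
 *  - never for parameters outside the register-passed range;
 *  - for clang, only when the function is marked DW_CC_nocall;
 *  - for other producers, signature_changed is never set, so the
 *    previous behavior (always inspect) is kept.
 */
static bool should_inspect_location(bool producer_clang,
				    bool signature_changed,
				    int param_idx, int nr_register_params)
{
	if (param_idx < 0 || param_idx >= nr_register_params)
		return false;
	if (!signature_changed && producer_clang)
		return false;

	return true;
}
```

For a clang-built kernel this means the location parsing runs only for
functions flagged with DW_CC_nocall.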

But more importantly, DW_CC_nocall identifies signature-changed functions,
so parameters can be checked one after another to create the true
signature. Without it, it takes more effort to identify whether a
function has a changed signature. For example, consider the function
  __bpf_kfunc static void bbr_main(struct sock *sk, u32 ack, int flag,
     const struct rate_sample *rs) { ... }
where bbr_main() is a callback function in
  .cong_control   = bbr_main
in 'struct tcp_congestion_ops tcp_bbr_cong_ops'.
In the above bbr_main(...), parameters 'ack' and 'flag' are not used.
The following are some details:

0x0a713b8d:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x28) loclist = 0x0166d452:
                     [0xffffffff83e77fd9, 0xffffffff83e78016): DW_OP_reg5 RDI
                     ...
                  DW_AT_name    ("sk")
                  DW_AT_type    (0x0a6f5b2b "sock *")
                  ...

0x0a713b98:     DW_TAG_formal_parameter
                  DW_AT_name    ("ack")
                  DW_AT_type    (0x0a6f58fd "u32")
                  ...

0x0a713ba2:     DW_TAG_formal_parameter
                  DW_AT_name    ("flag")
                  DW_AT_type    (0x0a6f57d1 "int")
                  ...

0x0a713bac:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x29) loclist = 0x0166d4a8:
                     [0xffffffff83e77fd9, 0xffffffff83e78016): DW_OP_reg2 RCX
                     ...
                  DW_AT_name    ("rs")
                  DW_AT_type    (0x0a710da5 "const rate_sample *")

Analysis of the above dwarf can conclude that 'ack' and 'flag'
may correspond to RSI and RDX, considering the last one is RCX. Basically this
requires that all parameters be available to collectively decide whether the
true signature can be found. In such cases, DW_CC_nocall makes things
easier, as parameters can be checked one after another.

For a clang-built bpf-next kernel on x86_64, in a non-LTO setup,
the number of kernel functions is 69103 and the number of signature-changed
functions is 875, based on the
      DW_AT_calling_convention        (DW_CC_nocall)
indication.

Among the 875 signature-changed functions, after this patch, 343 functions
can have proper true signatures, mostly due to simple dead argument
elimination. The remaining 532 functions cannot get a
true signature, due to dead or additional-checked parameters.

They will be addressed in the subsequent commits.

In llvm23, I implemented [1], which added DW_CC_nocall for the ArgumentPromotion pass.
This compiler pass can add additional DW_CC_nocall cases for the following
compilation:
      - Flag -O3 or FullLTO
So once llvm23 is available, we may have more DW_CC_nocall cases, hence more
potential true signatures if the kernel is built with -O3 or
with FullLTO (CONFIG_LTO_CLANG_FULL).

  [1] https://github.com/llvm/llvm-project/pull/178973

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 dwarf_loader.c | 86 ++++++++++++++++++++++++++++++++++++++++++--------
 dwarves.h      |  1 +
 2 files changed, 73 insertions(+), 14 deletions(-)

diff --git a/dwarf_loader.c b/dwarf_loader.c
index 16fb7be..0bc4fc4 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -1190,6 +1190,10 @@ static ptrdiff_t __dwarf_getlocations(Dwarf_Attribute *attr,
 	return ret;
 }
 
+struct func_info {
+	bool signature_changed;
+};
+
 /* For DW_AT_location 'attr':
  * - if first location is DW_OP_regXX with expected number, return the register;
  *   otherwise save the register for later return
@@ -1252,7 +1256,8 @@ out:
 }
 
 static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
-					struct conf_load *conf, int param_idx)
+					struct conf_load *conf, int param_idx,
+					struct func_info *info)
 {
 	struct parameter *parm = tag__alloc(cu, sizeof(*parm));
 
@@ -1263,8 +1268,15 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 		tag__init(&parm->tag, cu, die);
 		parm->name = attr_string(die, DW_AT_name, conf);
 		parm->idx = param_idx;
-		if (param_idx >= cu->nr_register_params || param_idx < 0)
+		if (param_idx < 0)
 			return parm;
+		if (!info->signature_changed) {
+			if (cu->producer_clang || param_idx >= cu->nr_register_params)
+				return parm;
+		} else if (param_idx >= cu->nr_register_params) {
+			return parm;
+		}
+
 		/* Parameters which use DW_AT_abstract_origin to point at
 		 * the original parameter definition (with no name in the DIE)
 		 * are the result of later DWARF generation during compilation
@@ -1337,7 +1349,7 @@ static int formal_parameter_pack__load_params(struct formal_parameter_pack *pack
 			continue;
 		}
 
-		struct parameter *param = parameter__new(die, cu, conf, -1);
+		struct parameter *param = parameter__new(die, cu, conf, -1, NULL);
 
 		if (param == NULL)
 			return -1;
@@ -1502,6 +1514,29 @@ static struct ftype *ftype__new(Dwarf_Die *die, struct cu *cu)
 	return ftype;
 }
 
+static bool function__signature_changed(struct function *func, Dwarf_Die *die)
+{
+	/* The inlined DW_TAG_subprogram typically has the original source type for
+	 * abstract origin of a concrete function with address range, inlined subroutine,
+	 * or call site.
+	 */
+	if (func->inlined)
+		return false;
+
+	if (!func->abstract_origin)
+		return attr_numeric(die, DW_AT_calling_convention) == DW_CC_nocall;
+
+	Dwarf_Attribute attr;
+	if (dwarf_attr(die, DW_AT_abstract_origin, &attr)) {
+		Dwarf_Die origin;
+		if (dwarf_formref_die(&attr, &origin))
+			return attr_numeric(&origin, DW_AT_calling_convention) == DW_CC_nocall;
+	}
+
+	/* This should not happen */
+	return false;
+}
+
 static struct function *function__new(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
 {
 	struct function *func = tag__alloc(cu, sizeof(*func));
@@ -1800,9 +1835,9 @@ static struct tag *die__create_new_parameter(Dwarf_Die *die,
 					     struct ftype *ftype,
 					     struct lexblock *lexblock,
 					     struct cu *cu, struct conf_load *conf,
-					     int param_idx)
+					     int param_idx, struct func_info *info)
 {
-	struct parameter *parm = parameter__new(die, cu, conf, param_idx);
+	struct parameter *parm = parameter__new(die, cu, conf, param_idx, info);
 
 	if (parm == NULL)
 		return NULL;
@@ -1889,7 +1924,7 @@ static struct tag *die__create_new_subroutine_type(Dwarf_Die *die,
 			tag__print_not_supported(die);
 			continue;
 		case DW_TAG_formal_parameter:
-			tag = die__create_new_parameter(die, ftype, NULL, cu, conf, -1);
+			tag = die__create_new_parameter(die, ftype, NULL, cu, conf, -1, NULL);
 			break;
 		case DW_TAG_unspecified_parameters:
 			ftype->unspec_parms = 1;
@@ -2118,7 +2153,8 @@ out_enomem:
 }
 
 static int die__process_function(Dwarf_Die *die, struct ftype *ftype,
-				  struct lexblock *lexblock, struct cu *cu, struct conf_load *conf);
+				 struct lexblock *lexblock, struct cu *cu, struct conf_load *conf,
+				 struct func_info *info);
 
 static int die__create_new_lexblock(Dwarf_Die *die,
 				    struct cu *cu, struct lexblock *father, struct conf_load *conf)
@@ -2126,7 +2162,7 @@ static int die__create_new_lexblock(Dwarf_Die *die,
 	struct lexblock *lexblock = lexblock__new(die, cu);
 
 	if (lexblock != NULL) {
-		if (die__process_function(die, NULL, lexblock, cu, conf) != 0)
+		if (die__process_function(die, NULL, lexblock, cu, conf, NULL) != 0)
 			goto out_delete;
 	}
 	if (father != NULL)
@@ -2246,7 +2282,8 @@ static struct tag *die__create_new_inline_expansion(Dwarf_Die *die,
 }
 
 static int die__process_function(Dwarf_Die *die, struct ftype *ftype,
-				 struct lexblock *lexblock, struct cu *cu, struct conf_load *conf)
+				 struct lexblock *lexblock, struct cu *cu, struct conf_load *conf,
+				 struct func_info *info)
 {
 	int param_idx = 0;
 	Dwarf_Die child;
@@ -2320,7 +2357,7 @@ static int die__process_function(Dwarf_Die *die, struct ftype *ftype,
 			continue;
 		}
 		case DW_TAG_formal_parameter:
-			tag = die__create_new_parameter(die, ftype, lexblock, cu, conf, param_idx++);
+			tag = die__create_new_parameter(die, ftype, lexblock, cu, conf, param_idx++, info);
 			break;
 		case DW_TAG_variable:
 			tag = die__create_new_variable(die, cu, conf, 0);
@@ -2391,11 +2428,19 @@ out_enomem:
 static struct tag *die__create_new_function(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
 {
 	struct function *function = function__new(die, cu, conf);
+	struct func_info info = {};
 
-	if (function != NULL &&
-	    die__process_function(die, &function->proto, &function->lexblock, cu, conf) != 0) {
-		function__delete(function, cu);
-		function = NULL;
+	if (function != NULL) {
+		/* For clang, we determine if function signature changes via DW_AT_calling_convention
+		 * set to DW_CC_nocall.
+		 */
+		if (cu->producer_clang)
+			info.signature_changed = function__signature_changed(function, die);
+
+		if (die__process_function(die, &function->proto, &function->lexblock, cu, conf, &info) != 0) {
+			function__delete(function, cu);
+			function = NULL;
+		}
 	}
 
 	return function ? &function->proto.tag : NULL;
@@ -3045,6 +3090,17 @@ static unsigned long long dwarf_tag__orig_id(const struct tag *tag,
 	return cu->extra_dbg_info ? dtag->id : 0;
 }
 
+static bool attr_producer_clang(Dwarf_Die *die)
+{
+	const char *producer;
+
+	producer = attr_string(die, DW_AT_producer, NULL);
+	if (!producer)
+		return false;
+
+	return !!strstr(producer, "clang");
+}
+
 struct debug_fmt_ops dwarf__ops;
 
 static int die__process(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
@@ -3082,6 +3138,7 @@ static int die__process(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
 	}
 
 	cu->language = attr_numeric(die, DW_AT_language);
+	cu->producer_clang = attr_producer_clang(die);
 
 	if (conf->early_cu_filter)
 		cu = conf->early_cu_filter(cu);
@@ -3841,6 +3898,7 @@ static int cus__merge_and_process_cu(struct cus *cus, struct conf_load *conf,
 			cu->priv = dcu;
 			cu->dfops = &dwarf__ops;
 			cu->language = attr_numeric(cu_die, DW_AT_language);
+			cu->producer_clang = attr_producer_clang(cu_die);
 			cus__add(cus, cu);
 		}
 
diff --git a/dwarves.h b/dwarves.h
index 5ec16e7..b49e651 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -306,6 +306,7 @@ struct cu {
 	uint8_t		 has_addr_info:1;
 	uint8_t		 uses_global_strings:1;
 	uint8_t		 little_endian:1;
+	uint8_t		 producer_clang:1;
 	uint8_t		 nr_register_params;
 	int		 register_params[ARCH_MAX_REGISTER_PARAMS];
 	int		 functions_saved;
-- 
2.53.0-Meta



* [PATCH dwarves v5 02/11] dwarf_loader: Prescan all parameters with expected registers
  2026-05-23 16:57 [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Yonghong Song
  2026-05-23 16:57 ` [PATCH dwarves v5 01/11] dwarf_loader: Reduce parameter checking with clang DW_AT_calling_convention attr Yonghong Song
@ 2026-05-23 16:57 ` Yonghong Song
  2026-05-23 16:57 ` [PATCH dwarves v5 03/11] dwarf_loader: Handle signatures with dead arguments Yonghong Song
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Yonghong Song @ 2026-05-23 16:57 UTC (permalink / raw)
  To: Alan Maguire, Arnaldo Carvalho de Melo, dwarves
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, kernel-team

Find expected registers for each parameter so the current
parameter can check the next one to decide what type should be
used. In some cases, based on dwarf locations, a particular
parameter can be optimized. But the compiler may not really
optimize it. In such cases, the original parameter type should
be preserved in order to match the next parameter register.

The following are two examples, both from arm64.
Example 1:
  $ cat t.c
  struct t { long f1; long f2; };
  __attribute__((noinline)) static long foo(struct t a, struct t b, int i)
  {
          return a.f1 + b.f1 + b.f2;
  }
  struct t p1, p2;
  int i;
  int main()
  {
          return (int)foo(p1, p2, i);
  }
  $ clang -O2 -g t.c
  $ llvm-dwarfdump a.out
  ...
  0x00000041:   DW_TAG_subprogram
                  DW_AT_calling_convention        (DW_CC_nocall)
                  DW_AT_type      (0x0000008f "long")
                  ...

  0x00000051:     DW_TAG_formal_parameter
                    DW_AT_location        (indexed (0x0) loclist = 0x00000014:
                       [0x0000000000000740, 0x0000000000000748): DW_OP_reg0 W0, DW_OP_piece 0x8)
                    DW_AT_name    ("a")
                    DW_AT_type    (0x00000077 "t")
                    ...

  0x0000005a:     DW_TAG_formal_parameter
                    DW_AT_location        (indexed (0x1) loclist = 0x0000001c:
                       [0x0000000000000740, 0x000000000000074c): DW_OP_reg2 W2, DW_OP_piece 0x8, DW_OP_reg3 W3, DW_OP_piece 0x8)
                    DW_AT_name    ("b")
                    DW_AT_type    (0x00000077 "t")
                    ...

  0x00000063:     DW_TAG_formal_parameter
                    DW_AT_name    ("i")
                    DW_AT_type    (0x00000027 "int")
                    ...

  0x0000006b:     NULL

In the above, parameter 'a' actually only uses its first 8 bytes, so it looks like
it can be optimized. But since the second parameter starts at register W2, it makes
sense to keep the original type of the first parameter to ensure a correct ABI.

Another example from vmlinux dwarf:

  0x0533fd03:   DW_TAG_subprogram
                  DW_AT_calling_convention        (DW_CC_nocall)
                  DW_AT_type      (0x05334dc7 "int")
                  ...

  0x0533fd15:     DW_TAG_formal_parameter
                    DW_AT_name    ("str")
                    DW_AT_type    (0x05335918 "char *")
                    ...

  0x0533fd1f:     DW_TAG_formal_parameter
                    DW_AT_location        (indexed (0x3b) loclist = 0x00eb9d83:
                       [0xffff80008419f2e0, 0xffff80008419f324): DW_OP_reg1 W1
                       [0xffff80008419f324, 0xffff80008419f47c): DW_OP_reg19 W19
                       [0xffff80008419f47c, 0xffff80008419f494): DW_OP_entry_value(DW_OP_reg1 W1), DW_OP_stack_value
                       [0xffff80008419f494, 0xffff80008419f498): DW_OP_reg19 W19)
                    DW_AT_name    ("used")
                    DW_AT_type    (0x05334dc7 "int")
                    ...

In the above, since the second argument is in register W1, it makes sense to
keep the type of the first argument to ensure a correct ABI.

Without the prescan, the above two cases would be rejected for btf due to mismatched
expected registers.
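
The look-ahead idea can be sketched as below. This is a simplified model
and not the code in this patch: keep_original_type() is a hypothetical
name, and the sketch assumes arm64, where the k-th integer argument
register has DWARF register number k. A start register of -1 means the
parameter has no usable DW_AT_location:

```c
#include <stdbool.h>

/* Hypothetical look-ahead check. 'start_regs' holds the first register of
 * each parameter as collected by a prescan (-1 if unknown). The parameter
 * at 'idx' is expected to start at 'expected_reg' and its full source type
 * is 'type_size' bytes. If the next parameter starts right after the
 * registers the full type would occupy, the compiler did not shrink the
 * parameter, so its original type should be kept.
 */
static bool keep_original_type(const int *start_regs, int nr_params,
			       int idx, int expected_reg,
			       int type_size, int reg_size)
{
	int nr_regs = (type_size + reg_size - 1) / reg_size;

	if (idx < 0 || idx + 1 >= nr_params)
		return false;

	return start_regs[idx + 1] == expected_reg + nr_regs;
}
```

In Example 1, 'a' is 16 bytes starting at register 0 and 'b' starts at
register 2, so 'a' keeps 'struct t'. In the vmlinux example, 'str' is
8 bytes and 'used' starts at register 1, so 'str' keeps 'char *'.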

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 dwarf_loader.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 70 insertions(+), 1 deletion(-)

diff --git a/dwarf_loader.c b/dwarf_loader.c
index 0bc4fc4..f0ac699 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -1190,10 +1190,40 @@ static ptrdiff_t __dwarf_getlocations(Dwarf_Attribute *attr,
 	return ret;
 }
 
+/* Max 20 register parameters, considering some parameters may be optimized out.  */
+#define	MAX_PRESCAN_PARAMS	20
+
 struct func_info {
 	bool signature_changed;
+	int nr_params;
+	int param_start_regs[MAX_PRESCAN_PARAMS];
 };
 
+/* Get the first DW_OP_X (should be a register) from a parameter's DW_AT_location. */
+static int parameter__peek_first_reg(Dwarf_Die *die)
+{
+	Dwarf_Attribute attr;
+	if (dwarf_attr(die, DW_AT_location, &attr) == NULL)
+		return -1;
+
+	Dwarf_Addr base, start, end;
+	Dwarf_Op *expr;
+	size_t exprlen;
+	ptrdiff_t offset = 0;
+
+	pthread_mutex_lock(&libdw__lock);
+	offset = __dwarf_getlocations(&attr, offset, &base, &start, &end, &expr, &exprlen);
+	pthread_mutex_unlock(&libdw__lock);
+
+	if (offset <= 0 || exprlen == 0)
+		return -1;
+
+	if (expr[0].atom >= DW_OP_reg0 && expr[0].atom <= DW_OP_reg31)
+		return expr[0].atom;
+
+	return -1;
+}
+
 /* For DW_AT_location 'attr':
  * - if first location is DW_OP_regXX with expected number, return the register;
  *   otherwise save the register for later return
@@ -2425,6 +2455,43 @@ out_enomem:
 	return -ENOMEM;
 }
 
+/* Pre-scan all formal parameters to collect their starting registers.
+ * This allows look-ahead when processing parameters sequentially, so that
+ * a parameter can check the next parameter's register to determine if the
+ * ABI register layout is preserved despite partial optimization.
+ * For example, for a function like below:
+ *  struct t { long f1; long f2; };
+ *  __attribute__((noinline)) static long foo(struct t a, struct t b)
+ *  {
+ *      return a.f1 + b.f1 + b.f2;
+ *  }
+ * If dwarf has parameter 'a' at aarch64 register W0, and 'b' at register W2,
+ * even compiler could optimize 'a' to 'a.f1'. To conform to ABI, the
+ * parameter 'a' will keep 'struct t' type.
+ */
+static void func_info__prescan_params(struct func_info *info, Dwarf_Die *die)
+{
+	Dwarf_Die child;
+	int idx = 0;
+
+	if (!info->signature_changed)
+		return;
+
+	if (!dwarf_haschildren(die) || dwarf_child(die, &child) != 0)
+		return;
+
+	do {
+		if (dwarf_tag(&child) != DW_TAG_formal_parameter)
+			continue;
+		if (idx >= MAX_PRESCAN_PARAMS)
+			break;
+		info->param_start_regs[idx] = parameter__peek_first_reg(&child);
+		idx++;
+	} while (dwarf_siblingof(&child, &child) == 0);
+
+	info->nr_params = idx;
+}
+
 static struct tag *die__create_new_function(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
 {
 	struct function *function = function__new(die, cu, conf);
@@ -2434,8 +2501,10 @@ static struct tag *die__create_new_function(Dwarf_Die *die, struct cu *cu, struc
 		/* For clang, we determine if function signature changes via DW_AT_calling_convention
 		 * set to DW_CC_nocall.
 		 */
-		if (cu->producer_clang)
+		if (cu->producer_clang) {
 			info.signature_changed = function__signature_changed(function, die);
+			func_info__prescan_params(&info, die);
+		}
 
 		if (die__process_function(die, &function->proto, &function->lexblock, cu, conf, &info) != 0) {
 			function__delete(function, cu);
-- 
2.53.0-Meta



* [PATCH dwarves v5 03/11] dwarf_loader: Handle signatures with dead arguments
  2026-05-23 16:57 [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Yonghong Song
  2026-05-23 16:57 ` [PATCH dwarves v5 01/11] dwarf_loader: Reduce parameter checking with clang DW_AT_calling_convention attr Yonghong Song
  2026-05-23 16:57 ` [PATCH dwarves v5 02/11] dwarf_loader: Prescan all parameters with expected registers Yonghong Song
@ 2026-05-23 16:57 ` Yonghong Song
  2026-05-23 16:57 ` [PATCH dwarves v5 04/11] dwarf_loader: Refactor initial ret -1 to be macro PARM_DEFAULT_FAIL Yonghong Song
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Yonghong Song @ 2026-05-23 16:57 UTC (permalink / raw)
  To: Alan Maguire, Arnaldo Carvalho de Melo, dwarves
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, kernel-team

In llvm dwarf, a dead argument may appear in the middle of the
parameter list of a DW_TAG_subprogram. So we introduce skip_idx in order to
match expected registers properly.

For example:
  0x00042897:   DW_TAG_subprogram
                  DW_AT_name      ("create_dev")
                  DW_AT_calling_convention        (DW_CC_nocall)
                  DW_AT_type      (0x0002429a "int")
                  ...

  0x000428ab:     DW_TAG_formal_parameter
                    DW_AT_name    ("name")
                    DW_AT_type    (0x000242ed "char *")
                    ...

  0x000428b5:     DW_TAG_formal_parameter
                    DW_AT_location        (indexed (0x3f) loclist = 0x000027f8:
                       [0xffffffff87681370, 0xffffffff8768137a): DW_OP_reg5 RDI
                       [0xffffffff8768137a, 0xffffffff87681392): DW_OP_reg3 RBX
                       [0xffffffff87681392, 0xffffffff876813ae): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value)
                    DW_AT_name    ("dev")
                    DW_AT_type    (0x00026859 "dev_t")
                    ...

With skip_idx, we can identify that the second original argument
'dev' becomes the first one after optimization.
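
A simplified model of the skip_idx bookkeeping is sketched below. It is
not the code in this patch: assign_arg_slots() is a hypothetical name,
and the sketch assumes every parameter without a DW_AT_location is dead,
ignoring the prescan look-ahead case from the previous patch:

```c
#include <stdbool.h>

/* Hypothetical sketch: has_loc[i] says whether parameter i has a
 * DW_AT_location. out_slot[i] receives the argument-register slot of
 * parameter i in the true signature, or -1 if the parameter is dropped.
 * Returns the number of parameters in the true signature.
 */
static int assign_arg_slots(const bool *has_loc, int nr_params, int *out_slot)
{
	int skip_idx = 0;	/* dead parameters seen so far */

	for (int i = 0; i < nr_params; i++) {
		if (!has_loc[i]) {
			out_slot[i] = -1;
			skip_idx++;
			continue;
		}
		/* Earlier dead parameters shift this one to a lower slot. */
		out_slot[i] = i - skip_idx;
	}

	return nr_params - skip_idx;
}
```

For create_dev(), 'name' has no location and 'dev' does, so 'dev' lands
in slot 0, which is RDI on x86_64 and matches the location list above.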

The previous patch has the following:
  0x0533fd03:   DW_TAG_subprogram
                  DW_AT_name      ("acpi_irq_penalty_update")
                  DW_AT_calling_convention        (DW_CC_nocall)
                  DW_AT_type      (0x05334dc7 "int")
                  ...

  0x0533fd15:     DW_TAG_formal_parameter
                    DW_AT_name    ("str")
                    DW_AT_type    (0x05335918 "char *")
                    ...

  0x0533fd1f:     DW_TAG_formal_parameter
                    DW_AT_location        (indexed (0x3b) loclist = 0x00eb9d83:
                       [0xffff80008419f2e0, 0xffff80008419f324): DW_OP_reg1 W1
                       [0xffff80008419f324, 0xffff80008419f47c): DW_OP_reg19 W19
                       [0xffff80008419f47c, 0xffff80008419f494): DW_OP_entry_value(DW_OP_reg1 W1), DW_OP_stack_value
                       [0xffff80008419f494, 0xffff80008419f498): DW_OP_reg19 W19)
                    DW_AT_name    ("used")
                    DW_AT_type    (0x05334dc7 "int")
                    ...

It is also handled properly: parameter 'str' keeps the W0 register.

With this patch, on x86_64 the number of functions without a valid true signature
is reduced from 532 to 96. This suggests that the majority of optimized functions
are caused by dead arguments.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 dwarf_loader.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 88 insertions(+), 5 deletions(-)

diff --git a/dwarf_loader.c b/dwarf_loader.c
index f0ac699..49993ee 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -1195,10 +1195,62 @@ static ptrdiff_t __dwarf_getlocations(Dwarf_Attribute *attr,
 
 struct func_info {
 	bool signature_changed;
+	int skip_idx;
 	int nr_params;
 	int param_start_regs[MAX_PRESCAN_PARAMS];
 };
 
+static int __get_type_byte_size(Dwarf_Die *die, struct cu *cu)
+{
+	Dwarf_Attribute attr;
+	if (dwarf_attr(die, DW_AT_type, &attr) == NULL)
+		return 0;
+
+	Dwarf_Die type_die;
+	if (dwarf_formref_die(&attr, &type_die) == NULL)
+		return 0;
+
+	/* A type does not have byte_size.
+	 * 0x000dac83: DW_TAG_formal_parameter
+			 DW_AT_location        (indexed (0x385) loclist = 0x00016175:
+			   [0xffff800080098cb0, 0xffff800080098cb4): DW_OP_breg8 W8+0
+			   [0xffff800080098cb4, 0xffff800080098ff4): DW_OP_breg31 WSP+16, DW_OP_deref
+			   [0xffff800080099054, 0xffff80008009908c): DW_OP_breg31 WSP+16, DW_OP_deref)
+			 DW_AT_name    ("ubuf")
+			 DW_AT_decl_file       ("/home/yhs/work/bpf-next/arch/arm64/kernel/ptrace.c")
+			 DW_AT_decl_line       (886)
+			 DW_AT_type    (0x000d467e "const void *")
+
+	  * 0x000d467e: DW_TAG_pointer_type
+			  DW_AT_type      (0x000c4320 "const void")
+
+	  * 0x000c4320: DW_TAG_const_type
+	  */
+	if (dwarf_tag(&type_die) == DW_TAG_pointer_type)
+		return cu->addr_size;
+
+	uint64_t bsize = attr_numeric(&type_die, DW_AT_byte_size);
+	if (bsize == 0)
+		return __get_type_byte_size(&type_die, cu);
+
+	return bsize;
+}
+
+static int get_type_byte_size(Dwarf_Die *die, struct cu *cu)
+{
+	int byte_size = 0;
+
+	Dwarf_Attribute attr;
+	if (dwarf_attr(die, DW_AT_abstract_origin, &attr)) {
+		Dwarf_Die origin;
+		if (dwarf_formref_die(&attr, &origin))
+			byte_size = __get_type_byte_size(&origin, cu);
+	} else {
+		byte_size = __get_type_byte_size(die, cu);
+	}
+	return byte_size;
+}
+
 /* Get the first DW_OP_X (should be a register) from a parameter's DW_AT_location. */
 static int parameter__peek_first_reg(Dwarf_Die *die)
 {
@@ -1292,8 +1344,9 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 	struct parameter *parm = tag__alloc(cu, sizeof(*parm));
 
 	if (parm != NULL) {
-		bool has_const_value;
+		bool has_const_value, true_sig_enabled;
 		Dwarf_Attribute attr;
+		int reg_idx;
 
 		tag__init(&parm->tag, cu, die);
 		parm->name = attr_string(die, DW_AT_name, conf);
@@ -1303,8 +1356,10 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 		if (!info->signature_changed) {
 			if (cu->producer_clang || param_idx >= cu->nr_register_params)
 				return parm;
-		} else if (param_idx >= cu->nr_register_params) {
-			return parm;
+		} else {
+			reg_idx = param_idx - info->skip_idx;
+			if (reg_idx >= cu->nr_register_params)
+				return parm;
 		}
 
 		/* Parameters which use DW_AT_abstract_origin to point at
@@ -1342,9 +1397,10 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 		 */
 		has_const_value = dwarf_attr(die, DW_AT_const_value, &attr) != NULL;
 		parm->has_loc = dwarf_attr(die, DW_AT_location, &attr) != NULL;
+		true_sig_enabled = conf->true_signature && info->signature_changed;
 
 		if (parm->has_loc) {
-			int expected_reg = cu->register_params[param_idx];
+			int expected_reg = cu->register_params[reg_idx];
 			int actual_reg = parameter__reg(&attr, expected_reg);
 
 			if (actual_reg < 0)
@@ -1357,8 +1413,35 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 				 * contents.
 				 */
 				parm->unexpected_reg = 1;
-		} else if (has_const_value) {
+		} else if (has_const_value && !cu->producer_clang) {
+			parm->optimized = 1;
+		} else if (true_sig_enabled) {
+			int byte_size, num_regs, next_reg_idx;
+
+			if (param_idx + 1 < info->nr_params) {
+				int next_start = info->param_start_regs[param_idx + 1];
+				if (next_start >= 0) {
+					/* check whether we should preserve the argument or not */
+					byte_size = get_type_byte_size(die, cu);
+					/* byte_size 0 should not happen. */
+					if (!byte_size) {
+						parm->unexpected_reg = 1;
+						return parm;
+					}
+
+					num_regs = (byte_size + cu->addr_size - 1) / cu->addr_size;
+					next_reg_idx = reg_idx + num_regs;
+					if (next_reg_idx < cu->nr_register_params &&
+					    next_start == cu->register_params[next_reg_idx]) {
+						if (byte_size > cu->addr_size)
+							info->skip_idx--;
+						return parm;
+					}
+				}
+			}
+
 			parm->optimized = 1;
+			info->skip_idx++;
 		}
 	}
 
-- 
2.53.0-Meta



* [PATCH dwarves v5 04/11] dwarf_loader: Refactor initial ret -1 to be macro PARM_DEFAULT_FAIL
  2026-05-23 16:57 [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Yonghong Song
                   ` (2 preceding siblings ...)
  2026-05-23 16:57 ` [PATCH dwarves v5 03/11] dwarf_loader: Handle signatures with dead arguments Yonghong Song
@ 2026-05-23 16:57 ` Yonghong Song
  2026-05-23 16:57 ` [PATCH dwarves v5 05/11] dwarf_loader: Handle locations with DW_OP_fbreg Yonghong Song
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Yonghong Song @ 2026-05-23 16:57 UTC (permalink / raw)
  To: Alan Maguire, Arnaldo Carvalho de Melo, dwarves
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, kernel-team

Later patches add more macro return values to make the
code easier to understand.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 dwarf_loader.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/dwarf_loader.c b/dwarf_loader.c
index 49993ee..0b5530d 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -1190,6 +1190,8 @@ static ptrdiff_t __dwarf_getlocations(Dwarf_Attribute *attr,
 	return ret;
 }
 
+#define	PARM_DEFAULT_FAIL	-1
+
 /* Max 20 register parameters, considering some parameters may be optimized out.  */
 #define	MAX_PRESCAN_PARAMS	20
 
@@ -1281,7 +1283,7 @@ static int parameter__peek_first_reg(Dwarf_Die *die)
  *   otherwise save the register for later return
  * - if location DW_OP_entry_value(DW_OP_regXX) with expected number is in the
  *   list, return the register; otherwise save register for later return
- * - otherwise if no register was found for locations, return -1.
+ * - otherwise if no register was found for locations, return PARM_DEFAULT_FAIL.
  */
 static int parameter__reg(Dwarf_Attribute *attr, int expected_reg)
 {
@@ -1291,7 +1293,7 @@ static int parameter__reg(Dwarf_Attribute *attr, int expected_reg)
 	size_t exprlen, entry_len;
 	ptrdiff_t offset = 0;
 	int loc_num = -1;
-	int ret = -1;
+	int ret = PARM_DEFAULT_FAIL;
 
 	/* use libdw__lock as dwarf_getlocation(s) has concurrency issues
 	 * when libdw is not compiled with experimental --enable-thread-safety
@@ -1403,7 +1405,7 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 			int expected_reg = cu->register_params[reg_idx];
 			int actual_reg = parameter__reg(&attr, expected_reg);
 
-			if (actual_reg < 0)
+			if (actual_reg == PARM_DEFAULT_FAIL)
 				parm->optimized = 1;
 			else if (expected_reg >= 0 && expected_reg != actual_reg)
 				/* mark parameters that use an unexpected
-- 
2.53.0-Meta



* [PATCH dwarves v5 05/11] dwarf_loader: Handle locations with DW_OP_fbreg
  2026-05-23 16:57 [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Yonghong Song
                   ` (3 preceding siblings ...)
  2026-05-23 16:57 ` [PATCH dwarves v5 04/11] dwarf_loader: Refactor initial ret -1 to be macro PARM_DEFAULT_FAIL Yonghong Song
@ 2026-05-23 16:57 ` Yonghong Song
  2026-05-23 16:57 ` [PATCH dwarves v5 06/11] dwarf_loader: Change exprlen checking condition in parameter__reg() Yonghong Song
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Yonghong Song @ 2026-05-23 16:57 UTC (permalink / raw)
  To: Alan Maguire, Arnaldo Carvalho de Melo, dwarves
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, kernel-team

DW_OP_fbreg means the parameter value is stored on the stack, at an
offset from the frame base, so the corresponding parameter register is
not used.
For example:

  0x071f7717:   DW_TAG_subprogram
                  DW_AT_name      ("jent_health_failure")
                  DW_AT_calling_convention        (DW_CC_nocall)
                  DW_AT_type      (0x071f7626 "unsigned int")
                  ...

  0x071f7728:     DW_TAG_formal_parameter
                    DW_AT_location        (DW_OP_fbreg -8)
                    DW_AT_name    ("ec")
                    DW_AT_type    (0x071f7ab6 "rand_data *")
                    ...

  0x071f7734:     NULL

In the above, the type of parameter 'ec' is a pointer, so it fits
into a single register. But the location uses 'DW_OP_fbreg -8', which
prevents generating a true signature for the function. I didn't find such
an example in vmlinux. The above example is from a crypto module.
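
The conservative handling can be sketched as a small classifier over a
single location op. This is a simplified illustration, not the actual
parameter__reg() code: the function name and the SKETCH_ constants are
hypothetical, and the opcode values are the ones assigned by the DWARF
specification.

```c
#include <stdbool.h>

/* Opcode values as assigned by the DWARF specification. */
enum {
	SKETCH_DW_OP_REG0  = 0x50,
	SKETCH_DW_OP_REG31 = 0x6f,
	SKETCH_DW_OP_FBREG = 0x91,
};

#define SKETCH_PARM_DEFAULT_FAIL	(-1)
#define SKETCH_PARM_UNEXPECTED		(-2)

/* Hypothetical classifier for a one-op location expression.
 * true_sig_active stands for "signature changed and true signatures
 * were requested", the condition checked in the patch.
 */
int sketch_classify_single_op(int atom, bool true_sig_active)
{
	if (atom >= SKETCH_DW_OP_REG0 && atom <= SKETCH_DW_OP_REG31)
		return atom;			/* value lives in a register */
	if (atom == SKETCH_DW_OP_FBREG && true_sig_active)
		return SKETCH_PARM_UNEXPECTED;	/* on the stack: reject */
	return SKETCH_PARM_DEFAULT_FAIL;	/* no register found */
}
```

With this rule, a 'DW_OP_fbreg -8' location such as the one for 'ec' is
rejected only when true signatures are being generated; otherwise the
previous behavior is kept.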

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 dwarf_loader.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/dwarf_loader.c b/dwarf_loader.c
index 0b5530d..bf8b973 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -1191,6 +1191,7 @@ static ptrdiff_t __dwarf_getlocations(Dwarf_Attribute *attr,
 }
 
 #define	PARM_DEFAULT_FAIL	-1
+#define	PARM_UNEXPECTED		-2
 
 /* Max 20 register parameters, considering some parameters may be optimized out.  */
 #define	MAX_PRESCAN_PARAMS	20
@@ -1285,7 +1286,8 @@ static int parameter__peek_first_reg(Dwarf_Die *die)
  *   list, return the register; otherwise save register for later return
  * - otherwise if no register was found for locations, return PARM_DEFAULT_FAIL.
  */
-static int parameter__reg(Dwarf_Attribute *attr, int expected_reg)
+static int parameter__reg(Dwarf_Attribute *attr, int expected_reg, struct conf_load *conf,
+			  struct func_info *info)
 {
 	Dwarf_Addr base, start, end;
 	Dwarf_Op *expr, *entry_ops;
@@ -1321,6 +1323,16 @@ static int parameter__reg(Dwarf_Attribute *attr, int expected_reg)
 			if (ret == expected_reg)
 				goto out;
 			break;
+		case DW_OP_fbreg:
+			/* The location like
+			 *   DW_AT_location        (DW_OP_fbreg +<num>)
+			 * indicates that the parameter is on the stack. But it is possible
+			 * that the parameter can fit in register(s). So conservatively
+			 * mark this parameter not suitable for true signatures.
+			 */
+			if (info->signature_changed && conf->true_signature)
+				ret = PARM_UNEXPECTED;
+			break;
 		/* match DW_OP_entry_value(DW_OP_regXX) at any location */
 		case DW_OP_entry_value:
 		case DW_OP_GNU_entry_value:
@@ -1403,11 +1415,11 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 
 		if (parm->has_loc) {
 			int expected_reg = cu->register_params[reg_idx];
-			int actual_reg = parameter__reg(&attr, expected_reg);
+			int actual_reg = parameter__reg(&attr, expected_reg, conf, info);
 
 			if (actual_reg == PARM_DEFAULT_FAIL)
 				parm->optimized = 1;
-			else if (expected_reg >= 0 && expected_reg != actual_reg)
+			else if (actual_reg == PARM_UNEXPECTED || (expected_reg >= 0 && expected_reg != actual_reg))
 				/* mark parameters that use an unexpected
 				 * register to hold a parameter; these will
 				 * be problematic for users of BTF as they
-- 
2.53.0-Meta



* [PATCH dwarves v5 06/11] dwarf_loader: Change exprlen checking condition in parameter__reg()
  2026-05-23 16:57 [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Yonghong Song
                   ` (4 preceding siblings ...)
  2026-05-23 16:57 ` [PATCH dwarves v5 05/11] dwarf_loader: Handle locations with DW_OP_fbreg Yonghong Song
@ 2026-05-23 16:57 ` Yonghong Song
  2026-05-23 16:57 ` [PATCH dwarves v5 07/11] dwarf_loader: Detect optimized parameters with locations having constant values Yonghong Song
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Yonghong Song @ 2026-05-23 16:57 UTC (permalink / raw)
  To: Alan Maguire, Arnaldo Carvalho de Melo, dwarves
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, kernel-team

This change has no functional effect. But it keeps DW_OP_stack_value
in longer location expressions so that later patches can use it for
parameter checking.
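
The difference between the two conditions can be seen in a minimal
sketch. The two helpers below are hypothetical and only model how the
effective expression length is computed before and after this patch;
0x9f is the DWARF opcode value of DW_OP_stack_value.

```c
#include <stddef.h>

enum { SKETCH_DW_OP_STACK_VALUE = 0x9f };

/* Old rule: strip a trailing DW_OP_stack_value from any expression
 * that has more than one op.
 */
size_t sketch_len_old(size_t exprlen, int last_atom)
{
	if (exprlen > 1 && last_atom == SKETCH_DW_OP_STACK_VALUE)
		exprlen--;
	return exprlen;
}

/* New rule: strip it only when the expression has exactly two ops. */
size_t sketch_len_new(size_t exprlen, int last_atom)
{
	if (exprlen == 2 && last_atom == SKETCH_DW_OP_STACK_VALUE)
		exprlen--;
	return exprlen;
}
```

For a two-op expression both rules yield length 1. For a three-op
expression the old rule yields 2 and the new rule yields 3; neither is 1,
so the existing 'exprlen != 1' check skips the location either way.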

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 dwarf_loader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dwarf_loader.c b/dwarf_loader.c
index bf8b973..4b65e30 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -1308,7 +1308,7 @@ static int parameter__reg(Dwarf_Attribute *attr, int expected_reg, struct conf_l
 		 * DW_OP_stack_value instructs interpreter to pop current value from
 		 * DWARF expression evaluation stack, and thus is not important here.
 		 */
-		if (exprlen > 1 && expr[exprlen - 1].atom == DW_OP_stack_value)
+		if (exprlen == 2 && expr[exprlen - 1].atom == DW_OP_stack_value)
 			exprlen--;
 
 		if (exprlen != 1)
-- 
2.53.0-Meta



* [PATCH dwarves v5 07/11] dwarf_loader: Detect optimized parameters with locations having constant values
  2026-05-23 16:57 [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Yonghong Song
                   ` (5 preceding siblings ...)
  2026-05-23 16:57 ` [PATCH dwarves v5 06/11] dwarf_loader: Change exprlen checking condition in parameter__reg() Yonghong Song
@ 2026-05-23 16:57 ` Yonghong Song
  2026-05-23 16:57 ` [PATCH dwarves v5 08/11] dwarf_loader: Check whether two-reg parameter actually use two regs or not Yonghong Song
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Yonghong Song @ 2026-05-23 16:57 UTC (permalink / raw)
  To: Alan Maguire, Arnaldo Carvalho de Melo, dwarves
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, kernel-team

The following is an example:

0x00899a78:   DW_TAG_subprogram
                DW_AT_calling_convention        (DW_CC_nocall)
                DW_AT_type      (0x008861cb "int")
                ...

0x00899a8c:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x3e) loclist = 0x000b2195:
                     [0xffffffff879be485, 0xffffffff879be49a): DW_OP_reg5 RDI
                     [0xffffffff879be49a, 0xffffffff879beac4): DW_OP_breg7 RSP+0)
                  DW_AT_name    ("mr")
                  DW_AT_type    (0x00899c88 "map_range *")
                  ...

0x00899a98:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x41) loclist = 0x000b21d4:
                     [0xffffffff879be480, 0xffffffff879be554): DW_OP_consts +0, DW_OP_stack_value
                     [0xffffffff879be554, 0xffffffff879be56d): DW_OP_consts +1, DW_OP_stack_value
                     [0xffffffff879be56d, 0xffffffff879be572): DW_OP_reg2 RCX
                     [0xffffffff879be572, 0xffffffff879be638): DW_OP_breg7 RSP+8
                     [0xffffffff879be638, 0xffffffff879be63d): DW_OP_reg0 RAX
                     [0xffffffff879be63d, 0xffffffff879be6ef): DW_OP_breg7 RSP+8
                     [0xffffffff879be6ef, 0xffffffff879be6f5): DW_OP_reg14 R14
                     [0xffffffff879be6f5, 0xffffffff879be6fd): DW_OP_breg7 RSP+8
                     [0xffffffff879be6fd, 0xffffffff879be879): DW_OP_reg14 R14
                     [0xffffffff879be879, 0xffffffff879be931): DW_OP_reg12 R12
                     [0xffffffff879be955, 0xffffffff879be961): DW_OP_reg14 R14
                     [0xffffffff879be961, 0xffffffff879be966): DW_OP_reg12 R12
                     [0xffffffff879be966, 0xffffffff879be976): DW_OP_reg14 R14
                     [0xffffffff879be976, 0xffffffff879be9df): DW_OP_reg12 R12
                     [0xffffffff879be9df, 0xffffffff879be9e5): DW_OP_reg14 R14
                     [0xffffffff879be9fc, 0xffffffff879bea24): DW_OP_consts +0, DW_OP_stack_value
                     [0xffffffff879bea24, 0xffffffff879bea74): DW_OP_breg7 RSP+8
                     [0xffffffff879bea74, 0xffffffff879beac4): DW_OP_reg14 R14)
                  DW_AT_name    ("nr_range")
                  DW_AT_type    (0x008861cb "int")
                  ...

0x00899aa4:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x3f) loclist = 0x000b21a7:
                     [0xffffffff879be485, 0xffffffff879be4a4): DW_OP_reg4 RSI
                     [0xffffffff879be4a4, 0xffffffff879be4e6): DW_OP_reg12 R12
                     [0xffffffff879be4e6, 0xffffffff879beac4): DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value)
                  DW_AT_name    ("start")
                  DW_AT_type    (0x008861cf "unsigned long")
                  ...

0x00899ab0:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x40) loclist = 0x000b21c2:
                     [0xffffffff879be485, 0xffffffff879be4a9): DW_OP_reg1 RDX
                     [0xffffffff879be4a9, 0xffffffff879beac4): DW_OP_breg7 RSP+32)
                  DW_AT_name    ("end")
                  DW_AT_type    (0x008861cf "unsigned long")
                  ...

The parameter 'nr_range' is a constant at function entry and won't consume
any ABI register. With this commit, for x86_64, the number of invalid true
signatures is reduced from 96 to 83.
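
The detection rule can be sketched as follows. The function name and the
SKETCH_ constants are hypothetical; the opcode values are those assigned
by the DWARF specification. Only the first op of the first location
(loc_num 0) is considered, which mirrors the check added below.

```c
#include <stdbool.h>

/* Opcode values as assigned by the DWARF specification. */
enum {
	SKETCH_DW_OP_CONSTU = 0x10,
	SKETCH_DW_OP_CONSTS = 0x11,
	SKETCH_DW_OP_LIT0   = 0x30,
	SKETCH_DW_OP_LIT31  = 0x4f,
};

#define SKETCH_PARM_OPTIMIZED_OUT	(-3)
#define SKETCH_PARM_CONTINUE		(-4)

/* Hypothetical check: a parameter whose first location starts with a
 * constant op does not arrive in an ABI register.
 */
int sketch_classify_const(int first_atom, int loc_num)
{
	bool is_const = first_atom == SKETCH_DW_OP_CONSTU ||
			first_atom == SKETCH_DW_OP_CONSTS ||
			(first_atom >= SKETCH_DW_OP_LIT0 &&
			 first_atom <= SKETCH_DW_OP_LIT31);

	if (is_const && loc_num == 0)
		return SKETCH_PARM_OPTIMIZED_OUT;
	return SKETCH_PARM_CONTINUE;
}
```

For 'nr_range' above, the first location starts with DW_OP_consts, so the
parameter is classified as optimized out. A constant that appears only in
a later location does not change the classification.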

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 dwarf_loader.c | 47 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 43 insertions(+), 4 deletions(-)

diff --git a/dwarf_loader.c b/dwarf_loader.c
index 4b65e30..97f576a 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -1192,6 +1192,8 @@ static ptrdiff_t __dwarf_getlocations(Dwarf_Attribute *attr,
 
 #define	PARM_DEFAULT_FAIL	-1
 #define	PARM_UNEXPECTED		-2
+#define	PARM_OPTIMIZED_OUT	-3
+#define	PARM_CONTINUE		-4
 
 /* Max 20 register parameters, considering some parameters may be optimized out.  */
 #define	MAX_PRESCAN_PARAMS	20
@@ -1279,6 +1281,20 @@ static int parameter__peek_first_reg(Dwarf_Die *die)
 	return -1;
 }
 
+static int parameter__multi_exprs(Dwarf_Op *expr, int loc_num)
+{
+	switch (expr[0].atom) {
+	case DW_OP_lit0 ... DW_OP_lit31:
+	case DW_OP_constu:
+	case DW_OP_consts:
+		if (loc_num != 0)
+			break;
+		return PARM_OPTIMIZED_OUT;
+	}
+
+	return PARM_CONTINUE;
+}
+
 /* For DW_AT_location 'attr':
  * - if first location is DW_OP_regXX with expected number, return the register;
  *   otherwise save the register for later return
@@ -1311,8 +1327,17 @@ static int parameter__reg(Dwarf_Attribute *attr, int expected_reg, struct conf_l
 		if (exprlen == 2 && expr[exprlen - 1].atom == DW_OP_stack_value)
 			exprlen--;
 
-		if (exprlen != 1)
-			continue;
+		if (exprlen != 1) {
+			if (!info->signature_changed || !conf->true_signature)
+				continue;
+
+			int res;
+			res = parameter__multi_exprs(expr, loc_num);
+			if (res == PARM_CONTINUE)
+				continue;
+			ret = res;
+			goto out;
+		}
 
 		switch (expr->atom) {
 		/* match DW_OP_regXX at first location */
@@ -1333,6 +1358,16 @@ static int parameter__reg(Dwarf_Attribute *attr, int expected_reg, struct conf_l
 			if (info->signature_changed && conf->true_signature)
 				ret = PARM_UNEXPECTED;
 			break;
+		case DW_OP_lit0 ... DW_OP_lit31:
+		case DW_OP_constu:
+		case DW_OP_consts:
+			if (info->signature_changed && conf->true_signature) {
+				if (loc_num != 0)
+					break;
+				ret = PARM_OPTIMIZED_OUT;
+				goto out;
+			}
+			break;
 		/* match DW_OP_entry_value(DW_OP_regXX) at any location */
 		case DW_OP_entry_value:
 		case DW_OP_GNU_entry_value:
@@ -1417,9 +1452,12 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 			int expected_reg = cu->register_params[reg_idx];
 			int actual_reg = parameter__reg(&attr, expected_reg, conf, info);
 
-			if (actual_reg == PARM_DEFAULT_FAIL)
+			if (actual_reg == PARM_DEFAULT_FAIL) {
 				parm->optimized = 1;
-			else if (actual_reg == PARM_UNEXPECTED || (expected_reg >= 0 && expected_reg != actual_reg))
+			} else if (actual_reg == PARM_OPTIMIZED_OUT) {
+				parm->optimized = 1;
+				info->skip_idx++;
+			} else if (actual_reg == PARM_UNEXPECTED || (expected_reg >= 0 && expected_reg != actual_reg)) {
 				/* mark parameters that use an unexpected
 				 * register to hold a parameter; these will
 				 * be problematic for users of BTF as they
@@ -1427,6 +1465,7 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 				 * contents.
 				 */
 				parm->unexpected_reg = 1;
+			}
 		} else if (has_const_value && !cu->producer_clang) {
 			parm->optimized = 1;
 		} else if (true_sig_enabled) {
-- 
2.53.0-Meta



* [PATCH dwarves v5 08/11] dwarf_loader: Check whether two-reg parameter actually use two regs or not
  2026-05-23 16:57 [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Yonghong Song
                   ` (6 preceding siblings ...)
  2026-05-23 16:57 ` [PATCH dwarves v5 07/11] dwarf_loader: Detect optimized parameters with locations having constant values Yonghong Song
@ 2026-05-23 16:57 ` Yonghong Song
  2026-05-23 16:58 ` [PATCH dwarves v5 09/11] dwarf_loader: Handle expression lists Yonghong Song
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Yonghong Song @ 2026-05-23 16:57 UTC (permalink / raw)
  To: Alan Maguire, Arnaldo Carvalho de Melo, dwarves
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, kernel-team

For a parameter whose type occupies two registers, on x86_64 and aarch64
the parameter will actually use two registers. S390 is different:
it allocates the parameter on the stack and passes a pointer to the function.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 dwarf_loader.c | 11 +++++++++++
 dwarves.h      |  1 +
 2 files changed, 12 insertions(+)

diff --git a/dwarf_loader.c b/dwarf_loader.c
index 97f576a..b888783 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -1100,6 +1100,16 @@ static void arch__set_register_params(const GElf_Ehdr *ehdr, struct cu *cu)
 	}
 }
 
+static bool arch__agg_use_two_regs(const GElf_Ehdr *ehdr)
+{
+	switch (ehdr->e_machine) {
+	case EM_S390:
+		return false;
+	default:
+		return true;
+	}
+}
+
 static struct template_type_param *template_type_param__new(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
 {
 	struct template_type_param *ttparm = tag__alloc(cu, sizeof(*ttparm));
@@ -3562,6 +3572,7 @@ static int cu__set_common(struct cu *cu, struct conf_load *conf,
 
 	cu->little_endian = ehdr.e_ident[EI_DATA] == ELFDATA2LSB;
 	cu->nr_register_params = arch__nr_register_params(&ehdr);
+	cu->agg_use_two_regs = arch__agg_use_two_regs(&ehdr);
 	arch__set_register_params(&ehdr, cu);
 	return 0;
 }
diff --git a/dwarves.h b/dwarves.h
index b49e651..2d94dff 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -307,6 +307,7 @@ struct cu {
 	uint8_t		 uses_global_strings:1;
 	uint8_t		 little_endian:1;
 	uint8_t		 producer_clang:1;
+	uint8_t		 agg_use_two_regs:1;	/* An aggregate like {long a; long b;} */
 	uint8_t		 nr_register_params;
 	int		 register_params[ARCH_MAX_REGISTER_PARAMS];
 	int		 functions_saved;
-- 
2.53.0-Meta



* [PATCH dwarves v5 09/11] dwarf_loader: Handle expression lists
  2026-05-23 16:57 [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Yonghong Song
                   ` (7 preceding siblings ...)
  2026-05-23 16:57 ` [PATCH dwarves v5 08/11] dwarf_loader: Check whether two-reg parameter actually use two regs or not Yonghong Song
@ 2026-05-23 16:58 ` Yonghong Song
  2026-05-23 16:58 ` [PATCH dwarves v5 10/11] btf_encoder: Handle optimized parameter properly Yonghong Song
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Yonghong Song @ 2026-05-23 16:58 UTC (permalink / raw)
  To: Alan Maguire, Arnaldo Carvalho de Melo, dwarves
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, kernel-team

Location expressions that have more than one op are now checked.
If the parameter size is less than or equal to the size of long,
the argument should match the corresponding ABI register.
For example:

0x0aba0808:   DW_TAG_subprogram
                DW_AT_name      ("addrconf_ifdown")
                DW_AT_calling_convention        (DW_CC_nocall)
                DW_AT_type      (0x0ab7d8e9 "int")
		...

0x0aba082b:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x32b) loclist = 0x016eabcd:
                     [0xffffffff83f6fef9, 0xffffffff83f6ff98): DW_OP_reg5 RDI
                     [0xffffffff83f6ff98, 0xffffffff83f70080): DW_OP_reg12 R12
                     [0xffffffff83f70080, 0xffffffff83f70111): DW_OP_breg7 RSP+112
                     [0xffffffff83f70111, 0xffffffff83f7014f): DW_OP_reg12 R12
                     [0xffffffff83f7014f, 0xffffffff83f7123c): DW_OP_breg7 RSP+112
                     [0xffffffff83f7123c, 0xffffffff83f7128c): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
                     [0xffffffff83f7128c, 0xffffffff83f712a9): DW_OP_reg12 R12
                     [0xffffffff83f712a9, 0xffffffff83f712cd): DW_OP_breg7 RSP+112
                     [0xffffffff83f712cd, 0xffffffff83f712d2): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
                     [0xffffffff83f712d2, 0xffffffff83f713dd): DW_OP_breg7 RSP+112)
                  DW_AT_name    ("dev")
                  DW_AT_type    (0x0ab7cb7d "net_device *")
		  ...

0x0aba0836:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x32c) loclist = 0x016eac39:
                     [0xffffffff83f6fef9, 0xffffffff83f6ff15): DW_OP_breg4 RSI+0, DW_OP_constu 0xffffffff, DW_OP_and, DW_OP_convert (0x0ab7b571) "DW_ATE_unsigned_1", DW_OP_convert (0x0ab7b576) "DW_ATE_unsigned_8", DW_OP_stack_value
                     [0xffffffff83f6ff15, 0xffffffff83f7127c): DW_OP_breg7 RSP+36, DW_OP_deref_size 0x4, DW_OP_convert (0x0ab7b571) "DW_ATE_unsigned_1", DW_OP_convert (0x0ab7b576) "DW_ATE_unsigned_8", DW_OP_stack_value
                     [0xffffffff83f7128c, 0xffffffff83f713dd): DW_OP_breg7 RSP+36, DW_OP_deref_size 0x4, DW_OP_convert (0x0ab7b571) "DW_ATE_unsigned_1", DW_OP_convert (0x0ab7b576) "DW_ATE_unsigned_8", DW_OP_stack_value)
                  DW_AT_name    ("unregister")
                  DW_AT_type    (0x0ab7c933 "bool")
		  ...

The parameter 'unregister' is the second argument which matches ABI register RSI.
So the function "addrconf_ifdown" signature is valid.

If the parameter size is '2 x size_of_long', more handling is necessary, e.g., below:

0x0a01e174:   DW_TAG_subprogram
                DW_AT_name      ("check_zeroed_sockptr")
                DW_AT_calling_convention        (DW_CC_nocall)
                DW_AT_type      (0x09fead35 "int")
		...

0x0a01e187:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x5b6) loclist = 0x0157f03f:
                     [0xffffffff83c941c0, 0xffffffff83c941c4): DW_OP_reg5 RDI, DW_OP_piece 0x8, DW_OP_reg4 RSI, DW_OP_piece 0x1
                     [0xffffffff83c941c4, 0xffffffff83c941cc): DW_OP_piece 0x8, DW_OP_reg4 RSI, DW_OP_piece 0x1
                     [0xffffffff83c941e1, 0xffffffff83c941e4): DW_OP_piece 0x8, DW_OP_reg4 RSI, DW_OP_piece 0x1)
                  DW_AT_name    ("src")
                  DW_AT_type    (0x09ff832d "sockptr_t")
		  ...

0x0a01e193:     DW_TAG_formal_parameter
                  DW_AT_const_value     (64)
                  DW_AT_name    ("offset")
                  DW_AT_type    (0x09fee984 "size_t")
		  ...

0x0a01e19e:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x5b7) loclist = 0x0157f06b:
                     [0xffffffff83c941c0, 0xffffffff83c941d1): DW_OP_reg1 RDX
                     [0xffffffff83c941d1, 0xffffffff83c941e1): DW_OP_entry_value(DW_OP_reg1 RDX), DW_OP_stack_value
                     [0xffffffff83c941e1, 0xffffffff83c941e9): DW_OP_reg1 RDX)
                  DW_AT_name    ("size")
                  DW_AT_type    (0x09fee984 "size_t")
		  ...

The first parameter 'src' will take two ABI registers. This patch correctly detects such a pattern
to construct the true signature.

However, it is possible that only one 'size_of_long' half of a '2 x size_of_long' parameter is used. For example:

0x019520c6:   DW_TAG_subprogram
                DW_AT_name      ("map_create")
                DW_AT_calling_convention        (DW_CC_nocall)
                DW_AT_type      (0x01934b29 "int")
		...

0x01952111:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x31b) loclist = 0x0034fa0f:
                     [0xffffffff81892345, 0xffffffff8189237c): DW_OP_reg5 RDI
                     [0xffffffff8189237c, 0xffffffff818923bd): DW_OP_reg3 RBX
                     [0xffffffff818923bd, 0xffffffff818923d4): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
                     [0xffffffff818923d4, 0xffffffff81892dcb): DW_OP_reg3 RBX
                     [0xffffffff81892df3, 0xffffffff81892e01): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
                     [0xffffffff81892e01, 0xffffffff818932a9): DW_OP_reg3 RBX)
                  DW_AT_name    ("attr")
                  DW_AT_type    (0x01934d17 "bpf_attr *")
		  ...

0x0195211d:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x31a) loclist = 0x0034f9dc:
                     [0xffffffff81892345, 0xffffffff81892357): DW_OP_piece 0x8, DW_OP_reg4 RSI, DW_OP_piece 0x1
                     [0xffffffff81892357, 0xffffffff81892f02): DW_OP_piece 0x8, DW_OP_breg7 RSP+20, DW_OP_deref_size 0x4, DW_OP_stack_value, DW_OP_piece 0x1
                     [0xffffffff81892f07, 0xffffffff818932a9): DW_OP_piece 0x8, DW_OP_breg7 RSP+20, DW_OP_deref_size 0x4, DW_OP_stack_value, DW_OP_piece 0x1)
                  DW_AT_name    ("uattr")
                  DW_AT_type    (0x019512ab "bpfptr_t")
		  ...

For parameter 'uattr', only the second half of the parameter is used. In such cases,
pahole changes the name and the type, and the result eventually ends up in vmlinux BTF:
  [55697] FUNC_PROTO '(anon)' ret_type_id=106780 vlen=2
          'attr' type_id=455
          'uattr__is_kernel' type_id=82014
  [82014] TYPEDEF 'bool' type_id=67434
  [113251] FUNC 'map_create' type_id=55697 linkage=static
You can see the new parameter name is 'uattr__is_kernel' and the type is 'bool'.
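
The naming convention can be illustrated with a minimal sketch. The
helper sketch_field_param_name() is hypothetical and is not the function
used by pahole; it only shows how a name such as 'uattr__is_kernel' can be
composed from the parameter name and the surviving member name.

```c
#include <stdio.h>
#include <string.h>
#include <stddef.h>

/* Hypothetical helper: write "<parm>__<member>" into buf.
 * Returns 0 on success, -1 if buf is too small or formatting fails.
 */
int sketch_field_param_name(char *buf, size_t len,
			    const char *parm, const char *member)
{
	int n = snprintf(buf, len, "%s__%s", parm, member);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

Calling it with "uattr" and "is_kernel" and a large enough buffer produces
the name shown in the BTF dump above.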

With this patch, the number of invalid true signatures is reduced from 83 to 18.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 dwarf_loader.c | 240 +++++++++++++++++++++++++++++++++++++++++++++++--
 dwarves.h      |   1 +
 2 files changed, 234 insertions(+), 7 deletions(-)

diff --git a/dwarf_loader.c b/dwarf_loader.c
index b888783..870c167 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -1204,6 +1204,8 @@ static ptrdiff_t __dwarf_getlocations(Dwarf_Attribute *attr,
 #define	PARM_UNEXPECTED		-2
 #define	PARM_OPTIMIZED_OUT	-3
 #define	PARM_CONTINUE		-4
+#define	PARM_TWO_ADDR_LEN	-5
+#define	PARM_TO_BE_IMPROVED	-6
 
 /* Max 20 register parameters, considering some parameters may be optimized out.  */
 #define	MAX_PRESCAN_PARAMS	20
@@ -1291,7 +1293,47 @@ static int parameter__peek_first_reg(Dwarf_Die *die)
 	return -1;
 }
 
-static int parameter__multi_exprs(Dwarf_Op *expr, int loc_num)
+/* Traverse the parameter type until finding the member type which has expected
+ * struct type offset.
+ */
+static Dwarf_Die *get_member_with_offset(Dwarf_Die *die, int offset, Dwarf_Die *member_die)
+{
+	Dwarf_Attribute attr;
+	if (dwarf_attr(die, DW_AT_type, &attr) == NULL)
+		return NULL;
+
+	Dwarf_Die type_die;
+	if (dwarf_formref_die(&attr, &type_die) == NULL)
+		return NULL;
+
+	uint64_t bsize = attr_numeric(&type_die, DW_AT_byte_size);
+	if (bsize == 0)
+		return get_member_with_offset(&type_die, offset, member_die);
+
+	if (dwarf_tag(&type_die) != DW_TAG_structure_type)
+		return NULL;
+
+	if (!dwarf_haschildren(&type_die) || dwarf_child(&type_die, member_die) != 0)
+		return NULL;
+	do {
+		if (dwarf_tag(member_die) != DW_TAG_member)
+			continue;
+
+		int off = attr_numeric(member_die, DW_AT_data_bit_offset);
+		if (off == offset * 8)
+			return member_die;
+	} while (dwarf_siblingof(member_die, member_die) == 0);
+
+	return NULL;
+}
+
+/* For two address length case, first_half and second_half represents the parameter.
+ * The first_half and second_half accumulates field information across possible multiple
+ * location lists.
+ */
+static int parameter__multi_exprs(Dwarf_Op *expr, int loc_num, struct cu *cu, size_t exprlen,
+				  Dwarf_Die *die, int expected_reg, int byte_size,
+				  unsigned long *first_half, unsigned long *second_half, int *ret)
 {
 	switch (expr[0].atom) {
 	case DW_OP_lit0 ... DW_OP_lit31:
@@ -1302,9 +1344,169 @@ static int parameter__multi_exprs(Dwarf_Op *expr, int loc_num)
 		return PARM_OPTIMIZED_OUT;
 	}
 
+	if (byte_size <= cu->addr_size || !cu->agg_use_two_regs) {
+		/* parameter_size <= cu->addr_size */
+		switch (expr[0].atom) {
+		case DW_OP_reg0 ... DW_OP_reg31:
+			if (loc_num != 0)
+				break;
+			*ret = expr[0].atom;
+			if (*ret == expected_reg)
+				return *ret;
+			break;
+		case DW_OP_breg0 ... DW_OP_breg31:
+			if (loc_num != 0)
+				break;
+			bool has_op_stack_value = false;
+			for (int i = 1; i < exprlen; i++) {
+				if (expr[i].atom == DW_OP_stack_value) {
+					has_op_stack_value = true;
+					break;
+				}
+			}
+			if (!has_op_stack_value)
+				break;
+			/* The existence of DW_OP_stack_value means that
+			 * DW_OP_bregX register is used as value.
+			 */
+			*ret = expr[0].atom - DW_OP_breg0 + DW_OP_reg0;
+			if (*ret == expected_reg)
+				return *ret;
+		}
+	} else {
+		/* cu->addr_size < parameter_size <= cu->addr_size * 2
+		 * first_half encodes field starts for the first register.
+		 * second_half encodes field starts for the second register.
+		 *
+		 * For example:
+		 *   loclist 1: DW_OP_reg5 RDI, DW_OP_piece 0x8, DW_OP_reg4 RSI, DW_OP_piece 0x1
+		 *   loclist 2: DW_OP_piece 0x8, DW_OP_reg4 RSI, DW_OP_piece 0x1
+		 *   loclist 3: DW_OP_piece 0x8, DW_OP_reg4 RSI, DW_OP_piece 0x1
+		 *
+		 * After iterating all the above three location lists (see PARM_CONTINUE below),
+		 * first_half encodes as 0x1 and second_half encodes as 0x1. The 'ret' value will
+		 * encode the first used register which is RDI. Each bit in first_half/second_half
+		 * represents a member field.
+		 *
+		 * Another example:
+		 *   loclist 1: DW_OP_reg5 RDI, DW_OP_piece 0x4
+		 *   loclist 2: DW_OP_piece 0x4, DW_OP_reg5 RDI, DW_OP_piece 0x4
+		 *
+		 * After iterating all the above two location lists, first_half encodes 0x11.
+		 * After loclist 1, first_half encoding is 0x1. After loclist 2, first_half encoding is 0x11.
+		 * second_half is 0. The 'ret' value is RDI.
+		 */
+		int off = 0;
+		for (int i = 0; i < exprlen; i++) {
+			if (expr[i].atom == DW_OP_piece) {
+				int num = expr[i].number;
+				if (i == 0) {
+					off = num;
+					continue;
+				}
+				if (off < cu->addr_size) (*first_half) |= (1 << off);
+				else (*second_half) |= (1 << (off - cu->addr_size));
+				off += num;
+			} else if (expr[i].atom >= DW_OP_reg0 && expr[i].atom <= DW_OP_reg31) {
+				if (off < cu->addr_size)
+					*ret = expr[i].atom;
+				else if (*ret < 0)
+					*ret = expr[i].atom;
+			}
+			/* FIXME: not handling DW_OP_bregX yet since we do not have
+			 * a use case for it yet for linux kernel.
+			 */
+		}
+	}
+
 	return PARM_CONTINUE;
 }
 
+/* The first_half and second_half, computed in parameter__multi_exprs(), are handled here. */
+static int parameter__handle_two_addr_len(int expected_reg, unsigned long first_half, unsigned long second_half,
+					  int ret, Dwarf_Die *die, struct conf_load *conf, struct cu *cu,
+					  struct parameter *parm, int param_idx, int reg_idx, int byte_size,
+					  struct func_info *info)
+{
+	if (!first_half && !second_half)
+		return ret;
+
+	if (ret != expected_reg)
+		return ret;
+
+	if (!conf->true_signature)
+		return PARM_DEFAULT_FAIL;
+
+	/* Both halves are used based on dwarf */
+	if (first_half && second_half)
+		return PARM_TWO_ADDR_LEN;
+
+	/* Only one half is used. Check if the next parameter's starting register
+	 * indicates the ABI still reserves the full register space for this
+	 * parameter. If so, the compiler only eliminated the dead half but the
+	 * register layout is preserved — keep the original source type.
+	 *
+	 * Use register_params[] array for the expected next register since
+	 * DW_OP_reg numbers are not necessarily sequential across architectures.
+	 */
+	if (param_idx + 1 < info->nr_params) {
+		int next_start = info->param_start_regs[param_idx + 1];
+
+		if (next_start >= 0) {
+			int num_regs = (byte_size + cu->addr_size - 1) / cu->addr_size;
+			int next_reg_idx = reg_idx + num_regs;
+
+			if (next_reg_idx < cu->nr_register_params &&
+			    next_start == cu->register_params[next_reg_idx])
+				return PARM_TWO_ADDR_LEN;
+		}
+	}
+
+	/* FIXME: parm->name may be NULL due to abstract origin. We do not want to
+	 * update abstract origin as the type in abstract origin may be used
+	 * in some other places. We could remove abstract origin in this parameter
+	 * and add name and type in parameter itself. Right now, for current bpf-next
+	 * repo, we do not have instances below where parm->name is NULL for x86_64 arch.
+	 */
+	if (!parm->name)
+		return PARM_TO_BE_IMPROVED;
+
+	/* FIXME: Only support single field now so we can have a good parameter name and
+	 * type for it. For more than one field, another option could be named as
+	 * <parameter_name>__first_half or <parameter_name>__second_half, but it is not
+	 * that intuitive.
+	 */
+	if (__builtin_popcountll(first_half) >= 2 || __builtin_popcountll(second_half) >= 2)
+		return PARM_TO_BE_IMPROVED;
+
+	int field_offset;
+	if (__builtin_popcountll(first_half) == 1)
+		field_offset = __builtin_ctzll(first_half);
+	else
+		field_offset = cu->addr_size + __builtin_ctzll(second_half);
+
+	/* FIXME: Only struct type is supported. */
+	Dwarf_Die member_die;
+	if (!get_member_with_offset(die, field_offset, &member_die))
+		return PARM_TO_BE_IMPROVED;
+
+	/* FIXME: cannot get a proper member_name, e.g. if the member type is a union. */
+	const char *member_name = attr_string(&member_die, DW_AT_name, conf);
+	if (!member_name)
+		return PARM_TO_BE_IMPROVED;
+
+	/* true_sig_member_name is the member name which will be used for later btf name
+	 * like <parameter_name>__<member_name>.
+	 */
+	parm->true_sig_member_name = member_name;
+
+	struct tag *tag = &parm->tag;
+	struct dwarf_tag *dtag = tag__dwarf(tag);
+	dwarf_tag__set_attr_type(dtag, type, &member_die, DW_AT_type);
+
+	return ret;
+}
+
 /* For DW_AT_location 'attr':
  * - if first location is DW_OP_regXX with expected number, return the register;
  *   otherwise save the register for later return
@@ -1313,15 +1515,18 @@ static int parameter__multi_exprs(Dwarf_Op *expr, int loc_num)
  * - otherwise if no register was found for locations, return PARM_DEFAULT_FAIL.
  */
 static int parameter__reg(Dwarf_Attribute *attr, int expected_reg, struct conf_load *conf,
-			  struct func_info *info)
+			  struct func_info *info, struct cu *cu, Dwarf_Die *die,
+			  struct parameter *parm, int param_idx, int reg_idx)
 {
 	Dwarf_Addr base, start, end;
 	Dwarf_Op *expr, *entry_ops;
 	Dwarf_Attribute entry_attr;
 	size_t exprlen, entry_len;
 	ptrdiff_t offset = 0;
+	int byte_size = 0;
 	int loc_num = -1;
 	int ret = PARM_DEFAULT_FAIL;
+	unsigned long first_half = 0, second_half = 0;
 
 	/* use libdw__lock as dwarf_getlocation(s) has concurrency issues
 	 * when libdw is not compiled with experimental --enable-thread-safety
@@ -1341,8 +1546,17 @@ static int parameter__reg(Dwarf_Attribute *attr, int expected_reg, struct conf_l
 			if (!info->signature_changed || !conf->true_signature)
 				continue;
 
+			if (!byte_size)
+				byte_size = get_type_byte_size(die, cu);
+			/* This should not happen. */
+			if (!byte_size) {
+				ret = PARM_UNEXPECTED;
+				goto out;
+			}
+
 			int res;
-			res = parameter__multi_exprs(expr, loc_num);
+			res = parameter__multi_exprs(expr, loc_num, cu, exprlen, die, expected_reg,
+						     byte_size, &first_half, &second_half, &ret);
 			if (res == PARM_CONTINUE)
 				continue;
 			ret = res;
@@ -1391,6 +1605,11 @@ static int parameter__reg(Dwarf_Attribute *attr, int expected_reg, struct conf_l
 			break;
 		}
 	}
+
+	ret = parameter__handle_two_addr_len(expected_reg, first_half, second_half,
+					     ret, die, conf, cu, parm, param_idx, reg_idx,
+					     byte_size, info);
+
 out:
 	pthread_mutex_unlock(&libdw__lock);
 	return ret;
@@ -1415,10 +1634,9 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 		if (!info->signature_changed) {
 			if (cu->producer_clang || param_idx >= cu->nr_register_params)
 				return parm;
+			reg_idx = param_idx;
 		} else {
 			reg_idx = param_idx - info->skip_idx;
-			if (reg_idx >= cu->nr_register_params)
-				return parm;
 		}
 
 		/* Parameters which use DW_AT_abstract_origin to point at
@@ -1459,15 +1677,23 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 		true_sig_enabled = conf->true_signature && info->signature_changed;
 
 		if (parm->has_loc) {
+			if (reg_idx >= cu->nr_register_params)
+				return parm;
+
 			int expected_reg = cu->register_params[reg_idx];
-			int actual_reg = parameter__reg(&attr, expected_reg, conf, info);
+			int actual_reg = parameter__reg(&attr, expected_reg, conf, info, cu, die,
+							parm, param_idx, reg_idx);
 
 			if (actual_reg == PARM_DEFAULT_FAIL) {
 				parm->optimized = 1;
 			} else if (actual_reg == PARM_OPTIMIZED_OUT) {
 				parm->optimized = 1;
 				info->skip_idx++;
-			} else if (actual_reg == PARM_UNEXPECTED || (expected_reg >= 0 && expected_reg != actual_reg)) {
+			} else if (actual_reg == PARM_TWO_ADDR_LEN) {
+				/* account for parameter with two registers */
+				info->skip_idx--;
+			} else if (actual_reg == PARM_UNEXPECTED || actual_reg == PARM_TO_BE_IMPROVED ||
+				   (expected_reg >= 0 && expected_reg != actual_reg)) {
 				/* mark parameters that use an unexpected
 				 * register to hold a parameter; these will
 				 * be problematic for users of BTF as they
diff --git a/dwarves.h b/dwarves.h
index 2d94dff..2fc937a 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -948,6 +948,7 @@ size_t lexblock__fprintf(const struct lexblock *lexblock, const struct cu *cu,
 struct parameter {
 	struct tag tag;
 	const char *name;
+	const char *true_sig_member_name;
 	uint8_t optimized:1;
 	uint8_t unexpected_reg:1;
 	uint8_t has_loc:1;
-- 
2.53.0-Meta


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH dwarves v5 10/11] btf_encoder: Handle optimized parameter properly
  2026-05-23 16:57 [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Yonghong Song
                   ` (8 preceding siblings ...)
  2026-05-23 16:58 ` [PATCH dwarves v5 09/11] dwarf_loader: Handle expression lists Yonghong Song
@ 2026-05-23 16:58 ` Yonghong Song
  2026-06-11  9:08   ` Alan Maguire
  2026-05-23 16:58 ` [PATCH dwarves v5 11/11] tests: Add a few clang true signature tests Yonghong Song
  2026-06-15 17:17 ` [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Alan Maguire
  11 siblings, 1 reply; 16+ messages in thread
From: Yonghong Song @ 2026-05-23 16:58 UTC (permalink / raw)
  To: Alan Maguire, Arnaldo Carvalho de Melo, dwarves
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, kernel-team

Skip optimized-out parameters so that BTF encodes proper true signatures.

In the first patch of this series, DW_CC_nocall filtering identifies 875
functions whose signatures changed. After the improvements in the subsequent
patches, only 18 functions remain. Unfortunately, these functions cannot be
converted to true signatures because of their parameter locations. For example,

0x0242f1f7:   DW_TAG_subprogram
                DW_AT_name      ("memblock_find_in_range")
                DW_AT_calling_convention        (DW_CC_nocall)
                DW_AT_type      (0x0242decc "phys_addr_t")
                ...

0x0242f22e:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x14a) loclist = 0x005595bc:
                     [0xffffffff87a000f9, 0xffffffff87a00178): DW_OP_reg5 RDI
                     [0xffffffff87a00178, 0xffffffff87a001be): DW_OP_reg14 R14
                     [0xffffffff87a001be, 0xffffffff87a001c7): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
                     [0xffffffff87a001c7, 0xffffffff87a00214): DW_OP_reg14 R14)
                  DW_AT_name    ("start")
                  DW_AT_type    (0x0242decc "phys_addr_t")
                  ...

0x0242f239:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x14b) loclist = 0x005595e6:
                     [0xffffffff87a000f9, 0xffffffff87a00175): DW_OP_reg4 RSI
                     [0xffffffff87a00175, 0xffffffff87a001b8): DW_OP_reg3 RBX
                     [0xffffffff87a001b8, 0xffffffff87a001c7): DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value
                     [0xffffffff87a001c7, 0xffffffff87a00214): DW_OP_reg3 RBX)
                  DW_AT_name    ("end")
                  DW_AT_type    (0x0242decc "phys_addr_t")
                  ...

0x0242f245:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x14c) loclist = 0x00559610:
                     [0xffffffff87a001e3, 0xffffffff87a001ef): DW_OP_breg4 RSI+0)
                  DW_AT_name    ("size")
                  DW_AT_type    (0x0242decc "phys_addr_t")
                  ...

0x0242f250:     DW_TAG_formal_parameter
                  DW_AT_const_value     (4096)
                  DW_AT_name    ("align")
                  DW_AT_type    (0x0242decc "phys_addr_t")
                  ...

The third parameter 'size' does not come from RDX. Hence, a true signature is not possible for this function.
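
The register-order check that fails for this function can be sketched in C
as follows. This is a simplified illustration under stated assumptions, not
pahole's actual code: 'struct parm_entry', 'x86_64_arg_regs' and
first_mismatched_parm() are hypothetical names, and each parameter is reduced
to a single entry register, whereas the real logic walks full location lists.
The register numbers are DWARF numbers: RDI=5 and RSI=4 appear in the dump
above, and RDX=1, RCX=2, R8=8, R9=9 follow the x86_64 psABI mapping.

```c
#include <assert.h>
#include <stddef.h>

/* DWARF register numbers of the x86_64 integer argument registers,
 * in ABI order: RDI, RSI, RDX, RCX, R8, R9.
 */
static const int x86_64_arg_regs[] = { 5, 4, 1, 2, 8, 9 };
#define NR_ARG_REGS (sizeof(x86_64_arg_regs) / sizeof(x86_64_arg_regs[0]))

/* Hypothetical per-parameter summary: the DWARF register the parameter's
 * location is based on at function entry, or -1 if the parameter was
 * optimized out (for example, it only has DW_AT_const_value).
 */
struct parm_entry {
	const char *name;
	int entry_reg;
};

/* Return the index of the first live parameter whose register is not the
 * next expected ABI register, or -1 if every live parameter matches.
 * Optimized-out parameters are skipped and do not consume a register.
 */
static int first_mismatched_parm(const struct parm_entry *parms, size_t nr)
{
	size_t reg_idx = 0;

	assert(parms != NULL);
	for (size_t i = 0; i < nr; i++) {
		if (parms[i].entry_reg < 0)
			continue;
		if (reg_idx >= NR_ARG_REGS ||
		    parms[i].entry_reg != x86_64_arg_regs[reg_idx])
			return (int)i;
		reg_idx++;
	}
	return -1;
}
```

Modelling the dump above as { start: 5, end: 4, size: 4, align: -1 }, where
'size' is represented by its RSI-based location, the function returns 2, the
index of 'size'.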

I also did some experiments on arm64. The number of signature-changed functions
is 863, of which 70 cannot be converted to true signatures. Comparing the DWARF
for x86_64 and arm64, the LLVM arm64 backend appears to be more relaxed in how
it computes parameter values for those signature-changed functions.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 btf_encoder.c | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index 633bc61..26be31d 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -1257,15 +1257,21 @@ static int32_t btf_encoder__save_func(struct btf_encoder *encoder, struct functi
 	struct btf *btf = encoder->btf;
 	struct llvm_annotation *annot;
 	struct parameter *param;
-	uint8_t param_idx = 0;
+	uint8_t param_idx = 0, skip_idx = 0;
 	int str_off, err = 0;
 
 	if (!state)
 		return -ENOMEM;
 
+	if (encoder->true_signature && encoder->cu->producer_clang) {
+		ftype__for_each_parameter(ftype, param) {
+			if (param->optimized) skip_idx++;
+		}
+	}
+
 	state->addr = function__addr(fn);
 	state->elf = func;
-	state->nr_parms = ftype->nr_parms + (ftype->unspec_parms ? 1 : 0);
+	state->nr_parms = ftype->nr_parms - skip_idx + (ftype->unspec_parms ? 1 : 0);
 	state->ret_type_id = ftype->tag.type == 0 ? 0 : encoder->type_id_off + ftype->tag.type;
 	if (state->nr_parms > 0) {
 		state->parms = zalloc(state->nr_parms * sizeof(*state->parms));
@@ -1297,14 +1303,34 @@ static int32_t btf_encoder__save_func(struct btf_encoder *encoder, struct functi
 	state->reordered_parm = ftype->reordered_parm;
 	ftype__for_each_parameter(ftype, param) {
 		const char *name;
+		char *final_name = NULL;
 
 		/* No location info/optimized + reordered means optimized out. */
 		if (ftype->reordered_parm && (!param->has_loc || param->optimized)) {
 			state->nr_parms--;
 			continue;
 		}
-		name = parameter__name(param) ?: "";
+		if (encoder->true_signature && encoder->cu->producer_clang && param->optimized)
+			continue;
+
+		name = parameter__name(param);
+		if (!name) {
+			name = "";
+		} else if (param->true_sig_member_name) {
+			/* Non-null param->true_sig_member_name indicates that the parameter
+			 * name is <parameter_name>__<field_name>.
+			 */
+			if (asprintf(&final_name, "%s__%s", name, param->true_sig_member_name) == -1) {
+				err = -ENOMEM;
+				goto out;
+			}
+			name = final_name;
+		}
+
 		str_off = btf__add_str(btf, name);
+		if (final_name)
+			free(final_name);
+
 		if (str_off < 0) {
 			err = str_off;
 			goto out;
-- 
2.53.0-Meta


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH dwarves v5 11/11] tests: Add a few clang true signature tests
  2026-05-23 16:57 [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Yonghong Song
                   ` (9 preceding siblings ...)
  2026-05-23 16:58 ` [PATCH dwarves v5 10/11] btf_encoder: Handle optimized parameter properly Yonghong Song
@ 2026-05-23 16:58 ` Yonghong Song
  2026-06-15 17:17 ` [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Alan Maguire
  11 siblings, 0 replies; 16+ messages in thread
From: Yonghong Song @ 2026-05-23 16:58 UTC (permalink / raw)
  To: Alan Maguire, Arnaldo Carvalho de Melo, dwarves
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, kernel-team

Three tests are added.

Test 1: VERBOSE=1 ./clang_parm_optimized.sh
  BTF:   int foo(int a, int c);
  DWARF: int foo(int a, int b, int c);
where parameter 'b' is unused.

Test 2: VERBOSE=1 ./clang_parm_optimized_stack.sh
  BTF:   int foo(int a, int i);
  DWARF: int foo(int a, int b, int c, int d, int e, int f, int g, int h, int i);
where parameters 'b' to 'h' are unused.

Test 3: VERBOSE=1 ./clang_parm_aggregate.sh
  BTF (x86_64):  long foo(long a__f1, struct t b, int i);
  BTF (aarch64): long foo(struct t a, struct t b, int i);
  DWARF:         long foo(struct t a, struct t b, int i);
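where 'struct t' is defined as 'struct t { long f1; long f2; };' and a.f2 is
not used in the function. On x86_64 the unused half is dropped from the BTF
signature; on aarch64 the BTF signature stays the same as the DWARF one.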
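
How a parameter such as 'a' can end up named 'a__f1' may be sketched in C as
follows. This is an illustration only, loosely modelled on the
first_half/second_half accumulation in the dwarf_loader patch of this series;
'struct loc_op', 'struct halves', accumulate_pieces(), single_field_offset()
and true_sig_name() are hypothetical names. The location expression used in
the usage note is an assumption about what clang emits for this test, not
captured output.

```c
#include <assert.h>
#include <stdio.h>

enum op_kind { OP_REG, OP_PIECE };

/* One simplified location operation: a register number or a piece size in bytes. */
struct loc_op {
	enum op_kind kind;
	int value;
};

/* One bit per field start offset, split at the address size. */
struct halves {
	unsigned long first;
	unsigned long second;
};

/* Accumulate field start offsets from one location expression.
 * A leading DW_OP_piece only advances the offset (those bytes are absent).
 */
static void accumulate_pieces(const struct loc_op *ops, int nr, int addr_size,
			      struct halves *h)
{
	int off = 0;

	assert(ops && h);
	for (int i = 0; i < nr; i++) {
		if (ops[i].kind != OP_PIECE)
			continue;
		if (i == 0) {
			off = ops[i].value;
			continue;
		}
		if (off < addr_size)
			h->first |= 1UL << off;
		else
			h->second |= 1UL << (off - addr_size);
		off += ops[i].value;
	}
}

/* Byte offset of the only live field, or -1 unless exactly one field is live. */
static int single_field_offset(const struct halves *h, int addr_size)
{
	if (__builtin_popcountl(h->first) + __builtin_popcountl(h->second) != 1)
		return -1;
	if (h->first)
		return __builtin_ctzl(h->first);
	return addr_size + __builtin_ctzl(h->second);
}

/* Build the "<parameter_name>__<member_name>" form used for the BTF name. */
static int true_sig_name(char *buf, size_t len, const char *parm, const char *member)
{
	return snprintf(buf, len, "%s__%s", parm, member);
}
```

Assuming 'a' is described by "DW_OP_reg5 RDI, DW_OP_piece 0x8" with an address
size of 8, accumulate_pieces() sets only bit 0 of the first half,
single_field_offset() yields byte offset 0, which is member f1, and
true_sig_name(buf, len, "a", "f1") writes "a__f1".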

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 tests/clang_parm_aggregate.sh       | 85 +++++++++++++++++++++++++++++
 tests/clang_parm_optimized.sh       | 63 +++++++++++++++++++++
 tests/clang_parm_optimized_stack.sh | 63 +++++++++++++++++++++
 3 files changed, 211 insertions(+)
 create mode 100755 tests/clang_parm_aggregate.sh
 create mode 100755 tests/clang_parm_optimized.sh
 create mode 100755 tests/clang_parm_optimized_stack.sh

diff --git a/tests/clang_parm_aggregate.sh b/tests/clang_parm_aggregate.sh
new file mode 100755
index 0000000..9502f8b
--- /dev/null
+++ b/tests/clang_parm_aggregate.sh
@@ -0,0 +1,85 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-only
+
+source test_lib.sh
+
+outdir=$(make_tmpdir)
+
+# Comment this out to save test data.
+trap cleanup EXIT
+
+title_log "Validation of BTF encoding of true_signatures."
+
+clang_true="${outdir}/clang_true"
+CC=$(which clang 2>/dev/null)
+
+if [[ -z "$CC" ]]; then
+	info_log "skip: clang not available"
+	test_skip
+fi
+
+cat > ${clang_true}.c << EOF
+struct t { long f1; long f2; };
+__attribute__((noinline)) static long foo(struct t a, struct t b, int i)
+{
+        return a.f1 + b.f1 + b.f2 + i;
+}
+
+struct t p1, p2;
+int i;
+int main()
+{
+        return (int)foo(p1, p2, i);
+}
+EOF
+
+CFLAGS="$CFLAGS -g -O2"
+${CC} ${CFLAGS} -o $clang_true ${clang_true}.c
+if [[ $? -ne 0 ]]; then
+	error_log "Could not compile ${clang_true}.c"
+	test_fail
+fi
+LLVM_OBJCOPY=objcopy pahole -J --btf_features=+true_signature $clang_true
+if [[ $? -ne 0 ]]; then
+	error_log "Could not encode BTF for $clang_true"
+	test_fail
+fi
+
+btf_optimized=$(pfunct --all --format_path=btf $clang_true |grep "foo")
+if [[ -z "$btf_optimized" ]]; then
+	info_log "skip: no optimizations applied."
+	test_skip
+fi
+
+btf_cmp=$btf_optimized
+dwarf=$(pfunct --all $clang_true |grep "foo")
+
+verbose_log "BTF: $btf_optimized  DWARF: $dwarf"
+
+arch=$(uname -m)
+
+if [[ "$arch" == "x86_64" ]]; then
+	# On x86_64, clang emits DW_CC_nocall for optimized functions,
+	# so pahole should detect the optimization and produce a
+	# different BTF signature.
+	if [[ "$btf_cmp" == "$dwarf" ]]; then
+		error_log "BTF and DWARF signatures should be different and they are not: BTF: $btf_optimized ; DWARF $dwarf"
+		test_fail
+	fi
+elif [[ "$arch" == "aarch64" ]]; then
+	# On arm64, clang does not emit DW_CC_nocall, so pahole cannot
+	# detect the optimization. BTF and DWARF signatures are expected
+	# to be the same.
+	if [[ "$btf_cmp" != "$dwarf" ]]; then
+		error_log "On arm64, BTF and DWARF signatures should be the same but they are not: BTF: $btf_optimized ; DWARF $dwarf"
+		test_fail
+	fi
+else
+	# On other architectures, skip if we cannot determine the
+	# expected behavior.
+	if [[ "$btf_cmp" == "$dwarf" ]]; then
+		info_log "skip: no optimization detected on $arch"
+		test_skip
+	fi
+fi
+test_pass
diff --git a/tests/clang_parm_optimized.sh b/tests/clang_parm_optimized.sh
new file mode 100755
index 0000000..81d50af
--- /dev/null
+++ b/tests/clang_parm_optimized.sh
@@ -0,0 +1,63 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-only
+
+source test_lib.sh
+
+outdir=$(make_tmpdir)
+
+# Comment this out to save test data.
+trap cleanup EXIT
+
+title_log "Validation of BTF encoding of true_signatures."
+
+clang_true="${outdir}/clang_true"
+CC=$(which clang 2>/dev/null)
+
+if [[ -z "$CC" ]]; then
+	info_log "skip: clang not available"
+	test_skip
+fi
+
+cat > ${clang_true}.c << EOF
+__attribute__((noinline)) static int foo(int a, int b, int c)
+{
+	return a * c - a - c;
+}
+
+int a, b, c;
+int main()
+{
+	return foo(a, b, c);
+}
+EOF
+
+CFLAGS="$CFLAGS -g -O2"
+${CC} ${CFLAGS} -o $clang_true ${clang_true}.c
+if [[ $? -ne 0 ]]; then
+	error_log "Could not compile ${clang_true}.c"
+	test_fail
+fi
+LLVM_OBJCOPY=objcopy pahole -J --btf_features=+true_signature $clang_true
+if [[ $? -ne 0 ]]; then
+	error_log "Could not encode BTF for $clang_true"
+	test_fail
+fi
+
+btf_optimized=$(pfunct --all --format_path=btf $clang_true |grep "foo")
+if [[ -z "$btf_optimized" ]]; then
+	info_log "skip: no optimizations applied."
+	test_skip
+fi
+
+btf_cmp=$btf_optimized
+dwarf=$(pfunct --all $clang_true |grep "foo")
+
+if [[ -n "$VERBOSE" ]]; then
+	printf "   BTF: %s  DWARF: %s\n" "$btf_optimized" "$dwarf"
+fi
+
+if [[ "$btf_cmp" == "$dwarf" ]]; then
+	error_log "BTF and DWARF signatures should be different and they are not: BTF: $btf_optimized ; DWARF $dwarf"
+	test_fail
+fi
+test_pass
diff --git a/tests/clang_parm_optimized_stack.sh b/tests/clang_parm_optimized_stack.sh
new file mode 100755
index 0000000..afdc355
--- /dev/null
+++ b/tests/clang_parm_optimized_stack.sh
@@ -0,0 +1,63 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-only
+
+source test_lib.sh
+
+outdir=$(make_tmpdir)
+
+# Comment this out to save test data.
+trap cleanup EXIT
+
+title_log "Validation of BTF encoding of true_signatures."
+
+clang_true="${outdir}/clang_true"
+CC=$(which clang 2>/dev/null)
+
+if [[ -z "$CC" ]]; then
+	info_log "skip: clang not available"
+	test_skip
+fi
+
+cat > ${clang_true}.c << EOF
+__attribute__((noinline)) static int foo(int a, int b, int c, int d, int e, int f, int g, int h, int i)
+{
+        return a * i - a - i;
+}
+
+int a, b, c, d, e, f, g, h, i;
+int main()
+{
+        return foo(a, b, c, d, e, f, g, h, i);
+}
+EOF
+
+CFLAGS="$CFLAGS -g -O2"
+${CC} ${CFLAGS} -o $clang_true ${clang_true}.c
+if [[ $? -ne 0 ]]; then
+	error_log "Could not compile ${clang_true}.c"
+	test_fail
+fi
+LLVM_OBJCOPY=objcopy pahole -J --btf_features=+true_signature $clang_true
+if [[ $? -ne 0 ]]; then
+	error_log "Could not encode BTF for $clang_true"
+	test_fail
+fi
+
+btf_optimized=$(pfunct --all --format_path=btf $clang_true |grep "foo")
+if [[ -z "$btf_optimized" ]]; then
+	info_log "skip: no optimizations applied."
+	test_skip
+fi
+
+btf_cmp=$btf_optimized
+dwarf=$(pfunct --all $clang_true |grep "foo")
+
+if [[ -n "$VERBOSE" ]]; then
+	printf "   BTF: %s  DWARF: %s\n" "$btf_optimized" "$dwarf"
+fi
+
+if [[ "$btf_cmp" == "$dwarf" ]]; then
+	error_log "BTF and DWARF signatures should be different and they are not: BTF: $btf_optimized ; DWARF $dwarf"
+	test_fail
+fi
+test_pass
-- 
2.53.0-Meta


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH dwarves v5 10/11] btf_encoder: Handle optimized parameter properly
  2026-05-23 16:58 ` [PATCH dwarves v5 10/11] btf_encoder: Handle optimized parameter properly Yonghong Song
@ 2026-06-11  9:08   ` Alan Maguire
  0 siblings, 0 replies; 16+ messages in thread
From: Alan Maguire @ 2026-06-11  9:08 UTC (permalink / raw)
  To: Yonghong Song, Arnaldo Carvalho de Melo, dwarves
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, kernel-team

On 23/05/2026 17:58, Yonghong Song wrote:
> Ensure to skip optimized parameter so btf can generate
> proper true signatures.
> 
> In the first patch of the patch set, with DW_CC_nocall filtering, 875 functions
> have signature changed. With a series of improvement, eventually only 18 functions
> remain and unfortunately these functions cannot be converted to true signatures
> due to locations. For example,
> 
> 0x0242f1f7:   DW_TAG_subprogram
>                 DW_AT_name      ("memblock_find_in_range")
>                 DW_AT_calling_convention        (DW_CC_nocall)
>                 DW_AT_type      (0x0242decc "phys_addr_t")
>                 ...
> 
> 0x0242f22e:     DW_TAG_formal_parameter
>                   DW_AT_location        (indexed (0x14a) loclist = 0x005595bc:
>                      [0xffffffff87a000f9, 0xffffffff87a00178): DW_OP_reg5 RDI
>                      [0xffffffff87a00178, 0xffffffff87a001be): DW_OP_reg14 R14
>                      [0xffffffff87a001be, 0xffffffff87a001c7): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
>                      [0xffffffff87a001c7, 0xffffffff87a00214): DW_OP_reg14 R14)
>                   DW_AT_name    ("start")
>                   DW_AT_type    (0x0242decc "phys_addr_t")
>                   ...
> 
> 0x0242f239:     DW_TAG_formal_parameter
>                   DW_AT_location        (indexed (0x14b) loclist = 0x005595e6:
>                      [0xffffffff87a000f9, 0xffffffff87a00175): DW_OP_reg4 RSI
>                      [0xffffffff87a00175, 0xffffffff87a001b8): DW_OP_reg3 RBX
>                      [0xffffffff87a001b8, 0xffffffff87a001c7): DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value
>                      [0xffffffff87a001c7, 0xffffffff87a00214): DW_OP_reg3 RBX)
>                   DW_AT_name    ("end")
>                   DW_AT_type    (0x0242decc "phys_addr_t")
>                   ...
> 
> 0x0242f245:     DW_TAG_formal_parameter
>                   DW_AT_location        (indexed (0x14c) loclist = 0x00559610:
>                      [0xffffffff87a001e3, 0xffffffff87a001ef): DW_OP_breg4 RSI+0)
>                   DW_AT_name    ("size")
>                   DW_AT_type    (0x0242decc "phys_addr_t")
>                   ...
> 
> 0x0242f250:     DW_TAG_formal_parameter
>                   DW_AT_const_value     (4096)
>                   DW_AT_name    ("align")
>                   DW_AT_type    (0x0242decc "phys_addr_t")
>                   ...
> 
> The third parameter 'size' is not from RDX. Hence, true signature is not possible for this function.
> 
> I also did some experiments on arm64. The number of signature-changed functions
> is 863 and finally there are 70 functions cannot be converted to true signatures.
> Through dwarf comparison of x86_64 vs. arm64, llvm arm64 backend looks like having
> more relaxation to compute parameter values for those signature-changed functions.
> 
> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
> ---
>  btf_encoder.c | 32 +++++++++++++++++++++++++++++---
>  1 file changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/btf_encoder.c b/btf_encoder.c
> index 633bc61..26be31d 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -1257,15 +1257,21 @@ static int32_t btf_encoder__save_func(struct btf_encoder *encoder, struct functi
>  	struct btf *btf = encoder->btf;
>  	struct llvm_annotation *annot;
>  	struct parameter *param;
> -	uint8_t param_idx = 0;
> +	uint8_t param_idx = 0, skip_idx = 0;
>  	int str_off, err = 0;
>  
>  	if (!state)
>  		return -ENOMEM;
>  
> +	if (encoder->true_signature && encoder->cu->producer_clang) {
> +		ftype__for_each_parameter(ftype, param) {
> +			if (param->optimized) skip_idx++;
> +		}
> +	}
> +

The logic here is a bit confusing (to me at least). Later on, in the loop below,
we subtract param->optimized parameters from the state->nr_parms count for
the ftype->reordered_parm case. Why not just do the same for the clang
optimized-out case below, i.e.
 
>  	state->addr = function__addr(fn);
>  	state->elf = func;
> -	state->nr_parms = ftype->nr_parms + (ftype->unspec_parms ? 1 : 0);
> +	state->nr_parms = ftype->nr_parms - skip_idx + (ftype->unspec_parms ? 1 : 0);
>  	state->ret_type_id = ftype->tag.type == 0 ? 0 : encoder->type_id_off + ftype->tag.type;
>  	if (state->nr_parms > 0) {
>  		state->parms = zalloc(state->nr_parms * sizeof(*state->parms));
> @@ -1297,14 +1303,34 @@ static int32_t btf_encoder__save_func(struct btf_encoder *encoder, struct functi
>  	state->reordered_parm = ftype->reordered_parm;
>  	ftype__for_each_parameter(ftype, param) {
>  		const char *name;
> +		char *final_name = NULL;
>  
>  		/* No location info/optimized + reordered means optimized out. */
>  		if (ftype->reordered_parm && (!param->has_loc || param->optimized)) {
>  			state->nr_parms--;
>  			continue;
>  		}
> -		name = parameter__name(param) ?: "";
> +		if (encoder->true_signature && encoder->cu->producer_clang && param->optimized)

just add a "state->nr_parms--;" here and get rid of skip_idx.

> +			continue;
> +
> +		name = parameter__name(param);
> +		if (!name) {
> +			name = "";
do we see more parameter DIEs without parameter names for the nocall cases?


> +		} else if (param->true_sig_member_name) {
> +			/* Non-null param->true_sig_member_name indicates that the parameter
> +			 * name is <parameter_name>__<field_name>.
> +			 */
> +			if (asprintf(&final_name, "%s__%s", name, param->true_sig_member_name) == -1) {
> +				err = -ENOMEM;
> +				goto out;
> +			}
> +			name = final_name;
> +		}
> +
>  		str_off = btf__add_str(btf, name);
> +		if (final_name)
> +			free(final_name);
> +
>  		if (str_off < 0) {
>  			err = str_off;
>  			goto out;
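
The suggestion above, decrementing the count inside the loop instead of
pre-counting skip_idx, can be sketched in C as follows. This is a standalone
illustration, not a patch against btf_encoder.c: 'struct sketch_parm' and
collect_live_parms() are hypothetical stand-ins for 'struct parameter' and
the body of btf_encoder__save_func().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of a parameter. */
struct sketch_parm {
	const char *name;
	bool optimized;
};

/* Copy the names of surviving parameters into out[] and return how many
 * parameters remain. The count starts at the source-level number and is
 * decremented in the loop for each optimized-out parameter, so no separate
 * counting pass is needed. out[] must have room for nr entries.
 */
static size_t collect_live_parms(const struct sketch_parm *parms, size_t nr,
				 bool true_signature, const char **out)
{
	size_t nr_parms = nr;
	size_t idx = 0;

	assert(parms && out);
	for (size_t i = 0; i < nr; i++) {
		if (true_signature && parms[i].optimized) {
			nr_parms--;
			continue;
		}
		/* An unnamed parameter gets an empty name. */
		out[idx++] = parms[i].name ? parms[i].name : "";
	}
	return nr_parms;
}
```

For parameters { a, b (optimized), c } with true_signature enabled, this
returns 2 and fills out[] with "a" and "c". One trade-off, assuming the
parameter array is still allocated before the loop as in the quoted hunk: it
would be sized for the source-level count, as already happens in the
reordered_parm path.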


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH dwarves v5 01/11] dwarf_loader: Reduce parameter checking with clang DW_AT_calling_convention attr
  2026-05-23 16:57 ` [PATCH dwarves v5 01/11] dwarf_loader: Reduce parameter checking with clang DW_AT_calling_convention attr Yonghong Song
@ 2026-06-11  9:15   ` Alan Maguire
  0 siblings, 0 replies; 16+ messages in thread
From: Alan Maguire @ 2026-06-11  9:15 UTC (permalink / raw)
  To: Yonghong Song, Arnaldo Carvalho de Melo, dwarves
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, kernel-team

On 23/05/2026 17:57, Yonghong Song wrote:
> Currently every function is checked for its parameters to identify whether
> the signature changed or not. If signature indeed changed, pahole may do
> some adjustment for parameters for true signatures.
> 
> In clang, any function with the following attribute
>       DW_AT_calling_convention        (DW_CC_nocall)
> indicates this function having signature changed.
> pahole can take advantage of this to avoid parameter checking if
> DW_AT_calling_convention is not DW_CC_nocall.
> 
> But more importantly, DW_CC_nocall can identify signature-changed functions
> and parameters can be checked one-after-another to create the true
> signatures. Otherwise, it takes more effort to identify whether a
> function has signature changed or not. For example, for function
>   __bpf_kfunc static void bbr_main(struct sock *sk, u32 ack, int flag,
>      const struct rate_sample *rs) { ... }
> and bbr_main() is a callback function in
>   .cong_control   = bbr_main
> in 'struct tcp_congestion_ops tcp_bbr_cong_ops'.
> In the above bbr_main(...), parameter 'ack' and 'flag' are not used.
> The following are some details:
> 
> 0x0a713b8d:     DW_TAG_formal_parameter
>                   DW_AT_location        (indexed (0x28) loclist = 0x0166d452:
>                      [0xffffffff83e77fd9, 0xffffffff83e78016): DW_OP_reg5 RDI
>                      ...
>                   DW_AT_name    ("sk")
>                   DW_AT_type    (0x0a6f5b2b "sock *")
>                   ...
> 
> 0x0a713b98:     DW_TAG_formal_parameter
>                   DW_AT_name    ("ack")
>                   DW_AT_type    (0x0a6f58fd "u32")
>                   ...
> 
> 0x0a713ba2:     DW_TAG_formal_parameter
>                   DW_AT_name    ("flag")
>                   DW_AT_type    (0x0a6f57d1 "int")
>                   ...
> 
> 0x0a713bac:     DW_TAG_formal_parameter
>                   DW_AT_location        (indexed (0x29) loclist = 0x0166d4a8:
>                      [0xffffffff83e77fd9, 0xffffffff83e78016): DW_OP_reg2 RCX
>                      ...
>                   DW_AT_name    ("rs")
>                   DW_AT_type    (0x0a710da5 "const rate_sample *")
> 
> Some analysis for the above dwarf can conclude that the 'ack' and 'flag'
> may be related to RSI and RDX, considering the last one is RCX. Basically this
> requires all parameters are available to collectively decide whether the
> true signature can be found or not. In such case, DW_CC_nocall can make things
> easier as parameter can be checked one after another.
> 
> For a clang built bpf-next kernel with x86_64, in non-LTO setup,
> the number of kernel functions is 69103 and the number of signature changed
> functions is 875, based on
>       DW_AT_calling_convention        (DW_CC_nocall)
> indication.
> 
> Among 875 signature changed functions, after this patch, 343 functions
> can have proper true signatures, mostly due to simple dead argument
> elimination. The number of remaining functions, which cannot get the
> true signature, is 532 due to dead or additional-checked parameters.
> 
> They will be addressed in the subsequent commits.
> 
> In llvm23, I implemented [1] which added DW_CC_nocall for ArgumentPromotion pass.
> This compiler pass can add additional DW_CC_nocall cases for the following
> compilation:
>       - Flag -O3 or FullLTO
> So once llvm23 available, we may have more DW_CC_nocall cases, hence more
> potential true signatures if the kernel is built with -O3 or
> with FullLTO (CONFIG_LTO_CLANG_FULL).
> 
>   [1] https://github.com/llvm/llvm-project/pull/178973
> 
> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
> ---
>  dwarf_loader.c | 86 ++++++++++++++++++++++++++++++++++++++++++--------
>  dwarves.h      |  1 +
>  2 files changed, 73 insertions(+), 14 deletions(-)
> 
> diff --git a/dwarf_loader.c b/dwarf_loader.c
> index 16fb7be..0bc4fc4 100644
> --- a/dwarf_loader.c
> +++ b/dwarf_loader.c
> @@ -1190,6 +1190,10 @@ static ptrdiff_t __dwarf_getlocations(Dwarf_Attribute *attr,
>  	return ret;
>  }
>  
> +struct func_info {
> +	bool signature_changed;
> +};


Looking at the code, I wonder if we could simplify by passing in the ftype instead.

parameter__new() is called via

formal_parameter_pack__new
	formal_parameter_pack__load_params

and 

die__create_new_parameter

In both cases we can access the ftype (though in the case of the formal parameter
pack codepath we'd need to pass it through formal_parameter_pack__new()). It feels like
it might be a cleaner design to do that and set a signature_changed bitfield for
the ftype, since that's where we usually set attributes that apply to the signature.
What do you think?

The other aspects added in later patches would, I think, work there too. For example,
the accumulation of parameters would likely be a bit easier, as we could better
handle the skip_idx logic, which is tricky. On that topic, could we just add a
next_reg_idx directly to the ftype (or to the func_info if we keep it)? The goal
of skip_idx seems to be to figure out what the next expected register index from
the calling conventions should be.
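
To sketch what I mean (a rough illustration only, not code from the series: the
sketch_* types, the enum and the helper name are all made up here, and registers
are plain DWARF register numbers rather than DW_OP_regN atoms), the ftype could
carry both the flag and the next expected ABI register slot, so each parameter
can be checked as it is loaded:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for pahole's struct cu and struct ftype. */
#define SKETCH_MAX_REG_PARAMS 6

struct sketch_cu {
	int nr_register_params;
	/* DWARF register numbers in ABI parameter order (assumption of this sketch) */
	int register_params[SKETCH_MAX_REG_PARAMS];
};

struct sketch_ftype {
	unsigned int signature_changed:1;	/* e.g. derived from DW_CC_nocall */
	unsigned int next_reg_idx:4;		/* next expected slot in register_params[] */
};

enum sketch_parm_state {
	SKETCH_PARM_UNCHECKED,		/* signature unchanged, or past the register params */
	SKETCH_PARM_IN_EXPECTED_REG,
	SKETCH_PARM_OPTIMIZED,		/* no register location: consumes no ABI register */
	SKETCH_PARM_UNEXPECTED_REG,
};

/* Classify one parameter given the register seen in its location (-1 if none). */
static enum sketch_parm_state
sketch_ftype__check_param(struct sketch_ftype *ftype, const struct sketch_cu *cu,
			  int observed_reg)
{
	assert(ftype != NULL && cu != NULL);

	if (!ftype->signature_changed)
		return SKETCH_PARM_UNCHECKED;
	if (observed_reg < 0)
		return SKETCH_PARM_OPTIMIZED;
	if (ftype->next_reg_idx >= cu->nr_register_params)
		return SKETCH_PARM_UNCHECKED;
	if (observed_reg != cu->register_params[ftype->next_reg_idx])
		return SKETCH_PARM_UNEXPECTED_REG;

	/* Only a parameter found in its expected register advances the index. */
	ftype->next_reg_idx++;
	return SKETCH_PARM_IN_EXPECTED_REG;
}
```

With the x86_64 order RDI, RSI, RDX, RCX, R8, R9 given as DWARF numbers
{5, 4, 1, 2, 8, 9}, and assuming the flag were set, feeding in the bbr_main()
parameters from the commit message (RDI, none, none, RCX) would report 'rs' as
being in an unexpected register, since RSI is the next expected one.
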
> +
>  /* For DW_AT_location 'attr':
>   * - if first location is DW_OP_regXX with expected number, return the register;
>   *   otherwise save the register for later return
> @@ -1252,7 +1256,8 @@ out:
>  }
>  
>  static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
> -					struct conf_load *conf, int param_idx)
> +					struct conf_load *conf, int param_idx,
> +					struct func_info *info)
>  {
>  	struct parameter *parm = tag__alloc(cu, sizeof(*parm));
>  
> @@ -1263,8 +1268,15 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
>  		tag__init(&parm->tag, cu, die);
>  		parm->name = attr_string(die, DW_AT_name, conf);
>  		parm->idx = param_idx;
> -		if (param_idx >= cu->nr_register_params || param_idx < 0)
> +		if (param_idx < 0)
>  			return parm;
> +		if (!info->signature_changed) {
> +			if (cu->producer_clang || param_idx >= cu->nr_register_params)
> +				return parm;
> +		} else if (param_idx >= cu->nr_register_params) {
> +			return parm;
> +		}
> +
>  		/* Parameters which use DW_AT_abstract_origin to point at
>  		 * the original parameter definition (with no name in the DIE)
>  		 * are the result of later DWARF generation during compilation
> @@ -1337,7 +1349,7 @@ static int formal_parameter_pack__load_params(struct formal_parameter_pack *pack
>  			continue;
>  		}
>  
> -		struct parameter *param = parameter__new(die, cu, conf, -1);
> +		struct parameter *param = parameter__new(die, cu, conf, -1, NULL);
>  
>  		if (param == NULL)
>  			return -1;
> @@ -1502,6 +1514,29 @@ static struct ftype *ftype__new(Dwarf_Die *die, struct cu *cu)
>  	return ftype;
>  }
>  
> +static bool function__signature_changed(struct function *func, Dwarf_Die *die)
> +{
> +	/* The inlined DW_TAG_subprogram typically has the original source type for
> +	 * abstract origin of a concrete function with address range, inlined subroutine,
> +	 * or call site.
> +	 */
> +	if (func->inlined)
> +		return false;
> +
> +	if (!func->abstract_origin)
> +		return attr_numeric(die, DW_AT_calling_convention) == DW_CC_nocall;
> +
> +	Dwarf_Attribute attr;
> +	if (dwarf_attr(die, DW_AT_abstract_origin, &attr)) {
> +		Dwarf_Die origin;
> +		if (dwarf_formref_die(&attr, &origin))
> +			return attr_numeric(&origin, DW_AT_calling_convention) == DW_CC_nocall;
> +	}
> +
> +	/* This should not happen */
> +	return false;
> +}
> +
>  static struct function *function__new(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
>  {
>  	struct function *func = tag__alloc(cu, sizeof(*func));
> @@ -1800,9 +1835,9 @@ static struct tag *die__create_new_parameter(Dwarf_Die *die,
>  					     struct ftype *ftype,
>  					     struct lexblock *lexblock,
>  					     struct cu *cu, struct conf_load *conf,
> -					     int param_idx)
> +					     int param_idx, struct func_info *info)
>  {
> -	struct parameter *parm = parameter__new(die, cu, conf, param_idx);
> +	struct parameter *parm = parameter__new(die, cu, conf, param_idx, info);
>  
>  	if (parm == NULL)
>  		return NULL;
> @@ -1889,7 +1924,7 @@ static struct tag *die__create_new_subroutine_type(Dwarf_Die *die,
>  			tag__print_not_supported(die);
>  			continue;
>  		case DW_TAG_formal_parameter:
> -			tag = die__create_new_parameter(die, ftype, NULL, cu, conf, -1);
> +			tag = die__create_new_parameter(die, ftype, NULL, cu, conf, -1, NULL);
>  			break;
>  		case DW_TAG_unspecified_parameters:
>  			ftype->unspec_parms = 1;
> @@ -2118,7 +2153,8 @@ out_enomem:
>  }
>  
>  static int die__process_function(Dwarf_Die *die, struct ftype *ftype,
> -				  struct lexblock *lexblock, struct cu *cu, struct conf_load *conf);
> +				 struct lexblock *lexblock, struct cu *cu, struct conf_load *conf,
> +				 struct func_info *info);
>  
>  static int die__create_new_lexblock(Dwarf_Die *die,
>  				    struct cu *cu, struct lexblock *father, struct conf_load *conf)
> @@ -2126,7 +2162,7 @@ static int die__create_new_lexblock(Dwarf_Die *die,
>  	struct lexblock *lexblock = lexblock__new(die, cu);
>  
>  	if (lexblock != NULL) {
> -		if (die__process_function(die, NULL, lexblock, cu, conf) != 0)
> +		if (die__process_function(die, NULL, lexblock, cu, conf, NULL) != 0)
>  			goto out_delete;
>  	}
>  	if (father != NULL)
> @@ -2246,7 +2282,8 @@ static struct tag *die__create_new_inline_expansion(Dwarf_Die *die,
>  }
>  
>  static int die__process_function(Dwarf_Die *die, struct ftype *ftype,
> -				 struct lexblock *lexblock, struct cu *cu, struct conf_load *conf)
> +				 struct lexblock *lexblock, struct cu *cu, struct conf_load *conf,
> +				 struct func_info *info)
>  {
>  	int param_idx = 0;
>  	Dwarf_Die child;
> @@ -2320,7 +2357,7 @@ static int die__process_function(Dwarf_Die *die, struct ftype *ftype,
>  			continue;
>  		}
>  		case DW_TAG_formal_parameter:
> -			tag = die__create_new_parameter(die, ftype, lexblock, cu, conf, param_idx++);
> +			tag = die__create_new_parameter(die, ftype, lexblock, cu, conf, param_idx++, info);
>  			break;
>  		case DW_TAG_variable:
>  			tag = die__create_new_variable(die, cu, conf, 0);
> @@ -2391,11 +2428,19 @@ out_enomem:
>  static struct tag *die__create_new_function(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
>  {
>  	struct function *function = function__new(die, cu, conf);
> +	struct func_info info = {};
>  
> -	if (function != NULL &&
> -	    die__process_function(die, &function->proto, &function->lexblock, cu, conf) != 0) {
> -		function__delete(function, cu);
> -		function = NULL;
> +	if (function != NULL) {
> +		/* For clang, we determine if function signature changes via DW_AT_calling_convention
> +		 * set to DW_CC_nocall.
> +		 */
> +		if (cu->producer_clang)
> +			info.signature_changed = function__signature_changed(function, die);
> +
> +		if (die__process_function(die, &function->proto, &function->lexblock, cu, conf, &info) != 0) {
> +			function__delete(function, cu);
> +			function = NULL;
> +		}
>  	}
>  
>  	return function ? &function->proto.tag : NULL;
> @@ -3045,6 +3090,17 @@ static unsigned long long dwarf_tag__orig_id(const struct tag *tag,
>  	return cu->extra_dbg_info ? dtag->id : 0;
>  }
>  
> +static bool attr_producer_clang(Dwarf_Die *die)
> +{
> +	const char *producer;
> +
> +	producer = attr_string(die, DW_AT_producer, NULL);
> +	if (!producer)
> +		return false;
> +
> +	return !!strstr(producer, "clang");
> +}
> +
>  struct debug_fmt_ops dwarf__ops;
>  
>  static int die__process(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
> @@ -3082,6 +3138,7 @@ static int die__process(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
>  	}
>  
>  	cu->language = attr_numeric(die, DW_AT_language);
> +	cu->producer_clang = attr_producer_clang(die);
>  
>  	if (conf->early_cu_filter)
>  		cu = conf->early_cu_filter(cu);
> @@ -3841,6 +3898,7 @@ static int cus__merge_and_process_cu(struct cus *cus, struct conf_load *conf,
>  			cu->priv = dcu;
>  			cu->dfops = &dwarf__ops;
>  			cu->language = attr_numeric(cu_die, DW_AT_language);
> +			cu->producer_clang = attr_producer_clang(cu_die);
>  			cus__add(cus, cu);
>  		}
>  
> diff --git a/dwarves.h b/dwarves.h
> index 5ec16e7..b49e651 100644
> --- a/dwarves.h
> +++ b/dwarves.h
> @@ -306,6 +306,7 @@ struct cu {
>  	uint8_t		 has_addr_info:1;
>  	uint8_t		 uses_global_strings:1;
>  	uint8_t		 little_endian:1;
> +	uint8_t		 producer_clang:1;
>  	uint8_t		 nr_register_params;
>  	int		 register_params[ARCH_MAX_REGISTER_PARAMS];
>  	int		 functions_saved;



* Re: [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF
  2026-05-23 16:57 [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Yonghong Song
                   ` (10 preceding siblings ...)
  2026-05-23 16:58 ` [PATCH dwarves v5 11/11] tests: Add a few clang true signature tests Yonghong Song
@ 2026-06-15 17:17 ` Alan Maguire
  2026-06-16  4:06   ` Yonghong Song
  11 siblings, 1 reply; 16+ messages in thread
From: Alan Maguire @ 2026-06-15 17:17 UTC (permalink / raw)
  To: Yonghong Song, Arnaldo Carvalho de Melo, dwarves
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, kernel-team

[-- Attachment #1: Type: text/plain, Size: 7659 bytes --]

On 23/05/2026 17:57, Yonghong Song wrote:
> Current vmlinux BTF encoding is based on the source level signatures.
> But the compiler may do some optimization and changed the signature.
> If the user tried with source level signature, their initial implementation
> may have wrong results and then the user need to check what is the
> problem and work around it, e.g. through kprobe since kprobe does not
> need vmlinux BTF.
> 
> Majority of changed signatures are due to dead argument elimination.
> The following is a more complex one. The original source signature:
>   typedef struct {
>         union {
>                 void            *kernel;
>                 void __user     *user;
>         };
>         bool            is_kernel : 1;
>   } sockptr_t;
>   typedef sockptr_t bpfptr_t;
>   static int map_create(union bpf_attr *attr, bpfptr_t uattr) { ... }
> After compiler optimization, the signature becomes:
>   static int map_create(union bpf_attr *attr, bool uattr__is_kernel) { ... }
> In the above, uattr__is_kernel corresponds to 'is_kernel' field in sockptr_t.
> This makes it easier for developers to understand what changed.
> 
> The new signature needs to properly follow ABI specification based on
> locations. Otherwise, that signature should be discarded. For example,
> 
>     0x0242f1f7:   DW_TAG_subprogram
>                     DW_AT_name      ("memblock_find_in_range")
>                     DW_AT_calling_convention        (DW_CC_nocall)
>                     DW_AT_type      (0x0242decc "phys_addr_t")
>                     ...
>     0x0242f22e:     DW_TAG_formal_parameter
>                       DW_AT_location        (indexed (0x14a) loclist = 0x005595bc:
>                          [0xffffffff87a000f9, 0xffffffff87a00178): DW_OP_reg5 RDI
>                          [0xffffffff87a00178, 0xffffffff87a001be): DW_OP_reg14 R14
>                          [0xffffffff87a001be, 0xffffffff87a001c7): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
>                          [0xffffffff87a001c7, 0xffffffff87a00214): DW_OP_reg14 R14)
>                       DW_AT_name    ("start")
>                       DW_AT_type    (0x0242decc "phys_addr_t")
>                       ...
>     0x0242f239:     DW_TAG_formal_parameter
>                       DW_AT_location        (indexed (0x14b) loclist = 0x005595e6:
>                          [0xffffffff87a000f9, 0xffffffff87a00175): DW_OP_reg4 RSI
>                          [0xffffffff87a00175, 0xffffffff87a001b8): DW_OP_reg3 RBX
>                          [0xffffffff87a001b8, 0xffffffff87a001c7): DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value
>                          [0xffffffff87a001c7, 0xffffffff87a00214): DW_OP_reg3 RBX)
>                       DW_AT_name    ("end")
>                       DW_AT_type    (0x0242decc "phys_addr_t")
>                       ...
>     0x0242f245:     DW_TAG_formal_parameter
>                       DW_AT_location        (indexed (0x14c) loclist = 0x00559610:
>                          [0xffffffff87a001e3, 0xffffffff87a001ef): DW_OP_breg4 RSI+0)
>                       DW_AT_name    ("size")
>                       DW_AT_type    (0x0242decc "phys_addr_t")
>                       ...
>     0x0242f250:     DW_TAG_formal_parameter
>                       DW_AT_const_value     (4096)
>                       DW_AT_name    ("align")
>                       DW_AT_type    (0x0242decc "phys_addr_t")
>                       ...
> 
> The third argument should correspond to RDX for x86_64. But the location suggests that
> the parameter value is stored in the address with 'RSI + 0'. It is not clear whether
> the parameter value is stored in RDX or not. So we have to discard this function in
> vmlinux BTF to avoid incorrect true signatures.
> 
> For llvm, any function having
>   DW_AT_calling_convention        (DW_CC_nocall)
> in dwarf DW_TAG_subprogram will indicate that this function has signature changed.
> I did experiment with latest bpf-next. For x86_64, there are 69103 kernel functions
> and 875 kernel functions having signature changed. A series of patches are intended
> to ensure true signatures are properly represented. Eventually, only 18 functions
> cannot have true signatures due to locations.
> 
> For arm64, there are 863 kernel functions having signature changed, and
> 70 functions cannot have true signatures due to locations. I checked those
> functions and look like llvm arm64 backend more relaxed to compute parameter
> values.
> 
> For full testing, I enabled true signature support in kernel scripts/Makefile.btf like below:
>   -pahole-flags-$(call test-ge, $(pahole-ver), 130) += --btf_features=attributes
>   +pahole-flags-$(call test-ge, $(pahole-ver), 130) += --btf_features=attributes --btf_features=+true_signature
> 
> For the patch set, Patch 1 introduced usage of DW_AT_calling_convention, which
> can precisely identify which function has signature changed. This can filter
> majority of functions where their signature won't change. Patch 2 did a prescan
> of parameter registers to accommodate some cases where the optimization could
> happen but didn't. Patches 3 to 9 tried to find functions with true signature.
> Patch 10 enables to btf encoder to properly generate BTF.
> Patch 11 includes a few tests.
> 
> Changelog:
>   v4 -> v5:
>     - v4: https://lore.kernel.org/bpf/20260326013144.2901265-1-yonghong.song@linux.dev/
>     - Check info.signature_changed only under clang.
>     - Fix an uninitialized variable issue (var reg_idx) for gcc.
>   v3 -> v4:
>     - v3: https://lore.kernel.org/bpf/20260320190917.1970524-1-yonghong.song@linux.dev/
>     - Add simple prescan of parameter registers in order to get true signatures
>       for those functions where optimization could happen but compiler didn't do it.
>     - Do not create a new name (e.g. "uattr__is_kernel") with malloc at parameter_reg()
>       stage. Instead remember both "uattr" and "is_kernel" and later generate the
>       name "uattr__is_kernel" in btf encoder.
>     - Add comments to explain how to handle parameters which may take two registers.
>     - Fix some test failures on aarch64.
>   v2 -> v3:
>     - v2: https://lore.kernel.org/bpf/20260309153215.1917033-1-yonghong.song@linux.dev/
>     - Change tests by using newly added test_lib.sh.
>     - Simplify to get bool variable producer_clang.
>     - Try to avoid producer_clang appearance in dwarf_loader.c in order to avoid
>       clear separation between clang and gcc.
>   v1 -> v2:
>     - v1: https://lore.kernel.org/bpf/20260305225455.1151066-1-yonghong.song@linux.dev/
>     - Added producer_clang guarding in btf_encoder. Otherwise, gcc kernel build
>       will crash pahole.
>     - Fix an early return in parameter__reg() which didn't do pthread_mutex_unlock()
>       which caused the deadlock for arm64.
>     - Add a few more places to guard with producer_clang and conf->true_signature
>       to maintain the previous behavior if not clang or conf->true_signature is false.
>

To be a bit more concrete about a proposed way forward, I'm thinking of something
along the lines of the attached patch (which should apply on top of this whole series).
Rather than doing prescans, we record parameter info as we go, as we do today, and once
done compute the true signature info. This avoids the complexity of prescanning
parameters and is more consistent with what's there today. Ideally we'd be able to enhance
DWARF processing for both the clang and GCC cases (you have some great improvements in that
area in this series), and unify the representation of modified signatures where feasible.
Let me know what you think.
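
As a much-simplified illustration of that record-then-analyze shape (this is not
the attached patch: the sketch_* names, the loc_ambiguous flag and the return
convention are invented for the example, and registers are plain DWARF register
numbers), parameter loading only fills in what DWARF said, and a single pass
afterwards walks the list against the ABI register order:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_UNKNOWN_REG -1

/* Hypothetical per-parameter record, filled in while loading DWARF. */
struct sketch_param {
	const char *name;
	int loc_reg;		/* first register seen in the location, or SKETCH_UNKNOWN_REG */
	bool loc_const_value;	/* constant-only location or DW_AT_const_value */
	bool loc_ambiguous;	/* has a location that is neither a register nor a constant */
	bool optimized;		/* set by the analysis pass below */
};

/*
 * Post-load pass: returns the number of parameters kept in the true signature,
 * or -1 if the signature cannot be trusted and should be discarded.
 */
static int sketch_analyze_params(struct sketch_param *parms, size_t nr_parms,
				 const int *abi_regs, size_t nr_abi_regs)
{
	size_t next_reg_idx = 0;
	int kept = 0;

	assert(parms != NULL || nr_parms == 0);

	for (size_t i = 0; i < nr_parms; i++) {
		struct sketch_param *p = &parms[i];

		if (p->loc_ambiguous)
			return -1;
		/* Constants and location-less parameters consume no ABI register. */
		if (p->loc_const_value || p->loc_reg == SKETCH_UNKNOWN_REG) {
			p->optimized = true;
			continue;
		}
		if (next_reg_idx >= nr_abi_regs || p->loc_reg != abi_regs[next_reg_idx])
			return -1;
		next_reg_idx++;
		kept++;
	}

	return kept;
}
```

Under these assumptions, the memblock_find_in_range() example from the cover
letter (RDI, RSI, a 'DW_OP_breg4 RSI+0' memory location, then a constant) would
be rejected at 'size', while a simple dead-argument case keeps only the
parameters found in their expected registers.
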

Thanks!

Alan

[-- Attachment #2: 0001-dwarf_loader-unify-true-signature-parameter-analysis.patch --]
[-- Type: text/x-patch, Size: 34215 bytes --]

From 370eb7cd61c4e56944c3fde5353f639e747b037c Mon Sep 17 00:00:00 2001
From: Alan Maguire <alan.maguire@oracle.com>
Date: Mon, 15 Jun 2026 15:44:41 +0100
Subject: [PATCH] dwarf_loader: unify true-signature parameter analysis

Move true-signature parameter decisions out of the clang-only prescan
path and into a post-recode function analysis pass.

Parameter loading now records DWARF location information on each
parameter: observed register, stack/constant locations, aggregate
piece use, and possible member replacement metadata. After DWARF type
recoding and abstract-origin resolution, a single pass walks the final
parameter list, matches it against ABI register order, and marks
optimized, unexpected, or member-shrunk parameters.

This removes the func_info prescan/skip-index representation and makes
clang DW_CC_nocall handling consume the same parameter-order machinery as
the existing GCC abstract/concrete reordered-parameter path. The BTF
encoder now keys off ftype->signature_changed instead of checking for
clang directly.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Assisted-by: Codex GPT-5.5
---
 btf_encoder.c  |   4 +-
 dwarf_loader.c | 721 ++++++++++++++++++++++---------------------------
 dwarves.h      |  10 +
 3 files changed, 332 insertions(+), 403 deletions(-)

diff --git a/btf_encoder.c b/btf_encoder.c
index 26be31d..ab667d0 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -1263,7 +1263,7 @@ static int32_t btf_encoder__save_func(struct btf_encoder *encoder, struct functi
 	if (!state)
 		return -ENOMEM;
 
-	if (encoder->true_signature && encoder->cu->producer_clang) {
+	if (encoder->true_signature && ftype->signature_changed) {
 		ftype__for_each_parameter(ftype, param) {
 			if (param->optimized) skip_idx++;
 		}
@@ -1310,7 +1310,7 @@ static int32_t btf_encoder__save_func(struct btf_encoder *encoder, struct functi
 			state->nr_parms--;
 			continue;
 		}
-		if (encoder->true_signature && encoder->cu->producer_clang && param->optimized)
+		if (encoder->true_signature && ftype->signature_changed && param->optimized)
 			continue;
 
 		name = parameter__name(param);
diff --git a/dwarf_loader.c b/dwarf_loader.c
index 870c167..a791693 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -1200,22 +1200,7 @@ static ptrdiff_t __dwarf_getlocations(Dwarf_Attribute *attr,
 	return ret;
 }
 
-#define	PARM_DEFAULT_FAIL	-1
-#define	PARM_UNEXPECTED		-2
-#define	PARM_OPTIMIZED_OUT	-3
-#define	PARM_CONTINUE		-4
-#define	PARM_TWO_ADDR_LEN	-5
-#define	PARM_TO_BE_IMPROVED	-6
-
-/* Max 20 register parameters, considering some parameters may be optimized out.  */
-#define	MAX_PRESCAN_PARAMS	20
-
-struct func_info {
-	bool signature_changed;
-	int skip_idx;
-	int nr_params;
-	int param_start_regs[MAX_PRESCAN_PARAMS];
-};
+#define PARAMETER_UNKNOWN_REG -1
 
 static int __get_type_byte_size(Dwarf_Die *die, struct cu *cu)
 {
@@ -1268,31 +1253,6 @@ static int get_type_byte_size(Dwarf_Die *die, struct cu *cu)
 	return byte_size;
 }
 
-/* Get the first DW_OP_X (should be a register) from a parameter's DW_AT_location. */
-static int parameter__peek_first_reg(Dwarf_Die *die)
-{
-	Dwarf_Attribute attr;
-	if (dwarf_attr(die, DW_AT_location, &attr) == NULL)
-		return -1;
-
-	Dwarf_Addr base, start, end;
-	Dwarf_Op *expr;
-	size_t exprlen;
-	ptrdiff_t offset = 0;
-
-	pthread_mutex_lock(&libdw__lock);
-	offset = __dwarf_getlocations(&attr, offset, &base, &start, &end, &expr, &exprlen);
-	pthread_mutex_unlock(&libdw__lock);
-
-	if (offset <= 0 || exprlen == 0)
-		return -1;
-
-	if (expr[0].atom >= DW_OP_reg0 && expr[0].atom <= DW_OP_reg31)
-		return expr[0].atom;
-
-	return -1;
-}
-
 /* Traverse the parameter type until finding the member type which has expected
  * struct type offset.
  */
@@ -1319,325 +1279,224 @@ static Dwarf_Die *get_member_with_offset(Dwarf_Die *die, int offset, Dwarf_Die *
 		if (dwarf_tag(member_die) != DW_TAG_member)
 			continue;
 
-		int off = attr_numeric(member_die, DW_AT_data_bit_offset);
-		if (off == offset * 8)
+		Dwarf_Attribute attr;
+		Dwarf_Off bit_offset;
+
+		if (dwarf_attr(member_die, DW_AT_data_bit_offset, &attr) != NULL)
+			bit_offset = __attr_offset(&attr);
+		else if (dwarf_attr(member_die, DW_AT_data_member_location, &attr) != NULL)
+			bit_offset = __attr_offset(&attr) * 8;
+		else
+			continue;
+
+		if (bit_offset == offset * 8)
 			return member_die;
 	} while (dwarf_siblingof(member_die, member_die) == 0);
 
 	return NULL;
 }
 
-/* For two address length case, first_half and second_half represents the parameter.
- * The first_half and second_half accumulates field information across possible multiple
- * location lists.
- */
-static int parameter__multi_exprs(Dwarf_Op *expr, int loc_num, struct cu *cu, size_t exprlen,
-				  Dwarf_Die *die, int expected_reg, int byte_size,
-				  unsigned long *first_half, unsigned long *second_half, int *ret)
+static bool dwarf_op__is_reg(unsigned int atom)
 {
-	switch (expr[0].atom) {
-	case DW_OP_lit0 ... DW_OP_lit31:
-	case DW_OP_constu:
-	case DW_OP_consts:
-		if (loc_num != 0)
-			break;
-		return PARM_OPTIMIZED_OUT;
-	}
+	return atom >= DW_OP_reg0 && atom <= DW_OP_reg31;
+}
 
-	if (byte_size <= cu->addr_size || !cu->agg_use_two_regs) {
-		/* parameter_size <= cu->addr_size */
-		switch (expr[0].atom) {
-		case DW_OP_reg0 ... DW_OP_reg31:
-			if (loc_num != 0)
-				break;
-			*ret = expr[0].atom;
-			if (*ret == expected_reg)
-				return *ret;
-			break;
-		case DW_OP_breg0 ... DW_OP_breg31:
-			if (loc_num != 0)
-				break;
-			bool has_op_stack_value = false;
-			for (int i = 1; i < exprlen; i++) {
-				if (expr[i].atom == DW_OP_stack_value) {
-					has_op_stack_value = true;
-					break;
-				}
-			}
-			if (!has_op_stack_value)
-				break;
-			/* The existence of DW_OP_stack_value means that
-			 * DW_OP_bregX register is used as value.
-			 */
-			*ret = expr[0].atom - DW_OP_breg0 + DW_OP_reg0;
-			if (*ret == expected_reg)
-				return *ret;
-		}
-	} else {
-		/* cu->addr < parameter_size <= cu->addr * 2
-		 * first_half encodes field starts for the first register.
-		 * second_half encodes field starts for the second register.
-		 *
-		 * For example:
-		 *   loclist 1: DW_OP_reg5 RDI, DW_OP_piece 0x8, DW_OP_reg4 RSI, DW_OP_piece 0x1
-		 *   loclist 2: DW_OP_piece 0x8, DW_OP_reg4 RSI, DW_OP_piece 0x1
-		 *   loclist 3: DW_OP_piece 0x8, DW_OP_reg4 RSI, DW_OP_piece 0x1)
-		 *
-		 * After iterating all the above three location lists (see PARM_CONTINUE below),
-		 * first_half encodes as 0x1 and second_half encodes as 0x1. The 'ret' value will
-		 * encode the first used register which is RDI. Each bit in first_half/second_half
-		 * represents a member field.
-		 *
-		 * Another example:
-		 *   loclist 1: DW_OP_reg5 RDI, DW_OP_piece 0x4
-		 *   loclist 2: DW_OP_piece 0x4, DW_OP_reg4 RDI, DW_OP_piece 0x4
-		 *
-		 * After iterating all the above two location lists, first_half encodes 0x11.
-		 * After loclist 1, first_half encoding is 0x1. After loclist 2, first_half encoding is 0x11.
-		 * second_half is 0. The 'ret' value is RDI.
-		 */
-		int off = 0;
-		for (int i = 0; i < exprlen; i++) {
-			if (expr[i].atom == DW_OP_piece) {
-				int num = expr[i].number;
-				if (i == 0) {
-					off = num;
-					continue;
-				}
-				if (off < cu->addr_size) (*first_half) |= (1 << off);
-				else (*second_half) |= (1 << (off - cu->addr_size));
-				off += num;
-			} else if (expr[i].atom >= DW_OP_reg0 && expr[i].atom <= DW_OP_reg31) {
-				if (off < cu->addr_size)
-					*ret = expr[i].atom;
-				else if (*ret < 0)
-					*ret = expr[i].atom;
-			}
-			/* FIXME: not handling DW_OP_bregX yet since we do not have
-			 * a use case for it yet for linux kernel.
-			 */
-		}
+static bool dwarf_expr__has_stack_value(Dwarf_Op *expr, size_t exprlen)
+{
+	for (size_t i = 1; i < exprlen; i++) {
+		if (expr[i].atom == DW_OP_stack_value)
+			return true;
 	}
-
-	return PARM_CONTINUE;
+	return false;
 }
 
-/* The first_half and second_half, computed in parameter__multi_exprs(), are handled here. */
-static int parameter__handle_two_addr_len(int expected_reg, unsigned long first_half, unsigned long second_half,
-					  int ret, Dwarf_Die *die, struct conf_load *conf, struct cu *cu,
-					  struct parameter *parm, int param_idx, int reg_idx, int byte_size,
-					  struct func_info *info)
+static void parameter__set_loc_reg(struct parameter *parm, int reg)
 {
-	if (!first_half && !second_half)
-		return ret;
-
-	if (ret != expected_reg)
-		return ret;
-
-	if (!conf->true_signature)
-		return PARM_DEFAULT_FAIL;
+	if (parm->loc_reg == PARAMETER_UNKNOWN_REG)
+		parm->loc_reg = reg;
+}
 
-	/* Both halves are used based on dwarf */
-	if (first_half && second_half)
-		return PARM_TWO_ADDR_LEN;
+static void parameter__set_field_bit(unsigned long *fields, int byte_offset)
+{
+	if (byte_offset >= 0 && byte_offset < (int)(sizeof(*fields) * 8))
+		*fields |= 1UL << byte_offset;
+}
 
-	/* Only one half is used. Check if the next parameter's starting register
-	 * indicates the ABI still reserves the full register space for this
-	 * parameter. If so, the compiler only eliminated the dead half but the
-	 * register layout is preserved — keep the original source type.
-	 *
-	 * Use register_params[] array for the expected next register since
-	 * DW_OP_reg numbers are not necessarily sequential across architectures.
-	 */
-	if (param_idx + 1 < info->nr_params) {
-		int next_start = info->param_start_regs[param_idx + 1];
+static void parameter__record_true_sig_member(struct parameter *parm, Dwarf_Die *die,
+					      int field_offset, struct conf_load *conf)
+{
+	Dwarf_Die member_die;
 
-		if (next_start >= 0) {
-			int num_regs = (byte_size + cu->addr_size - 1) / cu->addr_size;
-			int next_reg_idx = reg_idx + num_regs;
+	if (parm->true_sig_member_name)
+		return;
+	if (!parm->name)
+		return;
+	if (!get_member_with_offset(die, field_offset, &member_die))
+		return;
 
-			if (next_reg_idx < cu->nr_register_params &&
-			    next_start == cu->register_params[next_reg_idx])
-				return PARM_TWO_ADDR_LEN;
-		}
+	parm->true_sig_member_name = attr_string(&member_die, DW_AT_name, conf);
+	if (!parm->true_sig_member_name) {
+		parm->true_sig_member_name = NULL;
+		return;
 	}
 
-	/* FIXME: parm->name may be NULL due to abstract origin. We do not want to
-	 * update abstract origin as the type in abstract origin may be used
-	 * in some other places. We could remove abstract origin in this parameter
-	 * and add name and type in parameter itself. Right now, for current bpf-next
-	 * repo, we do not have instances below where parm->name is NULL for x86_64 arch.
-	 */
-	if (!parm->name)
-		return PARM_TO_BE_IMPROVED;
-
-	/* FIXME: Only support single field now so we can have a good parameter name and
-	 * type for it. For more than one field, another option could be named as
-	 * <parameter_name>__first_half or <parameter_name>__second_half, but it is not
-	 * that intuitive.
-	 */
-	if (__builtin_popcountll(first_half) >= 2 || __builtin_popcountll(second_half) >= 2)
-		return PARM_TO_BE_IMPROVED;
+	parm->true_sig_type_from_types = attr_type(&member_die, DW_AT_type, &parm->true_sig_type);
+	if (parm->true_sig_type == 0)
+		parm->true_sig_member_name = NULL;
+}
 
+static void parameter__finish_piece_decode(struct parameter *parm, Dwarf_Die *die,
+					   struct conf_load *conf, struct cu *cu)
+{
+	unsigned long first = parm->first_reg_fields;
+	unsigned long second = parm->second_reg_fields;
 	int field_offset;
-	if (__builtin_popcountll(first_half) == 1)
-		field_offset = __builtin_ctzll(first_half);
+
+	if (!first && !second)
+		return;
+	if (first && second)
+		return;
+	if (__builtin_popcountl(first) >= 2 || __builtin_popcountl(second) >= 2)
+		return;
+
+	if (__builtin_popcountl(first) == 1)
+		field_offset = __builtin_ctzl(first);
 	else
-		field_offset = cu->addr_size + __builtin_ctzll(second_half);
+		field_offset = cu->addr_size + __builtin_ctzl(second);
 
-	/* FIXME: Only struct type is supported. */
-	Dwarf_Die member_die;
-	if (!get_member_with_offset(die, field_offset, &member_die))
-		return PARM_TO_BE_IMPROVED;
+	parameter__record_true_sig_member(parm, die, field_offset, conf);
+}
 
-	/* FIXME: cannot get a proper member_name, e.g. if the member type is a union. */
-	const char *member_name = attr_string(&member_die, DW_AT_name, conf);
-	if (!member_name)
-		return PARM_TO_BE_IMPROVED;
+/* For aggregate parameters represented by pieces, first_reg_fields and
+ * second_reg_fields record the byte offsets materialized in each ABI register.
+ * The later function-level pass decides whether the source aggregate is still
+ * ABI-preserved or should be replaced by the single used member candidate.
+ */
+static void parameter__multi_exprs(Dwarf_Op *expr, int loc_num, struct cu *cu,
+				   size_t exprlen, struct parameter *parm)
+{
+	switch (expr[0].atom) {
+	case DW_OP_lit0 ... DW_OP_lit31:
+	case DW_OP_constu:
+	case DW_OP_consts:
+		if (loc_num == 0)
+			parm->loc_const_value = 1;
+		return;
+	}
 
-	/* true_sig_member_name is the member name which will be used for later btf name
-	 * like <parameter_name>__<member_name>.
-	 */
-	parm->true_sig_member_name = member_name;
+	if (parm->type_byte_size <= cu->addr_size || !cu->agg_use_two_regs) {
+		switch (expr[0].atom) {
+		case DW_OP_reg0 ... DW_OP_reg31:
+			if (loc_num == 0)
+				parameter__set_loc_reg(parm, expr[0].atom);
+			return;
+		case DW_OP_breg0 ... DW_OP_breg31:
+			if (loc_num == 0 && dwarf_expr__has_stack_value(expr, exprlen))
+				parameter__set_loc_reg(parm, expr[0].atom - DW_OP_breg0 + DW_OP_reg0);
+			return;
+		default:
+			return;
+		}
+	}
 
-	struct tag *tag = &parm->tag;
-	struct dwarf_tag *dtag = tag__dwarf(tag);
-	dwarf_tag__set_attr_type(dtag, type, &member_die, DW_AT_type);
+	int off = 0;
+	for (size_t i = 0; i < exprlen; i++) {
+		if (expr[i].atom == DW_OP_piece) {
+			int num = expr[i].number;
 
-	return ret;
+			if (i == 0) {
+				off = num;
+				continue;
+			}
+
+			if (off < cu->addr_size)
+				parameter__set_field_bit(&parm->first_reg_fields, off);
+			else
+				parameter__set_field_bit(&parm->second_reg_fields, off - cu->addr_size);
+			off += num;
+		} else if (dwarf_op__is_reg(expr[i].atom)) {
+			if (off < cu->addr_size || parm->loc_reg == PARAMETER_UNKNOWN_REG)
+				parameter__set_loc_reg(parm, expr[i].atom);
+		}
+		/* FIXME: not handling DW_OP_bregX pieces yet since we do not
+		 * have a use case for it yet in the Linux kernel.
+		 */
+	}
 }
 
-/* For DW_AT_location 'attr':
- * - if first location is DW_OP_regXX with expected number, return the register;
- *   otherwise save the register for later return
- * - if location DW_OP_entry_value(DW_OP_regXX) with expected number is in the
- *   list, return the register; otherwise save register for later return
- * - otherwise if no register was found for locations, return PARM_DEFAULT_FAIL.
- */
-static int parameter__reg(Dwarf_Attribute *attr, int expected_reg, struct conf_load *conf,
-			  struct func_info *info, struct cu *cu, Dwarf_Die *die,
-			  struct parameter *parm, int param_idx, int reg_idx)
+static void parameter__decode_location(Dwarf_Attribute *attr, struct conf_load *conf,
+				       struct cu *cu, Dwarf_Die *die,
+				       struct parameter *parm)
 {
 	Dwarf_Addr base, start, end;
 	Dwarf_Op *expr, *entry_ops;
 	Dwarf_Attribute entry_attr;
 	size_t exprlen, entry_len;
 	ptrdiff_t offset = 0;
-	int byte_size = 0;
 	int loc_num = -1;
-	int ret = PARM_DEFAULT_FAIL;
-	unsigned long first_half = 0, second_half = 0;
 
-	/* use libdw__lock as dwarf_getlocation(s) has concurrency issues
-	 * when libdw is not compiled with experimental --enable-thread-safety
-	 */
 	pthread_mutex_lock(&libdw__lock);
 	while ((offset = __dwarf_getlocations(attr, offset, &base, &start, &end, &expr, &exprlen)) > 0) {
+		bool had_stack_value;
+
 		loc_num++;
+		if (exprlen == 0)
+			continue;
 
-		/* Convert expression list (XX DW_OP_stack_value) -> (XX).
-		 * DW_OP_stack_value instructs interpreter to pop current value from
-		 * DWARF expression evaluation stack, and thus is not important here.
-		 */
-		if (exprlen == 2 && expr[exprlen - 1].atom == DW_OP_stack_value)
+		had_stack_value = expr[exprlen - 1].atom == DW_OP_stack_value;
+		if (exprlen == 2 && had_stack_value)
 			exprlen--;
 
 		if (exprlen != 1) {
-			if (!info->signature_changed || !conf->true_signature)
-				continue;
-
-			if (!byte_size)
-				byte_size = get_type_byte_size(die, cu);
-			/* This should not happen. */
-			if (!byte_size) {
-				ret = PARM_UNEXPECTED;
-				goto out;
-			}
-
-			int res;
-			res = parameter__multi_exprs(expr, loc_num, cu, exprlen, die, expected_reg,
-						     byte_size, &first_half, &second_half, &ret);
-			if (res == PARM_CONTINUE)
-				continue;
-			ret = res;
-			goto out;
+			parameter__multi_exprs(expr, loc_num, cu, exprlen, parm);
+			continue;
 		}
 
 		switch (expr->atom) {
-		/* match DW_OP_regXX at first location */
 		case DW_OP_reg0 ... DW_OP_reg31:
-			if (loc_num != 0)
-				break;
-			ret = expr->atom;
-			if (ret == expected_reg)
-				goto out;
+			if (loc_num == 0)
+				parameter__set_loc_reg(parm, expr->atom);
+			break;
+		case DW_OP_breg0 ... DW_OP_breg31:
+			if (loc_num == 0 && had_stack_value)
+				parameter__set_loc_reg(parm, expr->atom - DW_OP_breg0 + DW_OP_reg0);
 			break;
 		case DW_OP_fbreg:
-			/* The location like
-			 *   DW_AT_location        (DW_OP_fbreg +<num>)
-			 * indicates that the parameter is on the stack. But it is possible
-			 * that the parameter can fit in register(s). So conservatively
-			 * mark this parameter not suitable for true signatures.
-			 */
-			if (info->signature_changed && conf->true_signature)
-				ret = PARM_UNEXPECTED;
+			parm->loc_stack = 1;
 			break;
 		case DW_OP_lit0 ... DW_OP_lit31:
 		case DW_OP_constu:
 		case DW_OP_consts:
-			if (info->signature_changed && conf->true_signature) {
-				if (loc_num != 0)
-					break;
-				ret = PARM_OPTIMIZED_OUT;
-				goto out;
-			}
+			if (loc_num == 0)
+				parm->loc_const_value = 1;
 			break;
-		/* match DW_OP_entry_value(DW_OP_regXX) at any location */
 		case DW_OP_entry_value:
 		case DW_OP_GNU_entry_value:
 			if (dwarf_getlocation_attr(attr, expr, &entry_attr) == 0 &&
 			    dwarf_getlocation(&entry_attr, &entry_ops, &entry_len) == 0 &&
-			    entry_len == 1) {
-				ret = entry_ops->atom;
-				if (ret == expected_reg)
-					goto out;
-			}
+			    entry_len == 1 && dwarf_op__is_reg(entry_ops->atom))
+				parameter__set_loc_reg(parm, entry_ops->atom);
 			break;
 		}
 	}
-
-	ret = parameter__handle_two_addr_len(expected_reg, first_half, second_half,
-					     ret, die, conf, cu, parm, param_idx, reg_idx,
-					     byte_size, info);
-
-out:
 	pthread_mutex_unlock(&libdw__lock);
-	return ret;
+
+	parameter__finish_piece_decode(parm, die, conf, cu);
 }
 
 static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
-					struct conf_load *conf, int param_idx,
-					struct func_info *info)
+					struct conf_load *conf, int param_idx)
 {
 	struct parameter *parm = tag__alloc(cu, sizeof(*parm));
 
 	if (parm != NULL) {
-		bool has_const_value, true_sig_enabled;
 		Dwarf_Attribute attr;
-		int reg_idx;
 
 		tag__init(&parm->tag, cu, die);
 		parm->name = attr_string(die, DW_AT_name, conf);
 		parm->idx = param_idx;
-		if (param_idx < 0)
-			return parm;
-		if (!info->signature_changed) {
-			if (cu->producer_clang || param_idx >= cu->nr_register_params)
-				return parm;
-			reg_idx = param_idx;
-		} else {
-			reg_idx = param_idx - info->skip_idx;
-		}
+		parm->loc_reg = PARAMETER_UNKNOWN_REG;
+		parm->type_byte_size = get_type_byte_size(die, cu);
 
 		/* Parameters which use DW_AT_abstract_origin to point at
 		 * the original parameter definition (with no name in the DIE)
@@ -1672,66 +1531,10 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
 		 * between these parameter representations.  See
 		 * ftype__recode_dwarf_types() below for how this is handled.
 		 */
-		has_const_value = dwarf_attr(die, DW_AT_const_value, &attr) != NULL;
+		parm->has_const_value = dwarf_attr(die, DW_AT_const_value, &attr) != NULL;
 		parm->has_loc = dwarf_attr(die, DW_AT_location, &attr) != NULL;
-		true_sig_enabled = conf->true_signature && info->signature_changed;
-
-		if (parm->has_loc) {
-			if (reg_idx >= cu->nr_register_params)
-				return parm;
-
-			int expected_reg = cu->register_params[reg_idx];
-			int actual_reg = parameter__reg(&attr, expected_reg, conf, info, cu, die,
-							parm, param_idx, reg_idx);
-
-			if (actual_reg == PARM_DEFAULT_FAIL) {
-				parm->optimized = 1;
-			} else if (actual_reg == PARM_OPTIMIZED_OUT) {
-				parm->optimized = 1;
-				info->skip_idx++;
-			} else if (actual_reg == PARM_TWO_ADDR_LEN) {
-				/* account for parameter with two registers */
-				info->skip_idx--;
-			} else if (actual_reg == PARM_UNEXPECTED || actual_reg == PARM_TO_BE_IMPROVED ||
-				   (expected_reg >= 0 && expected_reg != actual_reg)) {
-				/* mark parameters that use an unexpected
-				 * register to hold a parameter; these will
-				 * be problematic for users of BTF as they
-				 * violate expectations about register
-				 * contents.
-				 */
-				parm->unexpected_reg = 1;
-			}
-		} else if (has_const_value && !cu->producer_clang) {
-			parm->optimized = 1;
-		} else if (true_sig_enabled) {
-			int byte_size, num_regs, next_reg_idx;
-
-			if (param_idx + 1 < info->nr_params) {
-				int next_start = info->param_start_regs[param_idx + 1];
-				if (next_start >= 0) {
-					/* check whether we should preserve the argument or not */
-					byte_size = get_type_byte_size(die, cu);
-					/* byte_size 0 should not happen. */
-					if (!byte_size) {
-						parm->unexpected_reg = 1;
-						return parm;
-					}
-
-					num_regs = (byte_size + cu->addr_size - 1) / cu->addr_size;
-					next_reg_idx = reg_idx + num_regs;
-					if (next_reg_idx < cu->nr_register_params &&
-					    next_start == cu->register_params[next_reg_idx]) {
-						if (byte_size > cu->addr_size)
-							info->skip_idx--;
-						return parm;
-					}
-				}
-			}
-
-			parm->optimized = 1;
-			info->skip_idx++;
-		}
+		if (parm->has_loc)
+			parameter__decode_location(&attr, conf, cu, die, parm);
 	}
 
 	return parm;
@@ -1751,7 +1554,7 @@ static int formal_parameter_pack__load_params(struct formal_parameter_pack *pack
 			continue;
 		}
 
-		struct parameter *param = parameter__new(die, cu, conf, -1, NULL);
+		struct parameter *param = parameter__new(die, cu, conf, -1);
 
 		if (param == NULL)
 			return -1;
@@ -2237,9 +2040,9 @@ static struct tag *die__create_new_parameter(Dwarf_Die *die,
 					     struct ftype *ftype,
 					     struct lexblock *lexblock,
 					     struct cu *cu, struct conf_load *conf,
-					     int param_idx, struct func_info *info)
+					     int param_idx)
 {
-	struct parameter *parm = parameter__new(die, cu, conf, param_idx, info);
+	struct parameter *parm = parameter__new(die, cu, conf, param_idx);
 
 	if (parm == NULL)
 		return NULL;
@@ -2326,7 +2129,7 @@ static struct tag *die__create_new_subroutine_type(Dwarf_Die *die,
 			tag__print_not_supported(die);
 			continue;
 		case DW_TAG_formal_parameter:
-			tag = die__create_new_parameter(die, ftype, NULL, cu, conf, -1, NULL);
+			tag = die__create_new_parameter(die, ftype, NULL, cu, conf, -1);
 			break;
 		case DW_TAG_unspecified_parameters:
 			ftype->unspec_parms = 1;
@@ -2555,8 +2358,7 @@ out_enomem:
 }
 
 static int die__process_function(Dwarf_Die *die, struct ftype *ftype,
-				 struct lexblock *lexblock, struct cu *cu, struct conf_load *conf,
-				 struct func_info *info);
+				 struct lexblock *lexblock, struct cu *cu, struct conf_load *conf);
 
 static int die__create_new_lexblock(Dwarf_Die *die,
 				    struct cu *cu, struct lexblock *father, struct conf_load *conf)
@@ -2564,7 +2366,7 @@ static int die__create_new_lexblock(Dwarf_Die *die,
 	struct lexblock *lexblock = lexblock__new(die, cu);
 
 	if (lexblock != NULL) {
-		if (die__process_function(die, NULL, lexblock, cu, conf, NULL) != 0)
+		if (die__process_function(die, NULL, lexblock, cu, conf) != 0)
 			goto out_delete;
 	}
 	if (father != NULL)
@@ -2684,8 +2486,7 @@ static struct tag *die__create_new_inline_expansion(Dwarf_Die *die,
 }
 
 static int die__process_function(Dwarf_Die *die, struct ftype *ftype,
-				 struct lexblock *lexblock, struct cu *cu, struct conf_load *conf,
-				 struct func_info *info)
+				 struct lexblock *lexblock, struct cu *cu, struct conf_load *conf)
 {
 	int param_idx = 0;
 	Dwarf_Die child;
@@ -2759,7 +2560,7 @@ static int die__process_function(Dwarf_Die *die, struct ftype *ftype,
 			continue;
 		}
 		case DW_TAG_formal_parameter:
-			tag = die__create_new_parameter(die, ftype, lexblock, cu, conf, param_idx++, info);
+			tag = die__create_new_parameter(die, ftype, lexblock, cu, conf, param_idx++);
 			break;
 		case DW_TAG_variable:
 			tag = die__create_new_variable(die, cu, conf, 0);
@@ -2827,58 +2628,18 @@ out_enomem:
 	return -ENOMEM;
 }
 
-/* Pre-scan all formal parameters to collect their starting registers.
- * This allows look-ahead when processing parameters sequentially, so that
- * a parameter can check the next parameter's register to determine if the
- * ABI register layout is preserved despite partial optimization.
- * For example, for a function like below:
- *  struct t { long f1; long f2; };
- *  __attribute__((noinline)) static long foo(struct t a, struct t b)
- *  {
- *      return a.f1 + b.f1 + b.f2;
- *  }
- * If dwarf has parameter 'a' at aarch64 register W0, and 'b' at register W2,
- * even compiler could optimize 'a' to 'a.f1'. To conform to ABI, the
- * parameter 'a' will keep 'struct t' type.
- */
-static void func_info__prescan_params(struct func_info *info, Dwarf_Die *die)
-{
-	Dwarf_Die child;
-	int idx = 0;
-
-	if (!info->signature_changed)
-		return;
-
-	if (!dwarf_haschildren(die) || dwarf_child(die, &child) != 0)
-		return;
-
-	do {
-		if (dwarf_tag(&child) != DW_TAG_formal_parameter)
-			continue;
-		if (idx >= MAX_PRESCAN_PARAMS)
-			break;
-		info->param_start_regs[idx] = parameter__peek_first_reg(&child);
-		idx++;
-	} while (dwarf_siblingof(&child, &child) == 0);
-
-	info->nr_params = idx;
-}
-
 static struct tag *die__create_new_function(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
 {
 	struct function *function = function__new(die, cu, conf);
-	struct func_info info = {};
 
 	if (function != NULL) {
 		/* For clang, we determine if function signature changes via DW_AT_calling_convention
 		 * set to DW_CC_nocall.
 		 */
-		if (cu->producer_clang) {
-			info.signature_changed = function__signature_changed(function, die);
-			func_info__prescan_params(&info, die);
-		}
+		if (cu->producer_clang)
+			function->proto.signature_changed = function__signature_changed(function, die);
 
-		if (die__process_function(die, &function->proto, &function->lexblock, cu, conf, &info) != 0) {
+		if (die__process_function(die, &function->proto, &function->lexblock, cu, conf) != 0) {
 			function__delete(function, cu);
 			function = NULL;
 		}
@@ -3133,6 +2894,23 @@ static void ftype__recode_dwarf_types(struct tag *tag, struct cu *cu)
 			 */
 			if (pos->has_loc)
 				opos->has_loc = pos->has_loc;
+			if (pos->has_const_value)
+				opos->has_const_value = pos->has_const_value;
+			if (pos->loc_const_value)
+				opos->loc_const_value = pos->loc_const_value;
+			if (pos->loc_stack)
+				opos->loc_stack = pos->loc_stack;
+			if (pos->loc_reg != PARAMETER_UNKNOWN_REG)
+				opos->loc_reg = pos->loc_reg;
+			if (pos->type_byte_size != 0)
+				opos->type_byte_size = pos->type_byte_size;
+			opos->first_reg_fields |= pos->first_reg_fields;
+			opos->second_reg_fields |= pos->second_reg_fields;
+			if (pos->true_sig_member_name && !opos->true_sig_member_name) {
+				opos->true_sig_member_name = pos->true_sig_member_name;
+				opos->true_sig_type = pos->true_sig_type;
+				opos->true_sig_type_from_types = pos->true_sig_type_from_types;
+			}
 
 			if (pos->optimized)
 				opos->optimized = pos->optimized;
@@ -3150,6 +2928,145 @@ static void ftype__recode_dwarf_types(struct tag *tag, struct cu *cu)
 	}
 }
 
+static struct parameter *ftype__next_parameter(struct ftype *ftype, struct parameter *parm)
+{
+	if (parm->tag.node.next == &ftype->parms)
+		return NULL;
+	return list_entry(parm->tag.node.next, struct parameter, tag.node);
+}
+
+static int parameter__abi_slots(const struct parameter *parm, const struct cu *cu)
+{
+	int slots;
+
+	if (!cu->agg_use_two_regs || parm->type_byte_size <= cu->addr_size)
+		return 1;
+
+	slots = (parm->type_byte_size + cu->addr_size - 1) / cu->addr_size;
+	return slots > 0 ? slots : 1;
+}
+
+static bool parameter__has_piece_info(const struct parameter *parm)
+{
+	return parm->first_reg_fields || parm->second_reg_fields;
+}
+
+static bool parameter__uses_full_aggregate(const struct parameter *parm)
+{
+	return parm->first_reg_fields && parm->second_reg_fields;
+}
+
+static bool ftype__next_parameter_preserves_slots(struct ftype *ftype, struct parameter *parm,
+						  int reg_idx, int slots, struct cu *cu)
+{
+	struct parameter *next = ftype__next_parameter(ftype, parm);
+	int next_reg_idx;
+
+	if (!next || next->loc_reg == PARAMETER_UNKNOWN_REG)
+		return false;
+
+	next_reg_idx = reg_idx + slots;
+	return next_reg_idx < cu->nr_register_params &&
+	       next->loc_reg == cu->register_params[next_reg_idx];
+}
+
+static bool parameter__apply_true_sig_member(struct parameter *parm, struct cu *cu)
+{
+	struct dwarf_tag tmp = {};
+	struct dwarf_tag *dtype;
+
+	if (!parm->true_sig_member_name || parm->true_sig_type == 0)
+		return false;
+
+	tmp.type = parm->true_sig_type;
+	tmp.from_types_section.type = parm->true_sig_type_from_types;
+	dtype = __dwarf_cu__find_type_by_ref(cu->priv, tmp.type, tmp.from_types_section.type);
+	if (!dtype)
+		return false;
+
+	parm->tag.type = dtype->small_id;
+	return true;
+}
+
+static void function__analyze_parameter_locations(struct function *fn, struct cu *cu,
+						  struct conf_load *conf)
+{
+	struct ftype *ftype = &fn->proto;
+	struct parameter *pos;
+	bool true_sig_enabled = conf->true_signature && ftype->signature_changed;
+	bool check_registers = !cu->producer_clang || true_sig_enabled;
+	int reg_idx = 0;
+
+	if (!check_registers)
+		return;
+
+	ftype__for_each_parameter(ftype, pos) {
+		bool consumes_register = true;
+		int slots = parameter__abi_slots(pos, cu);
+		int expected_reg;
+
+		if (reg_idx >= cu->nr_register_params)
+			continue;
+
+		expected_reg = cu->register_params[reg_idx];
+
+		if (pos->has_loc) {
+			if (true_sig_enabled && pos->loc_const_value) {
+				pos->optimized = 1;
+				consumes_register = false;
+				goto next;
+			}
+
+			if (true_sig_enabled && pos->loc_stack) {
+				pos->unexpected_reg = 1;
+				goto next;
+			}
+
+			if (pos->loc_reg == PARAMETER_UNKNOWN_REG) {
+				pos->optimized = 1;
+				consumes_register = !true_sig_enabled;
+				goto next;
+			}
+
+			if (expected_reg >= 0 && expected_reg != pos->loc_reg) {
+				pos->unexpected_reg = 1;
+				goto next;
+			}
+
+			if (true_sig_enabled && parameter__has_piece_info(pos)) {
+				if (parameter__uses_full_aggregate(pos) ||
+				    ftype__next_parameter_preserves_slots(ftype, pos, reg_idx, slots, cu)) {
+					reg_idx += slots;
+					continue;
+				}
+
+				if (parameter__apply_true_sig_member(pos, cu)) {
+					reg_idx++;
+					continue;
+				}
+
+				pos->unexpected_reg = 1;
+				reg_idx += slots;
+				continue;
+			}
+		} else if (pos->has_const_value && !cu->producer_clang) {
+			pos->optimized = 1;
+		} else if (true_sig_enabled) {
+			if (ftype__next_parameter_preserves_slots(ftype, pos, reg_idx, slots, cu)) {
+				reg_idx += slots;
+				continue;
+			}
+
+			pos->optimized = 1;
+			consumes_register = false;
+		}
+
+next:
+		if (consumes_register)
+			reg_idx++;
+	}
+}
+
 static void lexblock__recode_dwarf_types(struct lexblock *tag, struct cu *cu)
 {
 	struct tag *pos;
@@ -3425,7 +3342,7 @@ static bool param__is_struct(struct cu *cu, struct tag *tag)
 	}
 }
 
-static int cu__resolve_func_ret_types_optimized(struct cu *cu)
+static int cu__resolve_func_ret_types_optimized(struct cu *cu, struct conf_load *conf)
 {
 	struct ptr_table *pt = &cu->functions_table;
 	uint32_t i;
@@ -3436,6 +3353,8 @@ static int cu__resolve_func_ret_types_optimized(struct cu *cu)
 		struct function *fn = tag__function(tag);
 		bool has_unexpected_reg = false, has_struct_param = false;
 
+		function__analyze_parameter_locations(fn, cu, conf);
+
 		/* mark function as optimized if parameter is, or
 		 * if parameter does not have a location; at this
 		 * point location presence has been marked in
@@ -3614,7 +3533,7 @@ static int die__process_and_recode(Dwarf_Die *die, struct cu *cu, struct conf_lo
 	if (ret != 0)
 		return ret;
 
-	return cu__resolve_func_ret_types_optimized(cu);
+	return cu__resolve_func_ret_types_optimized(cu, conf);
 }
 
 static int class_member__cache_byte_size(struct tag *tag, struct cu *cu,
@@ -4377,7 +4296,7 @@ static int cus__merge_and_process_cu(struct cus *cus, struct conf_load *conf,
 	 * encoded in another subprogram through abstract_origin
 	 * tag. Let us visit all subprograms again to resolve this.
 	 */
-	if (cu__resolve_func_ret_types_optimized(cu) != LSK__KEEPIT)
+	if (cu__resolve_func_ret_types_optimized(cu, conf) != LSK__KEEPIT)
 		goto out_abort;
 
 	cu__finalize(cu, cus, conf);
diff --git a/dwarves.h b/dwarves.h
index 2fc937a..ed3f005 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -949,6 +949,15 @@ struct parameter {
 	struct tag tag;
 	const char *name;
 	const char *true_sig_member_name;
+	Dwarf_Off true_sig_type;
+	unsigned long first_reg_fields;
+	unsigned long second_reg_fields;
+	int loc_reg;
+	uint16_t type_byte_size;
+	uint8_t true_sig_type_from_types:1;
+	uint8_t has_const_value:1;
+	uint8_t loc_const_value:1;
+	uint8_t loc_stack:1;
 	uint8_t optimized:1;
 	uint8_t unexpected_reg:1;
 	uint8_t has_loc:1;
@@ -1033,6 +1042,7 @@ struct ftype {
 	uint8_t		 inconsistent_proto:1;
 	uint8_t		 uncertain_parm_loc:1;
 	uint8_t		 reordered_parm:1;
+	uint8_t		 signature_changed:1;
 	struct list_head template_type_params;
 	struct list_head template_value_params;
 	struct template_parameter_pack *template_parameter_pack;
-- 
2.43.5
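
For readers following the piece handling in the patch above, the single-member selection done by parameter__finish_piece_decode() can be modelled in isolation. This is only a sketch under stated assumptions: set_field_bit() and single_field_offset() are hypothetical names, and set_field_bit() is assumed to behave like parameter__set_field_bit(), whose body is not shown here (one bit per byte offset within a register). It needs the GCC/clang builtins the patch already uses.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for parameter__set_field_bit(): mark the byte offset
 * 'off' (relative to the start of one ABI register) as materialized.
 */
static void set_field_bit(unsigned long *fields, int off)
{
	assert(off >= 0 && off < (int)(sizeof(*fields) * CHAR_BIT));
	*fields |= 1UL << off;
}

/* Return the byte offset of the only member that is materialized, or -1 when
 * no single member can be picked: nothing recorded, both registers in use,
 * or more than one offset recorded within a register.
 */
static int single_field_offset(unsigned long first, unsigned long second,
			       int addr_size)
{
	if (!first && !second)
		return -1;
	if (first && second)
		return -1;
	if (__builtin_popcountl(first) >= 2 || __builtin_popcountl(second) >= 2)
		return -1;

	if (first)
		return __builtin_ctzl(first);
	/* Offsets in the second register are relative to addr_size. */
	return addr_size + __builtin_ctzl(second);
}
```

With an 8-byte address size, a lone bit 0 in the second mask maps to byte offset 8, which is where 'is_kernel' would sit in the sockptr_t example from the cover letter, assuming the usual 64-bit layout.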


* Re: [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF
  2026-06-15 17:17 ` [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Alan Maguire
@ 2026-06-16  4:06   ` Yonghong Song
  0 siblings, 0 replies; 16+ messages in thread
From: Yonghong Song @ 2026-06-16  4:06 UTC (permalink / raw)
  To: Alan Maguire, Arnaldo Carvalho de Melo, dwarves
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, kernel-team



On 6/15/26 10:17 AM, Alan Maguire wrote:
> On 23/05/2026 17:57, Yonghong Song wrote:
>> The current vmlinux BTF encoding is based on source-level signatures,
>> but compiler optimizations may change a function's signature.
>> If users rely on the source-level signature, their initial implementation
>> may produce wrong results; they then need to track down the problem
>> and work around it, e.g. with a kprobe, since kprobes do not
>> need vmlinux BTF.
>>
>> The majority of changed signatures are due to dead argument elimination.
>> The following is a more complex one. The original source signature:
>>    typedef struct {
>>          union {
>>                  void            *kernel;
>>                  void __user     *user;
>>          };
>>          bool            is_kernel : 1;
>>    } sockptr_t;
>>    typedef sockptr_t bpfptr_t;
>>    static int map_create(union bpf_attr *attr, bpfptr_t uattr) { ... }
>> After compiler optimization, the signature becomes:
>>    static int map_create(union bpf_attr *attr, bool uattr__is_kernel) { ... }
>> In the above, uattr__is_kernel corresponds to 'is_kernel' field in sockptr_t.
>> This makes it easier for developers to understand what changed.
>>
>> The new signature needs to properly follow ABI specification based on
>> locations. Otherwise, that signature should be discarded. For example,
>>
>>      0x0242f1f7:   DW_TAG_subprogram
>>                      DW_AT_name      ("memblock_find_in_range")
>>                      DW_AT_calling_convention        (DW_CC_nocall)
>>                      DW_AT_type      (0x0242decc "phys_addr_t")
>>                      ...
>>      0x0242f22e:     DW_TAG_formal_parameter
>>                        DW_AT_location        (indexed (0x14a) loclist = 0x005595bc:
>>                           [0xffffffff87a000f9, 0xffffffff87a00178): DW_OP_reg5 RDI
>>                           [0xffffffff87a00178, 0xffffffff87a001be): DW_OP_reg14 R14
>>                           [0xffffffff87a001be, 0xffffffff87a001c7): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
>>                           [0xffffffff87a001c7, 0xffffffff87a00214): DW_OP_reg14 R14)
>>                        DW_AT_name    ("start")
>>                        DW_AT_type    (0x0242decc "phys_addr_t")
>>                        ...
>>      0x0242f239:     DW_TAG_formal_parameter
>>                        DW_AT_location        (indexed (0x14b) loclist = 0x005595e6:
>>                           [0xffffffff87a000f9, 0xffffffff87a00175): DW_OP_reg4 RSI
>>                           [0xffffffff87a00175, 0xffffffff87a001b8): DW_OP_reg3 RBX
>>                           [0xffffffff87a001b8, 0xffffffff87a001c7): DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value
>>                           [0xffffffff87a001c7, 0xffffffff87a00214): DW_OP_reg3 RBX)
>>                        DW_AT_name    ("end")
>>                        DW_AT_type    (0x0242decc "phys_addr_t")
>>                        ...
>>      0x0242f245:     DW_TAG_formal_parameter
>>                        DW_AT_location        (indexed (0x14c) loclist = 0x00559610:
>>                           [0xffffffff87a001e3, 0xffffffff87a001ef): DW_OP_breg4 RSI+0)
>>                        DW_AT_name    ("size")
>>                        DW_AT_type    (0x0242decc "phys_addr_t")
>>                        ...
>>      0x0242f250:     DW_TAG_formal_parameter
>>                        DW_AT_const_value     (4096)
>>                        DW_AT_name    ("align")
>>                        DW_AT_type    (0x0242decc "phys_addr_t")
>>                        ...
>>
>> The third argument should correspond to RDX on x86_64. But the location suggests that
>> the parameter value is stored at the address 'RSI + 0'. It is not clear whether
>> the parameter value is stored in RDX or not. So we have to discard this function from
>> vmlinux BTF to avoid incorrect true signatures.
>>
>> For llvm, any function having
>>    DW_AT_calling_convention        (DW_CC_nocall)
>> in its dwarf DW_TAG_subprogram has a changed signature.
>> I experimented with the latest bpf-next. For x86_64, there are 69103 kernel functions,
>> of which 875 have a changed signature. This series of patches is intended
>> to ensure true signatures are properly represented. In the end, only 18 functions
>> cannot have true signatures due to their locations.
>>
>> For arm64, there are 863 kernel functions with a changed signature, and
>> 70 functions cannot have true signatures due to their locations. I checked those
>> functions, and it looks like the llvm arm64 backend is more relaxed in how it computes
>> parameter values.
>>
>> For full testing, I enabled true signature support in kernel scripts/Makefile.btf like below:
>>    -pahole-flags-$(call test-ge, $(pahole-ver), 130) += --btf_features=attributes
>>    +pahole-flags-$(call test-ge, $(pahole-ver), 130) += --btf_features=attributes --btf_features=+true_signature
>>
>> In this patch set, Patch 1 introduces the use of DW_AT_calling_convention, which
>> can precisely identify which functions have a changed signature. This filters out
>> the majority of functions, whose signatures do not change. Patch 2 does a prescan
>> of parameter registers to accommodate cases where the optimization could
>> happen but didn't. Patches 3 to 9 try to find functions with true signatures.
>> Patch 10 enables the btf encoder to properly generate BTF.
>> Patch 11 includes a few tests.
>>
>> Changelog:
>>    v4 -> v5:
>>      - v4: https://lore.kernel.org/bpf/20260326013144.2901265-1-yonghong.song@linux.dev/
>>      - Check info.signature_changed only under clang.
>>      - Fix an uninitialized variable issue (var reg_idx) for gcc.
>>    v3 -> v4:
>>      - v3: https://lore.kernel.org/bpf/20260320190917.1970524-1-yonghong.song@linux.dev/
>>      - Add simple prescan of parameter registers in order to get true signatures
>>        for those functions where optimization could happen but compiler didn't do it.
>>      - Do not create a new name (e.g. "uattr__is_kernel") with malloc at the parameter__reg()
>>        stage. Instead remember both "uattr" and "is_kernel" and later generate the
>>        name "uattr__is_kernel" in btf encoder.
>>      - Add comments to explain how to handle parameters which may take two registers.
>>      - Fix some test failures on aarch64.
>>    v2 -> v3:
>>      - v2: https://lore.kernel.org/bpf/20260309153215.1917033-1-yonghong.song@linux.dev/
>>      - Change tests by using newly added test_lib.sh.
>>      - Simplify to get bool variable producer_clang.
>>      - Try to avoid producer_clang appearance in dwarf_loader.c in order to avoid
>>        clear separation between clang and gcc.
>>    v1 -> v2:
>>      - v1: https://lore.kernel.org/bpf/20260305225455.1151066-1-yonghong.song@linux.dev/
>>      - Added producer_clang guarding in btf_encoder. Otherwise, gcc kernel build
>>        will crash pahole.
>>      - Fix an early return in parameter__reg() which didn't do pthread_mutex_unlock()
>>        which caused the deadlock for arm64.
>>      - Add a few more places to guard with producer_clang and conf->true_signature
>>        to maintain the previous behavior if not clang or conf->true_signature is false.
>>
> In order to be a bit more concrete about a proposed way forward, I'm thinking something
> along the lines of the attached patch (which should apply on top of this whole series);
> rather than doing prescans etc, we record param info as we go as we do today, and once done
> compute true signature info. This saves some complexity around prescan of params etc, so
> is a bit more consistent with what's there today. Ideally we'd be able to enhance DWARF
> processing for both cases (you have some great improvements in that area in this series),
> and unify the representation of modified signatures where feasible. Let me know what you think.

Thanks Alan! I agree that we should have as much common code as possible for clang and gcc.
I will check and try to understand your new patch. If everything is fine, I will incorporate
this patch into the patch series.

Yonghong

>
> Thanks!
>
> Alan
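
The "record param info as we go, then compute true signature info once done" flow discussed in the quoted message can be illustrated with a much simplified model of the second pass. This is a sketch only, not the patch's code: struct parm_model and analyze() are hypothetical names, it covers just the constant-location, unknown-register and mismatched-register cases with true signatures enabled, and it hard-codes the x86_64 integer argument registers by their DWARF numbers (RDI=5, RSI=4, RDX=1, RCX=2, R8=8, R9=9).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define UNKNOWN_REG (-1)
#define NR_PARAM_REGS 6

/* DWARF register numbers of the x86_64 integer argument registers, in ABI
 * order: RDI, RSI, RDX, RCX, R8, R9.
 */
static const int x86_64_param_regs[NR_PARAM_REGS] = { 5, 4, 1, 2, 8, 9 };

/* Hypothetical, reduced version of the per-parameter state. */
struct parm_model {
	int loc_reg;		/* register recorded while loading, or UNKNOWN_REG */
	bool loc_const;		/* first location is a constant */
	bool optimized;		/* result: dropped from the true signature */
	bool unexpected_reg;	/* result: register does not match the ABI */
};

/* Second pass: walk the recorded parameters in order and compare each one
 * with the next expected ABI register. Returns how many parameters matched.
 */
static int analyze(struct parm_model *p, size_t n)
{
	size_t reg_idx = 0;
	int kept = 0;

	assert(p != NULL || n == 0);
	for (size_t i = 0; i < n && reg_idx < NR_PARAM_REGS; i++) {
		if (p[i].loc_const || p[i].loc_reg == UNKNOWN_REG) {
			/* Optimized away: it does not consume a register. */
			p[i].optimized = true;
			continue;
		}
		if (p[i].loc_reg != x86_64_param_regs[reg_idx])
			p[i].unexpected_reg = true;
		else
			kept++;
		reg_idx++;
	}
	return kept;
}
```

Because a dropped parameter does not advance the register index, a parameter following a constant one is still checked against the correct ABI register, and no look-ahead over the DIEs is needed.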


Thread overview: 16+ messages
2026-05-23 16:57 [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Yonghong Song
2026-05-23 16:57 ` [PATCH dwarves v5 01/11] dwarf_loader: Reduce parameter checking with clang DW_AT_calling_convention attr Yonghong Song
2026-06-11  9:15   ` Alan Maguire
2026-05-23 16:57 ` [PATCH dwarves v5 02/11] dwarf_loader: Prescan all parameters with expected registers Yonghong Song
2026-05-23 16:57 ` [PATCH dwarves v5 03/11] dwarf_loader: Handle signatures with dead arguments Yonghong Song
2026-05-23 16:57 ` [PATCH dwarves v5 04/11] dwarf_loader: Refactor initial ret -1 to be macro PARM_DEFAULT_FAIL Yonghong Song
2026-05-23 16:57 ` [PATCH dwarves v5 05/11] dwarf_laoder: Handle locations with DW_OP_fbreg Yonghong Song
2026-05-23 16:57 ` [PATCH dwarves v5 06/11] dwarf_loader: Change exprlen checking condition in parameter__reg() Yonghong Song
2026-05-23 16:57 ` [PATCH dwarves v5 07/11] dwarf_loader: Detect optimized parameters with locations having constant values Yonghong Song
2026-05-23 16:57 ` [PATCH dwarves v5 08/11] dwarf_loader: Check whether two-reg parameter actually use two regs or not Yonghong Song
2026-05-23 16:58 ` [PATCH dwarves v5 09/11] dwarf_loader: Handle expression lists Yonghong Song
2026-05-23 16:58 ` [PATCH dwarves v5 10/11] btf_encoder: Handle optimized parameter properly Yonghong Song
2026-06-11  9:08   ` Alan Maguire
2026-05-23 16:58 ` [PATCH dwarves v5 11/11] tests: Add a few clang true signature tests Yonghong Song
2026-06-15 17:17 ` [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Alan Maguire
2026-06-16  4:06   ` Yonghong Song