From: Yonghong Song <yonghong.song@linux.dev>
To: Alan Maguire <alan.maguire@oracle.com>,
Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>,
dwarves@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH dwarves 7/9] dwarf_loader: Handle expression lists
Date: Thu, 5 Mar 2026 14:55:31 -0800
Message-ID: <20260305225531.1155994-1-yonghong.song@linux.dev>
In-Reply-To: <20260305225455.1151066-1-yonghong.song@linux.dev>
The corresponding type is checked for the parameter.
If the parameter size is less than or equal to the size of long,
the argument should match the corresponding ABI register.
For example:
0x0aba0808: DW_TAG_subprogram
DW_AT_name ("addrconf_ifdown")
DW_AT_calling_convention (DW_CC_nocall)
DW_AT_type (0x0ab7d8e9 "int")
...
0x0aba082b: DW_TAG_formal_parameter
DW_AT_location (indexed (0x32b) loclist = 0x016eabcd:
[0xffffffff83f6fef9, 0xffffffff83f6ff98): DW_OP_reg5 RDI
[0xffffffff83f6ff98, 0xffffffff83f70080): DW_OP_reg12 R12
[0xffffffff83f70080, 0xffffffff83f70111): DW_OP_breg7 RSP+112
[0xffffffff83f70111, 0xffffffff83f7014f): DW_OP_reg12 R12
[0xffffffff83f7014f, 0xffffffff83f7123c): DW_OP_breg7 RSP+112
[0xffffffff83f7123c, 0xffffffff83f7128c): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
[0xffffffff83f7128c, 0xffffffff83f712a9): DW_OP_reg12 R12
[0xffffffff83f712a9, 0xffffffff83f712cd): DW_OP_breg7 RSP+112
[0xffffffff83f712cd, 0xffffffff83f712d2): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
[0xffffffff83f712d2, 0xffffffff83f713dd): DW_OP_breg7 RSP+112)
DW_AT_name ("dev")
DW_AT_type (0x0ab7cb7d "net_device *")
...
0x0aba0836: DW_TAG_formal_parameter
DW_AT_location (indexed (0x32c) loclist = 0x016eac39:
[0xffffffff83f6fef9, 0xffffffff83f6ff15): DW_OP_breg4 RSI+0, DW_OP_constu 0xffffffff, DW_OP_and, DW_OP_convert (0x0ab7b571) "DW_ATE_unsigned_1", DW_OP_convert (0x0ab7b576) "DW_ATE_unsigned_8", DW_OP_stack_value
[0xffffffff83f6ff15, 0xffffffff83f7127c): DW_OP_breg7 RSP+36, DW_OP_deref_size 0x4, DW_OP_convert (0x0ab7b571) "DW_ATE_unsigned_1", DW_OP_convert (0x0ab7b576) "DW_ATE_unsigned_8", DW_OP_stack_value
[0xffffffff83f7128c, 0xffffffff83f713dd): DW_OP_breg7 RSP+36, DW_OP_deref_size 0x4, DW_OP_convert (0x0ab7b571) "DW_ATE_unsigned_1", DW_OP_convert (0x0ab7b576) "DW_ATE_unsigned_8", DW_OP_stack_value)
DW_AT_name ("unregister")
DW_AT_type (0x0ab7c933 "bool")
...
The parameter 'unregister' is the second argument, which matches ABI register RSI.
So the signature of function "addrconf_ifdown" is valid.
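The single-register check described above can be sketched roughly as follows. This is an illustration only, not the code in this patch: the names sketch_op, sketch_first_expr_reg() and sketch_param_matches_abi() are hypothetical, the opcode values are taken from the DWARF standard, and the register table assumes the x86_64 SysV argument order (RDI, RSI, RDX, RCX, R8, R9) expressed as DWARF register numbers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Opcode values as defined by the DWARF standard. */
enum {
	OP_REG0 = 0x50,        /* DW_OP_reg0 */
	OP_BREG0 = 0x70,       /* DW_OP_breg0 */
	OP_STACK_VALUE = 0x9f, /* DW_OP_stack_value */
};

/* Hypothetical stand-in for libdw's Dwarf_Op; only the opcode is kept. */
struct sketch_op {
	unsigned char atom;
};

/* Assumption: x86_64 DWARF register numbers for RDI, RSI, RDX, RCX, R8, R9. */
static const int sketch_x86_64_arg_regs[] = { 5, 4, 1, 2, 8, 9 };

/* Return the register named by the first location expression, or -1.
 * A DW_OP_bregX only counts when DW_OP_stack_value follows, because then
 * the register content is used as the value rather than as an address.
 */
static int sketch_first_expr_reg(const struct sketch_op *expr, size_t exprlen)
{
	if (exprlen == 0)
		return -1;
	if (expr[0].atom >= OP_REG0 && expr[0].atom <= OP_REG0 + 31)
		return expr[0].atom - OP_REG0;
	if (expr[0].atom >= OP_BREG0 && expr[0].atom <= OP_BREG0 + 31) {
		for (size_t i = 1; i < exprlen; i++) {
			if (expr[i].atom == OP_STACK_VALUE)
				return expr[0].atom - OP_BREG0;
		}
	}
	return -1;
}

/* True if the parameter at position param_idx starts out in its ABI register. */
static bool sketch_param_matches_abi(const struct sketch_op *expr, size_t exprlen,
				     size_t param_idx)
{
	size_t nr = sizeof(sketch_x86_64_arg_regs) / sizeof(sketch_x86_64_arg_regs[0]);

	if (param_idx >= nr)
		return false;
	return sketch_first_expr_reg(expr, exprlen) == sketch_x86_64_arg_regs[param_idx];
}
```

With the first location of 'unregister' above (DW_OP_breg4 RSI+0 ... DW_OP_stack_value) and param_idx 1, the sketch reports a match with RSI.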
If the parameter size is '2 x size_of_long', more handling is necessary, e.g.:
0x0a01e174: DW_TAG_subprogram
DW_AT_name ("check_zeroed_sockptr")
DW_AT_calling_convention (DW_CC_nocall)
DW_AT_type (0x09fead35 "int")
...
0x0a01e187: DW_TAG_formal_parameter
DW_AT_location (indexed (0x5b6) loclist = 0x0157f03f:
[0xffffffff83c941c0, 0xffffffff83c941c4): DW_OP_reg5 RDI, DW_OP_piece 0x8, DW_OP_reg4 RSI, DW_OP_piece 0x1
[0xffffffff83c941c4, 0xffffffff83c941cc): DW_OP_piece 0x8, DW_OP_reg4 RSI, DW_OP_piece 0x1
[0xffffffff83c941e1, 0xffffffff83c941e4): DW_OP_piece 0x8, DW_OP_reg4 RSI, DW_OP_piece 0x1)
DW_AT_name ("src")
DW_AT_type (0x09ff832d "sockptr_t")
...
0x0a01e193: DW_TAG_formal_parameter
DW_AT_const_value (64)
DW_AT_name ("offset")
DW_AT_type (0x09fee984 "size_t")
...
0x0a01e19e: DW_TAG_formal_parameter
DW_AT_location (indexed (0x5b7) loclist = 0x0157f06b:
[0xffffffff83c941c0, 0xffffffff83c941d1): DW_OP_reg1 RDX
[0xffffffff83c941d1, 0xffffffff83c941e1): DW_OP_entry_value(DW_OP_reg1 RDX), DW_OP_stack_value
[0xffffffff83c941e1, 0xffffffff83c941e9): DW_OP_reg1 RDX)
DW_AT_name ("size")
DW_AT_type (0x09fee984 "size_t")
...
The first parameter 'src' takes two ABI registers. This patch detects such a pattern
to construct the true signature.
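A simplified sketch of how DW_OP_piece operations can be turned into a per-half usage bitmap is shown below. It is not the exact logic of this patch: piece_op, sketch_halves and sketch_collect_pieces() are hypothetical names, and the sketch assumes that a DW_OP_piece with no preceding location operation describes bytes that are not available.

```c
#include <stddef.h>

enum {
	OP_PIECE = 0x93, /* DW_OP_piece, value from the DWARF standard */
};

/* Hypothetical stand-in for libdw's Dwarf_Op. */
struct piece_op {
	unsigned char atom;
	unsigned long number; /* for DW_OP_piece: piece size in bytes */
};

/* Bit N of 'lower' (or 'upper') is set when a piece with a known location
 * starts at byte N of the first (or second) address-sized word.
 */
struct sketch_halves {
	unsigned long lower;
	unsigned long upper;
};

static void sketch_collect_pieces(const struct piece_op *expr, size_t exprlen,
				  unsigned int addr_size, struct sketch_halves *h)
{
	unsigned long off = 0; /* byte offset of the current piece */
	int have_loc = 0;      /* did a location op precede this piece? */

	for (size_t i = 0; i < exprlen; i++) {
		if (expr[i].atom != OP_PIECE) {
			have_loc = 1;
			continue;
		}
		if (have_loc) {
			if (off < addr_size)
				h->lower |= 1UL << off;
			else if (off < 2UL * addr_size)
				h->upper |= 1UL << (off - addr_size);
		}
		off += expr[i].number;
		have_loc = 0;
	}
}
```

For the first location of 'src' (DW_OP_reg5 RDI, DW_OP_piece 0x8, DW_OP_reg4 RSI, DW_OP_piece 0x1) both halves end up non-zero, which corresponds to the two-register case.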
However, it is possible that only one 'size_of_long' half of the '2 x size_of_long' is used. For example:
0x019520c6: DW_TAG_subprogram
DW_AT_name ("map_create")
DW_AT_calling_convention (DW_CC_nocall)
DW_AT_type (0x01934b29 "int")
...
0x01952111: DW_TAG_formal_parameter
DW_AT_location (indexed (0x31b) loclist = 0x0034fa0f:
[0xffffffff81892345, 0xffffffff8189237c): DW_OP_reg5 RDI
[0xffffffff8189237c, 0xffffffff818923bd): DW_OP_reg3 RBX
[0xffffffff818923bd, 0xffffffff818923d4): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
[0xffffffff818923d4, 0xffffffff81892dcb): DW_OP_reg3 RBX
[0xffffffff81892df3, 0xffffffff81892e01): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
[0xffffffff81892e01, 0xffffffff818932a9): DW_OP_reg3 RBX)
DW_AT_name ("attr")
DW_AT_type (0x01934d17 "bpf_attr *")
...
0x0195211d: DW_TAG_formal_parameter
DW_AT_location (indexed (0x31a) loclist = 0x0034f9dc:
[0xffffffff81892345, 0xffffffff81892357): DW_OP_piece 0x8, DW_OP_reg4 RSI, DW_OP_piece 0x1
[0xffffffff81892357, 0xffffffff81892f02): DW_OP_piece 0x8, DW_OP_breg7 RSP+20, DW_OP_deref_size 0x4, DW_OP_stack_value, DW_OP_piece 0x1
[0xffffffff81892f07, 0xffffffff818932a9): DW_OP_piece 0x8, DW_OP_breg7 RSP+20, DW_OP_deref_size 0x4, DW_OP_stack_value, DW_OP_piece 0x1)
DW_AT_name ("uattr")
DW_AT_type (0x019512ab "bpfptr_t")
...
For parameter 'uattr', only the second half of the parameter is used. For such cases,
the name and the type are changed in pahole and eventually end up in the vmlinux BTF:
[55697] FUNC_PROTO '(anon)' ret_type_id=106780 vlen=2
'attr' type_id=455
'uattr__is_kernel' type_id=82014
[82014] TYPEDEF 'bool' type_id=67434
[113251] FUNC 'map_create' type_id=55697 linkage=static
You can see the new parameter name is 'uattr__is_kernel' and the type is 'bool'.
This makes it easier for users to get the true signature.
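The renaming step can be illustrated with the sketch below. It is an assumption-laden illustration, not the patch code: sketch_field_offset() and sketch_join_name() are hypothetical helpers, the bitmaps are assumed to hold one bit per used field start offset within each address-sized half, and the GCC/Clang builtins __builtin_popcountl() and __builtin_ctzl() are assumed to be available.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the byte offset of the single used field, or -1 when zero or more
 * than one field is used across the two halves.
 */
static int sketch_field_offset(unsigned long lower, unsigned long upper,
			       unsigned int addr_size)
{
	if (__builtin_popcountl(lower) + __builtin_popcountl(upper) != 1)
		return -1;
	if (lower)
		return __builtin_ctzl(lower);
	return (int)addr_size + __builtin_ctzl(upper);
}

/* Build "<parm>__<member>"; the caller owns the returned string.
 * Returns NULL if the allocation fails.
 */
static char *sketch_join_name(const char *parm_name, const char *member_name)
{
	/* two underscores plus the terminating NUL */
	size_t len = strlen(parm_name) + strlen(member_name) + 3;
	char *name = malloc(len);

	if (name)
		snprintf(name, len, "%s__%s", parm_name, member_name);
	return name;
}
```

For 'uattr' above, only the upper half has a bit set, so the offset is 8 with an 8-byte address size, and joining "uattr" with the member name "is_kernel" gives "uattr__is_kernel".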
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
dwarf_loader.c | 225 +++++++++++++++++++++++++++++++++++++++++++++++--
dwarves.h | 1 +
2 files changed, 219 insertions(+), 7 deletions(-)
diff --git a/dwarf_loader.c b/dwarf_loader.c
index 712a957..3a9b2c0 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -1100,6 +1100,16 @@ static void arch__set_register_params(const GElf_Ehdr *ehdr, struct cu *cu)
}
}
+static bool arch__agg_use_two_regs(const GElf_Ehdr *ehdr)
+{
+ switch (ehdr->e_machine) {
+ case EM_S390:
+ return false;
+ default:
+ return true;
+ }
+}
+
static struct template_type_param *template_type_param__new(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
{
struct template_type_param *ttparm = tag__alloc(cu, sizeof(*ttparm));
@@ -1199,8 +1209,79 @@ struct func_info {
#define PARM_FBREG_FAIL -2
#define PARM_OPTIMIZED_CLANG -3
#define PARM_CONTINUE -4
+#define PARM_TWO_ADDR_LEN -5
+#define PARM_TO_BE_IMPROVED -6
+
+static int __get_type_byte_size(Dwarf_Die *die) {
+ Dwarf_Attribute attr;
+ if (dwarf_attr(die, DW_AT_type, &attr) == NULL)
+ return 0;
+
+ Dwarf_Die type_die;
+ if (dwarf_formref_die(&attr, &type_die) == NULL)
+ return 0;
+
+ uint64_t bsize = attr_numeric(&type_die, DW_AT_byte_size);
+ if (bsize == 0)
+ return __get_type_byte_size(&type_die);
+
+ return bsize;
+}
+
+static int get_type_byte_size(Dwarf_Die *die) {
+ int byte_size = 0;
+
+ Dwarf_Attribute attr;
+ if (dwarf_attr(die, DW_AT_abstract_origin, &attr)) {
+ Dwarf_Die origin;
+ if (dwarf_formref_die(&attr, &origin))
+ byte_size = __get_type_byte_size(&origin);
+ } else {
+ byte_size = __get_type_byte_size(die);
+ }
+ return byte_size;
+}
+
+/* Traverse the parameter type until a struct member at the expected byte
+ * offset is found.
+ */
+static Dwarf_Die *get_member_with_offset(Dwarf_Die *die, int offset, Dwarf_Die *member_die) {
+ Dwarf_Attribute attr;
+ if (dwarf_attr(die, DW_AT_type, &attr) == NULL)
+ return NULL;
+
+ Dwarf_Die type_die;
+ if (dwarf_formref_die(&attr, &type_die) == NULL)
+ return NULL;
+
+ uint64_t bsize = attr_numeric(&type_die, DW_AT_byte_size);
+ if (bsize == 0)
+ return get_member_with_offset(&type_die, offset, member_die);
+
+ if (dwarf_tag(&type_die) != DW_TAG_structure_type)
+ return NULL;
-static int parameter__multi_exprs(Dwarf_Op *expr, int loc_num) {
+ if (!dwarf_haschildren(&type_die) || dwarf_child(&type_die, member_die) != 0)
+ return NULL;
+ do {
+ if (dwarf_tag(member_die) != DW_TAG_member)
+ continue;
+
+ int off = attr_numeric(member_die, DW_AT_data_bit_offset);
+ if (off == offset * 8)
+ return member_die;
+ } while (dwarf_siblingof(member_die, member_die) == 0);
+
+ return NULL;
+}
+
+/* For the two-address-length case, lower_half and upper_half represent the parameter.
+ * They accumulate field information across possibly multiple
+ * location list entries.
+ */
+static int parameter__multi_exprs(Dwarf_Op *expr, int loc_num, struct cu *cu, size_t exprlen,
+ Dwarf_Die *die, int expected_reg, int byte_size,
+ unsigned long *lower_half, unsigned long *upper_half, int *ret) {
switch (expr[0].atom) {
case DW_OP_lit0 ... DW_OP_lit31:
case DW_OP_constu:
@@ -1210,9 +1291,119 @@ static int parameter__multi_exprs(Dwarf_Op *expr, int loc_num) {
return PARM_OPTIMIZED_CLANG;
}
+ if (byte_size <= cu->addr_size || !cu->agg_use_two_regs) {
+ switch (expr[0].atom) {
+ case DW_OP_reg0 ... DW_OP_reg31:
+ if (loc_num != 0)
+ break;
+ *ret = expr[0].atom;
+ if (*ret == expected_reg)
+ return *ret;
+ break;
+ case DW_OP_breg0 ... DW_OP_breg31:
+ if (loc_num != 0)
+ break;
+ bool has_op_stack_value = false;
+ for (int i = 1; i < exprlen; i++) {
+ if (expr[i].atom == DW_OP_stack_value) {
+ has_op_stack_value = true;
+ break;
+ }
+ }
+ if (!has_op_stack_value)
+ break;
+ /* The existence of DW_OP_stack_value means that
+ * DW_OP_bregX register is used as value.
+ */
+ *ret = expr[0].atom - DW_OP_breg0 + DW_OP_reg0;
+ if (*ret == expected_reg)
+ return *ret;
+ }
+ } else {
+ /* cu->addr_size * 2 */
+ int off = 0;
+ for (int i = 0; i < exprlen; i++) {
+ if (expr[i].atom == DW_OP_piece) {
+ int num = expr[i].number;
+ if (i == 0) {
+ off = num;
+ continue;
+ }
+ if (off < cu->addr_size) (*lower_half) |= (1 << off);
+ else (*upper_half) |= (1 << (off - cu->addr_size));
+ off += num;
+ } else if (expr[i].atom >= DW_OP_reg0 && expr[i].atom <= DW_OP_reg31) {
+ if (off < cu->addr_size)
+ *ret = expr[i].atom;
+ else if (*ret < 0)
+ *ret = expr[i].atom;
+ }
+ /* FIXME: not handling DW_OP_bregX yet since we do not have
+ * a use case for it yet for linux kernel.
+ */
+ }
+ }
+
return PARM_CONTINUE;
}
+/* The lower_half and upper_half, computed in parameter__multi_exprs(), are handled here.
+ */
+static int parameter__handle_two_addr_len(int expected_reg, unsigned long lower_half, unsigned long upper_half,
+ int ret, Dwarf_Die *die, struct conf_load *conf, struct cu *cu,
+ struct parameter *parm) {
+ if (!lower_half && !upper_half)
+ return ret;
+
+ if (ret != expected_reg)
+ return ret;
+
+ if (!conf->true_signature)
+ return PARM_DEFAULT_FAIL;
+
+ /* Both halves are used based on DWARF */
+ if (lower_half && upper_half)
+ return PARM_TWO_ADDR_LEN;
+
+ /* FIXME: parm->name may be NULL due to abstract origin. We do not want to
+ * update abstract origin as the type in abstract origin may be used
+ * in some other places. We could remove abstract origin in this parameter
+ * and add name and type in parameter itself. Right now, for current bpf-next
+ * repo, we do not have instances below where parm->name is NULL for x86_64 arch.
+ */
+ if (!parm->name)
+ return PARM_TO_BE_IMPROVED;
+
+ /* FIXME: Only support single field now so we can have a good parameter name and
+ * type for it.
+ */
+ if (__builtin_popcountll(lower_half) >= 2 || __builtin_popcountll(upper_half) >= 2)
+ return PARM_TO_BE_IMPROVED;
+
+ int field_offset;
+ if (__builtin_popcountll(lower_half) == 1)
+ field_offset = __builtin_ctzll(lower_half);
+ else
+ field_offset = cu->addr_size + __builtin_ctzll(upper_half);
+
+ /* FIXME: Only struct type is supported. */
+ Dwarf_Die member_die;
+ if (!get_member_with_offset(die, field_offset, &member_die))
+ return PARM_TO_BE_IMPROVED;
+
+ const char *member_name = attr_string(&member_die, DW_AT_name, conf);
+ int len = strlen(parm->name) + strlen(member_name) + 3;
+ char *new_name = malloc(len);
+ sprintf(new_name, "%s__%s", parm->name, member_name);
+ parm->name = new_name;
+
+ struct tag *tag = &parm->tag;
+ struct dwarf_tag *dtag = tag__dwarf(tag);
+ dwarf_tag__set_attr_type(dtag, type, &member_die, DW_AT_type);
+
+ return ret;
+}
+
/* For DW_AT_location 'attr':
* - if first location is DW_OP_regXX with expected number, return the register;
* otherwise save the register for later return
@@ -1220,15 +1411,18 @@ static int parameter__multi_exprs(Dwarf_Op *expr, int loc_num) {
* list, return the register; otherwise save register for later return
* - otherwise if no register was found for locations, return PARM_DEFAULT_FAIL.
*/
-static int parameter__reg(Dwarf_Attribute *attr, int expected_reg, struct cu *cu, struct conf_load *conf)
+static int parameter__reg(Dwarf_Attribute *attr, int expected_reg, struct cu *cu, struct conf_load *conf,
+ Dwarf_Die *die, struct parameter *parm, struct func_info *info)
{
Dwarf_Addr base, start, end;
Dwarf_Op *expr, *entry_ops;
Dwarf_Attribute entry_attr;
size_t exprlen, entry_len;
ptrdiff_t offset = 0;
+ int byte_size = 0;
int loc_num = -1;
int ret = PARM_DEFAULT_FAIL;
+ unsigned long lower_half = 0, upper_half = 0;
/* use libdw__lock as dwarf_getlocation(s) has concurrency issues
* when libdw is not compiled with experimental --enable-thread-safety
@@ -1248,8 +1442,15 @@ static int parameter__reg(Dwarf_Attribute *attr, int expected_reg, struct cu *cu
if (!cu->producer_clang || !conf->true_signature)
continue;
+ if (!byte_size)
+ byte_size = get_type_byte_size(die);
+ /* This should not happen. */
+ if (!byte_size)
+ return PARM_DEFAULT_FAIL;
+
int res;
- res = parameter__multi_exprs(expr, loc_num);
+ res = parameter__multi_exprs(expr, loc_num, cu, exprlen, die, expected_reg,
+ byte_size, &lower_half, &upper_half, &ret);
if (res == PARM_CONTINUE)
continue;
ret = res;
@@ -1297,6 +1498,10 @@ static int parameter__reg(Dwarf_Attribute *attr, int expected_reg, struct cu *cu
break;
}
}
+
+ ret = parameter__handle_two_addr_len(expected_reg, lower_half, upper_half,
+ ret, die, conf, cu, parm);
+
out:
pthread_mutex_unlock(&libdw__lock);
return ret;
@@ -1332,8 +1537,6 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
}
}
reg_idx = param_idx - info->skip_idx;
- if (reg_idx >= cu->nr_register_params)
- return parm;
/* Parameters which use DW_AT_abstract_origin to point at
* the original parameter definition (with no name in the DIE)
* are the result of later DWARF generation during compilation
@@ -1371,15 +1574,22 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
parm->has_loc = dwarf_attr(die, DW_AT_location, &attr) != NULL;
if (parm->has_loc) {
+ if (reg_idx >= cu->nr_register_params)
+ return parm;
+
int expected_reg = cu->register_params[reg_idx];
- int actual_reg = parameter__reg(&attr, expected_reg, cu, conf);
+ int actual_reg = parameter__reg(&attr, expected_reg, cu, conf, die, parm, info);
if (actual_reg == PARM_DEFAULT_FAIL) {
parm->optimized = 1;
} else if (actual_reg == PARM_OPTIMIZED_CLANG) {
parm->optimized = 1;
info->skip_idx++;
- } else if (actual_reg == PARM_FBREG_FAIL || (expected_reg >= 0 && expected_reg != actual_reg)) {
+ } else if (actual_reg == PARM_TWO_ADDR_LEN) {
+ /* account for parameter with two registers */
+ info->skip_idx--;
+ } else if (actual_reg == PARM_FBREG_FAIL || actual_reg == PARM_TO_BE_IMPROVED ||
+ (expected_reg >= 0 && expected_reg != actual_reg)) {
/* mark parameters that use an unexpected
* register to hold a parameter; these will
* be problematic for users of BTF as they
@@ -3419,6 +3629,7 @@ static int cu__set_common(struct cu *cu, struct conf_load *conf,
cu->little_endian = ehdr.e_ident[EI_DATA] == ELFDATA2LSB;
cu->nr_register_params = arch__nr_register_params(&ehdr);
+ cu->agg_use_two_regs = arch__agg_use_two_regs(&ehdr);
arch__set_register_params(&ehdr, cu);
return 0;
}
diff --git a/dwarves.h b/dwarves.h
index ad33828..b7bae87 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -303,6 +303,7 @@ struct cu {
uint8_t uses_global_strings:1;
uint8_t little_endian:1;
uint8_t producer_clang:1;
+ uint8_t agg_use_two_regs:1; /* An aggregate like {long a; long b;} is passed in two registers */
uint8_t nr_register_params;
int register_params[ARCH_MAX_REGISTER_PARAMS];
int functions_saved;
--
2.47.3
Thread overview:
2026-03-05 22:54 [PATCH dwarves 0/9] pahole: Encode true signatures in kernel BTF Yonghong Song
2026-03-05 22:55 ` [PATCH dwarves 1/9] dwarf_loader: Reduce parameter checking with clang DW_AT_calling_convention attr Yonghong Song
2026-03-19 12:32 ` Jiri Olsa
2026-03-19 17:31 ` Yonghong Song
2026-03-05 22:55 ` [PATCH dwarves 2/9] dwarf_loader: Handle signatures with dead arguments Yonghong Song
2026-03-19 18:55 ` Alan Maguire
2026-03-20 5:00 ` Yonghong Song
2026-03-20 19:20 ` Yonghong Song
2026-03-05 22:55 ` [PATCH dwarves 3/9] dwarf_loader: Refactor initial ret -1 to be macro PARM_DEFAULT_FAIL Yonghong Song
2026-03-05 22:55 ` [PATCH dwarves 4/9] dwarf_laoder: Handle locations with DW_OP_fbreg Yonghong Song
2026-03-05 22:55 ` [PATCH dwarves 5/9] dwarf_loader: Change exprlen checking condition in parameter__reg() Yonghong Song
2026-03-05 22:55 ` [PATCH dwarves 6/9] dwarf_loader: Detect optimized parameters with locations having constant values Yonghong Song
2026-03-05 22:55 ` Yonghong Song [this message]
2026-03-05 22:55 ` [PATCH dwarves 8/9] btf_encoder: Handle optimized parameter properly Yonghong Song
2026-03-05 22:55 ` [PATCH dwarves 9/9] tests: Add a few clang true signature tests Yonghong Song
2026-03-19 18:48 ` Alan Maguire
2026-03-20 4:52 ` Yonghong Song