From: Eduard Zingerman <eddyz87@gmail.com>
To: Alan Maguire <alan.maguire@oracle.com>, dwarves@vger.kernel.org
Cc: andrii@kernel.org, ast@kernel.org, daniel@iogearbox.net,
martin.lau@linux.dev, acme@kernel.org, ttreyer@meta.com,
yonghong.song@linux.dev, song@kernel.org,
john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
haoluo@google.com, jolsa@kernel.org, qmo@kernel.org,
ihor.solodrai@linux.dev, david.faust@oracle.com,
jose.marchesi@oracle.com, bpf@vger.kernel.org
Subject: Re: [RFC dwarves 3/5] dwarf_loader: Collect inline expansion location information
Date: Fri, 24 Oct 2025 10:55:14 -0700
Message-ID: <6558dc0590b174174321899af9981053db76845c.camel@gmail.com>
In-Reply-To: <20251024073328.370457-4-alan.maguire@oracle.com>
On Fri, 2025-10-24 at 08:33 +0100, Alan Maguire wrote:
> Collect location information for parameters, inline expansions and ensure it
> does not rely on aspects of the CU that go away when it is freed.
>
> (This is a slightly different approach from Thierry's but it was helped
> greatly by his series; would happily add a Co-developed-by here or
> whatever suits)
>
> Signed-off-by: Alan Maguire <alan.maguire>
> ---
> dwarf_loader.c | 277 +++++++++++++++++++++++++++++++++++++++----------
> dwarves.h | 48 ++++++++-
> 2 files changed, 266 insertions(+), 59 deletions(-)
>
> diff --git a/dwarf_loader.c b/dwarf_loader.c
> index 4656575..a7ae497 100644
> --- a/dwarf_loader.c
> +++ b/dwarf_loader.c
> @@ -1185,29 +1185,54 @@ static ptrdiff_t __dwarf_getlocations(Dwarf_Attribute *attr,
> return ret;
> }
>
> -/* For DW_AT_location 'attr':
> - * - if first location is DW_OP_regXX with expected number, return the register;
> - * otherwise save the register for later return
> - * - if location DW_OP_entry_value(DW_OP_regXX) with expected number is in the
> - * list, return the register; otherwise save register for later return
> - * - otherwise if no register was found for locations, return -1.
> +/* Retrieve location information for parameter; focus on simple locations
> + * like constants and register values. Support multiple registers as
> + * it is possible for a value (struct) to be passed via multiple registers.
> + * Handle edge cases like multiple instances of same location value, but
> + * avoid cases with large (>1 size) expressions to keep things simple.
> + * This covers the vast majority of cases. The only unhandled atom is
> + * DW_OP_GNU_parameter_ref; future work could add that and improve
> + * location handling. In practice the below supports the majority
> + * of parameter locations.
> */
> -static int parameter__reg(Dwarf_Attribute *attr, int expected_reg)
> +static int parameter__locs(Dwarf_Die *die, Dwarf_Attribute *attr, struct parameter *parm)
> {
> - Dwarf_Addr base, start, end;
> - Dwarf_Op *expr, *entry_ops;
> - Dwarf_Attribute entry_attr;
> - size_t exprlen, entry_len;
> + Dwarf_Addr base, start, end, first = -1;
> + Dwarf_Attribute next_attr;
> ptrdiff_t offset = 0;
> - int loc_num = -1;
> + Dwarf_Op *expr;
> + size_t exprlen;
> int ret = -1;
>
> + /* parameter__locs() can be called recursively, but at toplevel
> + * die is non-NULL signalling we need to look up loc/const attrs.
> + */
> + if (die) {
> + if (dwarf_attr(die, DW_AT_const_value, attr) != NULL) {
> + parm->has_loc = 1;
> + parm->optimized = 1;
> + parm->locs[0].is_const = 1;
> + parm->nlocs = 1;
> + parm->locs[0].size = 8;
> + parm->locs[0].value = attr_numeric(die, DW_AT_const_value);
> + return 0;
> + }
> + if (dwarf_attr(die, DW_AT_location, attr) == NULL)
> + return 0;
> + }
> +
> /* use libdw__lock as dwarf_getlocation(s) has concurrency issues
> * when libdw is not compiled with experimental --enable-thread-safety
> */
> pthread_mutex_lock(&libdw__lock);
> while ((offset = __dwarf_getlocations(attr, offset, &base, &start, &end, &expr, &exprlen)) > 0) {
> - loc_num++;
> + /* We only want location info referring to start of function;
> + * assumes we get location info in address order; empirically
> + * this is the case. Only exception is DW_OP_*entry_value
> + * location info which always refers to the value on entry.
> + */
> + if (first == -1)
<moving comments from GitHub>
Note: an alternative is to check that the address range associated with
the location corresponds to the starting address of the inline expansion,
e.g. as in [1]. I think that is the more correct approach.
[1] https://github.com/eddyz87/inline-address-printer/blob/master/main.c#L184
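
To make the suggestion concrete, here is a minimal sketch of the range check. It is not the code from [1] and it does not call libdw: `struct loc_range`, `loc_range__covers_entry()` and `loc_ranges__find_entry()` are hypothetical names, and I am assuming that the (start, end) pair reported for each location entry is a half-open range [start, end) and that the starting address of the inline expansion is already known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one location list entry: the address range
 * for which the location expression is valid, assumed half-open.
 */
struct loc_range {
	uint64_t start;
	uint64_t end;
};

/* True if the entry is valid at the first address of the inline expansion. */
static bool loc_range__covers_entry(const struct loc_range *r, uint64_t entry_pc)
{
	return r->start <= entry_pc && entry_pc < r->end;
}

/* Return the index of the first entry valid at entry_pc, or -1 if none is.
 * Unlike remembering the first 'start' seen, this does not depend on the
 * order in which the entries are reported.
 */
static int loc_ranges__find_entry(const struct loc_range *ranges, size_t n,
				  uint64_t entry_pc)
{
	assert(ranges != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (loc_range__covers_entry(&ranges[i], entry_pc))
			return (int)i;
	}
	return -1;
}
```

In the loop above this would replace the `start != first` tests with a check against the expansion's starting address, so an entry that begins a few instructions into the expansion is skipped rather than treated as the value on entry.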
> + first = start;
>
> /* Convert expression list (XX DW_OP_stack_value) -> (XX).
> * DW_OP_stack_value instructs interpreter to pop current value from
> @@ -1216,33 +1241,154 @@ static int parameter__reg(Dwarf_Attribute *attr, int expected_reg)
> if (exprlen > 1 && expr[exprlen - 1].atom == DW_OP_stack_value)
> exprlen--;
>
> - if (exprlen != 1)
> - continue;
> + if (exprlen > 1) {
> + /* ignore complex exprs not at start of function,
> + * but bail if we hit a complex loc expr at the start.
> + */
> + if (start != first)
> + continue;
> + ret = -1;
> + goto out;
> + }
>
> switch (expr->atom) {
> - /* match DW_OP_regXX at first location */
> + case DW_OP_deref:
> + if (parm->nlocs > 0)
> + parm->locs[parm->nlocs - 1].is_deref = 1;
> + else
> + ret = -1;
> + break;
> case DW_OP_reg0 ... DW_OP_reg31:
> - if (loc_num != 0)
> + if (start != first || parm->nlocs > 1)
> + break;
> + /* avoid duplicate location value */
> + if (parm->nlocs > 0 && parm->locs[parm->nlocs - 1].reg ==
> + (expr->atom - DW_OP_reg0))
> + break;
> + parm->locs[parm->nlocs].reg = expr->atom - DW_OP_reg0;
> + parm->locs[parm->nlocs].is_deref = 0;
> + parm->locs[parm->nlocs].size = 8;
> + parm->locs[parm->nlocs++].offset = 0;
> + ret = 0;
> + break;
> + case DW_OP_fbreg:
> + case DW_OP_breg0 ... DW_OP_breg31:
> + if (start != first || parm->nlocs > 1)
> break;
> - ret = expr->atom;
> - if (ret == expected_reg)
> - goto out;
> + /* avoid duplicate location value */
> + if (parm->nlocs > 0 && parm->locs[parm->nlocs - 1].reg ==
> + (expr->atom - DW_OP_breg0)) {
> + if (parm->locs[parm->nlocs - 1].offset != expr->offset)
> + ret = -1;
> + break;
> + }
> + parm->locs[parm->nlocs].reg = expr->atom - DW_OP_breg0;
> + parm->locs[parm->nlocs].is_deref = 1;
> + parm->locs[parm->nlocs].size = 8;
> + parm->locs[parm->nlocs++].offset = expr->offset;
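I think this should be `expr->number`; per the comments below, `offset` is
the operation's position within the location expression, not its operand: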
/* One operation in a DWARF location expression.
A location expression is an array of these. */
typedef struct
{
uint8_t atom; /* Operation */
Dwarf_Word number; /* Operand */
Dwarf_Word number2; /* Possible second operand */
Dwarf_Word offset; /* Offset in location expression */
} Dwarf_Op;
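
A small sketch of the difference, under stated assumptions: the typedef below is a local mirror of the struct quoted above so that it builds without elfutils, `Dwarf_Word` is assumed to be an unsigned 64-bit integer, the signed operand of DW_OP_bregN is assumed to be stored sign-extended in `number`, and `breg_op__reg()` / `breg_op__displacement()` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the layout quoted above (assumption: 64-bit Dwarf_Word). */
typedef uint64_t Dwarf_Word;

typedef struct {
	uint8_t atom;		/* Operation */
	Dwarf_Word number;	/* Operand */
	Dwarf_Word number2;	/* Possible second operand */
	Dwarf_Word offset;	/* Offset in location expression */
} Dwarf_Op;

/* DWARF opcode values for the DW_OP_breg0..DW_OP_breg31 range. */
#define DW_OP_breg0	0x70
#define DW_OP_breg31	0x8f

/* Register number encoded in a DW_OP_bregN opcode. */
static int breg_op__reg(const Dwarf_Op *op)
{
	assert(op->atom >= DW_OP_breg0 && op->atom <= DW_OP_breg31);
	return op->atom - DW_OP_breg0;
}

/* Signed displacement from the base register: taken from 'number',
 * not from 'offset', which only says where the op sits in the expression.
 */
static int64_t breg_op__displacement(const Dwarf_Op *op)
{
	assert(op->atom >= DW_OP_breg0 && op->atom <= DW_OP_breg31);
	return (int64_t)op->number;
}
```

For a single-operation expression `offset` is 0, so storing `expr->offset` would record a zero displacement for every DW_OP_bregN location, whatever the real one is.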
> + ret = 0;
> + break;
> + case DW_OP_lit0 ... DW_OP_lit31:
> + if (start != first)
> + break;
> +
> + if (parm->nlocs > 0 && (expr->atom - DW_OP_lit0) ==
> + parm->locs[parm->nlocs - 1].value)
> + break;
> + parm->locs[parm->nlocs].is_const = 1;
> + parm->locs[parm->nlocs].size = 1;
> + parm->locs[parm->nlocs++].value = expr->atom - DW_OP_lit0;
> + ret = 0;
> + break;
> + case DW_OP_const1u ... DW_OP_consts:
> + if (start != first)
> + break;
> + if (parm->nlocs > 0 && (parm->locs[parm->nlocs - 1].is_const &&
> + expr->number == parm->locs[parm->nlocs - 1].value))
> + break;
> + parm->locs[parm->nlocs].is_const = 1;
> + parm->locs[parm->nlocs].value = expr->number;
> + switch (expr->atom) {
> + case DW_OP_const1u:
> + parm->locs[parm->nlocs].size = 1;
> + break;
> + case DW_OP_const1s:
> + parm->locs[parm->nlocs].size = -1;
> + break;
> + case DW_OP_const2u:
> + parm->locs[parm->nlocs].size = 2;
> + break;
> + case DW_OP_const2s:
> + parm->locs[parm->nlocs].size = -2;
> + break;
> + case DW_OP_const4u:
> + parm->locs[parm->nlocs].size = 4;
> + break;
> + case DW_OP_const4s:
> + parm->locs[parm->nlocs].size = -4;
> + break;
> + case DW_OP_const8u:
> + case DW_OP_constu:
> + parm->locs[parm->nlocs].size = 8;
> + break;
> + case DW_OP_const8s:
> + case DW_OP_consts:
> + parm->locs[parm->nlocs].size = -8;
> + break;
> + }
> + parm->nlocs++;
> + ret = 0;
> + break;
> + case DW_OP_addr:
> + if (start != first || parm->nlocs > 0)
> + break;
> + parm->locs[parm->nlocs].is_const = 1;
> + parm->locs[parm->nlocs].is_addr = 1;
> + parm->locs[parm->nlocs].size = 8;
> + parm->locs[parm->nlocs++].value = expr->number;
> + ret = 0;
> break;
> - /* match DW_OP_entry_value(DW_OP_regXX) at any location */
> case DW_OP_entry_value:
> case DW_OP_GNU_entry_value:
> - if (dwarf_getlocation_attr(attr, expr, &entry_attr) == 0 &&
> - dwarf_getlocation(&entry_attr, &entry_ops, &entry_len) == 0 &&
> - entry_len == 1) {
> - ret = entry_ops->atom;
> - if (ret == expected_reg)
> - goto out;
> + /* Match DW_OP_entry_value(DW_OP_regXX) at any offset
> + * in function since it always describes value on entry.
> + */
> + if (dwarf_getlocation_attr(attr, expr, &next_attr) == 0) {
> + pthread_mutex_unlock(&libdw__lock);
> + return parameter__locs(NULL, &next_attr, parm);
> }
> + ret = -1;
> + break;
> + case DW_OP_implicit_pointer:
> + if (start != first)
> + break;
> + if (dwarf_getlocation_implicit_pointer(attr, expr, &next_attr) == 0) {
> + pthread_mutex_unlock(&libdw__lock);
> + return parameter__locs(NULL, &next_attr, parm);
> + }
> + ret = -1;
> + break;
> + case DW_OP_implicit_value:
> + if (start != first)
> + break;
> + if (dwarf_getlocation_attr(attr, expr, &next_attr) == 0) {
> + pthread_mutex_unlock(&libdw__lock);
> + return parameter__locs(NULL, &next_attr, parm);
> + }
> + ret = -1;
> + break;
> + default:
> + /* unhandled op */
> + ret = -1;
> break;
> }
> + if (ret == -1)
> + break;
> }
> out:
> pthread_mutex_unlock(&libdw__lock);
> + if (ret == 0)
> + parm->has_loc = 1;
> return ret;
> }
>
[...]