From: Eduard Zingerman <eddyz87@gmail.com>
To: Alan Maguire <alan.maguire@oracle.com>, dwarves@vger.kernel.org
Cc: andrii@kernel.org, ast@kernel.org, daniel@iogearbox.net,
	martin.lau@linux.dev, acme@kernel.org, ttreyer@meta.com,
	yonghong.song@linux.dev, song@kernel.org,
	john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
	haoluo@google.com, jolsa@kernel.org, qmo@kernel.org,
	ihor.solodrai@linux.dev, david.faust@oracle.com,
	jose.marchesi@oracle.com, bpf@vger.kernel.org
Subject: Re: [RFC dwarves 3/5] dwarf_loader: Collect inline expansion location information
Date: Fri, 24 Oct 2025 10:55:14 -0700
Message-ID: <6558dc0590b174174321899af9981053db76845c.camel@gmail.com>
In-Reply-To: <20251024073328.370457-4-alan.maguire@oracle.com>

On Fri, 2025-10-24 at 08:33 +0100, Alan Maguire wrote:
> Collect location information for parameters, inline expansions and ensure it
> does not rely on aspects of the CU that go away when it is freed.
> 
> (This is a slightly differerent approach from Thierry's but it was helped
> greatly by his series; would happily add a Co-developed by here or
> whatever suits)
> 
> Signed-off-by: Alan Maguire <alan.maguire>
> ---
>  dwarf_loader.c | 277 +++++++++++++++++++++++++++++++++++++++----------
>  dwarves.h      |  48 ++++++++-
>  2 files changed, 266 insertions(+), 59 deletions(-)
> 
> diff --git a/dwarf_loader.c b/dwarf_loader.c
> index 4656575..a7ae497 100644
> --- a/dwarf_loader.c
> +++ b/dwarf_loader.c
> @@ -1185,29 +1185,54 @@ static ptrdiff_t __dwarf_getlocations(Dwarf_Attribute *attr,
>  	return ret;
>  }
>  
> -/* For DW_AT_location 'attr':
> - * - if first location is DW_OP_regXX with expected number, return the register;
> - *   otherwise save the register for later return
> - * - if location DW_OP_entry_value(DW_OP_regXX) with expected number is in the
> - *   list, return the register; otherwise save register for later return
> - * - otherwise if no register was found for locations, return -1.
> +/* Retrieve location information for parameter; focus on simple locations
> + * like constants and register values.  Support multiple registers as
> + * it is possible for a value (struct) to be passed via multiple registers.
> + * Handle edge cases like multiple instances of same location value, but
> + * avoid cases with large (>1 size) expressions to keep things simple.
> + * This covers the vast majority of cases.  The only unhandled atom is
> + * DW_OP_GNU_parameter_ref; future work could add that and improve
> + * location handling.  In practice the below supports the majority
> + * of parameter locations.
>   */
> -static int parameter__reg(Dwarf_Attribute *attr, int expected_reg)
> +static int parameter__locs(Dwarf_Die *die, Dwarf_Attribute *attr, struct parameter *parm)
>  {
> -	Dwarf_Addr base, start, end;
> -	Dwarf_Op *expr, *entry_ops;
> -	Dwarf_Attribute entry_attr;
> -	size_t exprlen, entry_len;
> +	Dwarf_Addr base, start, end, first = -1;
> +	Dwarf_Attribute next_attr;
>  	ptrdiff_t offset = 0;
> -	int loc_num = -1;
> +	Dwarf_Op *expr;
> +	size_t exprlen;
>  	int ret = -1;
>  
> +	/* parameter__locs() can be called recursively, but at toplevel
> +	 * die is non-NULL signalling we need to look up loc/const attrs.
> +	 */
> +	if (die) {
> +		if (dwarf_attr(die, DW_AT_const_value, attr) != NULL) {
> +			parm->has_loc = 1;
> +			parm->optimized = 1;
> +			parm->locs[0].is_const = 1;
> +			parm->nlocs = 1;
> +			parm->locs[0].size = 8;
> +			parm->locs[0].value = attr_numeric(die, DW_AT_const_value);
> +			return 0;
> +		}
> +		if (dwarf_attr(die, DW_AT_location, attr) == NULL)
> +			return 0;
> +	}
> +
>  	/* use libdw__lock as dwarf_getlocation(s) has concurrency issues
>  	 * when libdw is not compiled with experimental --enable-thread-safety
>  	 */
>  	pthread_mutex_lock(&libdw__lock);
>  	while ((offset = __dwarf_getlocations(attr, offset, &base, &start, &end, &expr, &exprlen)) > 0) {
> -		loc_num++;
> +		/* We only want location info referring to start of function;
> +		 * assumes we get location info in address order; empirically
> +		 * this is the case.  Only exception is DW_OP_*entry_value
> +		 * location info which always refers to the value on entry.
> +		 */
> +		if (first == -1)

<moving comments from github>

Note: an alternative is to check that the address range associated with
the location corresponds to the starting address of the inline expansion,
e.g. as in [1]. I think that is the more correct approach.

[1] https://github.com/eddyz87/inline-address-printer/blob/master/main.c#L184
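
To make the idea concrete, here is a rough sketch of such a check. It is
not copied from [1]; `struct loc_range`, `loc_covers_entry()` and
`find_entry_loc()` are hypothetical names, and it assumes each location
entry is valid for a half-open PC range [start, end), as in DWARF
location lists, and that the entry address of the inline expansion is
already known:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one location-list entry: the expression
 * is assumed valid for PCs in the half-open range [start, end).
 */
struct loc_range {
	uint64_t start;
	uint64_t end;
};

/* True if the entry describes the parameter at the first instruction
 * of the inline expansion, i.e. entry_pc lies inside [start, end).
 */
static bool loc_covers_entry(const struct loc_range *r, uint64_t entry_pc)
{
	return r->start <= entry_pc && entry_pc < r->end;
}

/* Return the index of the first entry covering entry_pc, or -1 if no
 * entry does.  Does not rely on entries being in address order.
 */
static int find_entry_loc(const struct loc_range *ranges, int n,
			  uint64_t entry_pc)
{
	for (int i = 0; i < n; i++)
		if (loc_covers_entry(&ranges[i], entry_pc))
			return i;
	return -1;
}
```

Compared to remembering the first `start` seen, this does not depend on
the order in which location entries are returned.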

> +			first = start;
>  
>  		/* Convert expression list (XX DW_OP_stack_value) -> (XX).
>  		 * DW_OP_stack_value instructs interpreter to pop current value from
> @@ -1216,33 +1241,154 @@ static int parameter__reg(Dwarf_Attribute *attr, int expected_reg)
>  		if (exprlen > 1 && expr[exprlen - 1].atom == DW_OP_stack_value)
>  			exprlen--;
>  
> -		if (exprlen != 1)
> -			continue;
> +		if (exprlen > 1) {
> +			/* ignore complex exprs not at start of function,
> +			 * but bail if we hit a complex loc expr at the start.
> +			 */
> +			if (start != first)
> +				continue;
> +			ret = -1;
> +			goto out;
> +		}
>  
>  		switch (expr->atom) {
> -		/* match DW_OP_regXX at first location */
> +		case DW_OP_deref:
> +			if (parm->nlocs > 0)
> +				parm->locs[parm->nlocs - 1].is_deref = 1;
> +			else
> +				ret = -1;
> +			break;
>  		case DW_OP_reg0 ... DW_OP_reg31:
> -			if (loc_num != 0)
> +			if (start != first || parm->nlocs > 1)
> +				break;
> +			/* avoid duplicate location value */
> +			if (parm->nlocs > 0 && parm->locs[parm->nlocs - 1].reg ==
> +					       (expr->atom - DW_OP_reg0))
> +				break;
> +			parm->locs[parm->nlocs].reg = expr->atom - DW_OP_reg0;
> +			parm->locs[parm->nlocs].is_deref = 0;
> +			parm->locs[parm->nlocs].size = 8;
> +			parm->locs[parm->nlocs++].offset = 0;
> +			ret = 0;
> +			break;
> +		case DW_OP_fbreg:
> +		case DW_OP_breg0 ... DW_OP_breg31:
> +			if (start != first || parm->nlocs > 1)
>  				break;
> -			ret = expr->atom;
> -			if (ret == expected_reg)
> -				goto out;
> +			/* avoid duplicate location value */
> +			if (parm->nlocs > 0 && parm->locs[parm->nlocs - 1].reg ==
> +					       (expr->atom - DW_OP_breg0)) {
> +				if (parm->locs[parm->nlocs - 1].offset != expr->offset)
> +					ret = -1;
> +				break;
> +			}
> +			parm->locs[parm->nlocs].reg = expr->atom - DW_OP_breg0;
> +			parm->locs[parm->nlocs].is_deref = 1;
> +			parm->locs[parm->nlocs].size = 8;
> +			parm->locs[parm->nlocs++].offset = expr->offset;

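I think this should be `expr->number`: in `Dwarf_Op`, `offset` is the
position of the operation within the location expression, while the
operand is stored in `number`: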

  /* One operation in a DWARF location expression.
     A location expression is an array of these.  */
  typedef struct
  {
    uint8_t atom;                 /* Operation */
    Dwarf_Word number;            /* Operand */
    Dwarf_Word number2;           /* Possible second operand */
    Dwarf_Word offset;            /* Offset in location expression */
  } Dwarf_Op;
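
A small illustration of the difference, as a sketch only. `struct
sketch_op` is a local mirror of the struct above so that this builds
without elfutils, and `breg_regno()` / `breg_displacement()` are
hypothetical helpers. It assumes the signed operand of DW_OP_bregN is
stored in the unsigned `number` field and has to be cast back to a
signed type:

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the Dwarf_Op layout quoted above (assumption:
 * Dwarf_Word is a 64-bit unsigned integer).
 */
struct sketch_op {
	uint8_t atom;       /* Operation */
	uint64_t number;    /* Operand */
	uint64_t number2;   /* Possible second operand */
	uint64_t offset;    /* Offset in location expression */
};

/* Opcode value of DW_OP_breg0 from the DWARF specification. */
enum { SKETCH_DW_OP_breg0 = 0x70 };

/* Register number encoded in a DW_OP_bregN opcode. */
static int breg_regno(const struct sketch_op *op)
{
	return op->atom - SKETCH_DW_OP_breg0;
}

/* Displacement from the register: this is the operand, i.e. 'number',
 * reinterpreted as signed.  'offset' only says where the operation
 * sits inside the expression and is 0 for the first operation.
 */
static int64_t breg_displacement(const struct sketch_op *op)
{
	return (int64_t)op->number;
}
```

For a single-operation expression such as `DW_OP_breg7 -16`, `offset`
would be 0 while `number` carries the -16, so storing `expr->offset`
would lose the displacement.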

> +			ret = 0;
> +			break;
> +		case DW_OP_lit0 ... DW_OP_lit31:
> +			if (start != first)
> +				break;
> +
> +			if (parm->nlocs > 0 && (expr->atom - DW_OP_lit0) ==
> +					       parm->locs[parm->nlocs - 1].value)
> +				break;
> +			parm->locs[parm->nlocs].is_const = 1;
> +			parm->locs[parm->nlocs].size = 1;
> +			parm->locs[parm->nlocs++].value = expr->atom - DW_OP_lit0;
> +			ret = 0;
> +			break;
> +		case DW_OP_const1u ... DW_OP_consts:
> +			if (start != first)
> +				break;
> +			if (parm->nlocs > 0 && (parm->locs[parm->nlocs - 1].is_const &&
> +			    expr->number == parm->locs[parm->nlocs - 1].value))
> +				break;
> +			parm->locs[parm->nlocs].is_const = 1;
> +			parm->locs[parm->nlocs].value = expr->number;
> +			switch (expr->atom) {
> +			case DW_OP_const1u:
> +				parm->locs[parm->nlocs].size = 1;
> +				break;
> +			case DW_OP_const1s:
> +				parm->locs[parm->nlocs].size = -1;
> +				break;
> +			case DW_OP_const2u:
> +				parm->locs[parm->nlocs].size = 2;
> +				break;
> +			case DW_OP_const2s:
> +				parm->locs[parm->nlocs].size = -2;
> +				break;
> +			case DW_OP_const4u:
> +				parm->locs[parm->nlocs].size = 4;
> +				break;
> +			case DW_OP_const4s:
> +				parm->locs[parm->nlocs].size = -4;
> +				break;
> +			case DW_OP_const8u:
> +			case DW_OP_constu:
> +				parm->locs[parm->nlocs].size = 8;
> +				break;
> +			case DW_OP_const8s:
> +			case DW_OP_consts:
> +				parm->locs[parm->nlocs].size = -8;
> +				break;
> +			}
> +			parm->nlocs++;
> +			ret = 0;
> +			break;
> +		case DW_OP_addr:
> +			if (start != first || parm->nlocs > 0)
> +				break;
> +			parm->locs[parm->nlocs].is_const = 1;
> +			parm->locs[parm->nlocs].is_addr = 1;
> +			parm->locs[parm->nlocs].size = 8;
> +			parm->locs[parm->nlocs++].value = expr->number;
> +			ret = 0;
>  			break;
> -		/* match DW_OP_entry_value(DW_OP_regXX) at any location */
>  		case DW_OP_entry_value:
>  		case DW_OP_GNU_entry_value:
> -			if (dwarf_getlocation_attr(attr, expr, &entry_attr) == 0 &&
> -			    dwarf_getlocation(&entry_attr, &entry_ops, &entry_len) == 0 &&
> -			    entry_len == 1) {
> -				ret = entry_ops->atom;
> -				if (ret == expected_reg)
> -					goto out;
> +			/* Match DW_OP_entry_value(DW_OP_regXX) at any offset
> +			 * in function since it always describes value on entry.
> +			 */
> +			if (dwarf_getlocation_attr(attr, expr, &next_attr) == 0) {
> +				pthread_mutex_unlock(&libdw__lock);
> +				return parameter__locs(NULL, &next_attr, parm);
>  			}
> +			ret = -1;
> +			break;
> +		case DW_OP_implicit_pointer:
> +			if (start != first)
> +				break;
> +			if (dwarf_getlocation_implicit_pointer(attr, expr, &next_attr) == 0) {
> +				pthread_mutex_unlock(&libdw__lock);
> +				return parameter__locs(NULL, &next_attr, parm);
> +			}
> +			ret = -1;
> +			break;
> +		case DW_OP_implicit_value:
> +			if (start != first)
> +				break;
> +			if (dwarf_getlocation_attr(attr, expr, &next_attr) == 0) {
> +				pthread_mutex_unlock(&libdw__lock);
> +				return parameter__locs(NULL, &next_attr, parm);
> +			}
> +			ret = -1;
> +			break;
> +		default:
> +			/* unhandled op */
> +			ret = -1;
>  			break;
>  		}
> +		if (ret == -1)
> +			break;
>  	}
>  out:
>  	pthread_mutex_unlock(&libdw__lock);
> +	if (ret == 0)
> +		parm->has_loc = 1;
>  	return ret;
>  }
>  

[...]
