From: Yonghong Song <yonghong.song@linux.dev>
To: Yao Zi <ziyao@disroot.org>, Alan Maguire <alan.maguire@oracle.com>
Cc: dwarves@vger.kernel.org, bpf@vger.kernel.org, q66 <me@q66.moe>
Subject: Re: [PATCH dwarves] dwarf_loader: Handle DW_AT_location attrs containing DW_OP_plus_uconst
Date: Wed, 3 Dec 2025 16:46:20 -0800
Message-ID: <a3f82302-09d2-45e1-a30a-38a32ddbf947@linux.dev>
In-Reply-To: <20251130032113.4938-2-ziyao@disroot.org>



On 11/29/25 7:21 PM, Yao Zi wrote:
> LLVM has a GlobalMerge pass, which tries to group multiple global
> variables together and address them through a single register with
> offsets coded in instructions, to reduce register pressure. The address
> of a symbol transformed by the pass may be represented by a DWARF
> expression consisting of DW_OP_addrx and DW_OP_plus_uconst, which
> naturally matches the way a merged variable is addressed.
>
> However, our dwarf_loader currently ignores anything but the first
> operation in the location expression, including the DW_OP_plus_uconst
> atom, which appears as the second operation in this case. This could
> result in broken BTF information produced by pahole, where several
> merged symbols are given the same offset, even though in fact they
> don't overlap.
>
> LLVM has enabled the GlobalMerge pass for PowerPC[1] and RISC-V[2] by
> default since version 20. Let's handle DW_OP_plus_uconst operations in
> DW_AT_location attributes correctly, to ensure correct BTF can be
> produced for LLVM-built kernels.
>
> Fixes: a6ea527aab91 ("variable: Add ->addr member")
> Reported-by: q66 <me@q66.moe>
> Closes: https://github.com/ClangBuiltLinux/linux/issues/2089
> Link: https://github.com/llvm/llvm-project/commit/aaa37d6755e6 # [1]
> Link: https://github.com/llvm/llvm-project/commit/9d02264b03ea # [2]
> Signed-off-by: Yao Zi <ziyao@disroot.org>
> ---
>
> The problem is found by several distros building Linux kernel with LLVM
> and BTF enabled, after upgrading to LLVM 20 or later, kernels built for
> RISC-V and PowerPC issue errors like
>
> [    1.296358] BPF:      type_id=4457 offset=4224 size=8
> [    1.296767] BPF:
> [    1.296919] BPF: Invalid offset
>
> on startup, and loading any modules fails with -EINVAL unless
> CONFIG_MODULE_ALLOW_BTF_MISMATCH is turned on,
>
> # insmod tun.ko
> [   12.892421] failed to validate module [tun] BTF: -22
> [   12.936971] failed to validate module [tun] BTF: -22
> insmod: can't insert 'tun.ko': Invalid argument
>
> By comparing DWARF dump and BTF dump, it's found BTF contains symbols
> with the same offset,
>
> type_id=4148 offset=4208 size=8 (VAR 'vector_misaligned_access')
> type_id=4147 offset=4208 size=8 (VAR 'misaligned_access_speed')
>
> while the same symbols are described with different DW_AT_location
> attributes,
>
> 0x0011ade7:   DW_TAG_variable
>                  DW_AT_name      ("misaligned_access_speed")
>                  DW_AT_type      (0x0011adf2 "long")
>                  DW_AT_decl_file ("...")
>                  DW_AT_external  (true)
>                  DW_AT_decl_line (24)
>                  DW_AT_location  (DW_OP_addrx 0x0)
>
> ...
>
> 0x0011adf6:   DW_TAG_variable
>                  DW_AT_name      ("vector_misaligned_access")
>                  DW_AT_type      (0x0011adf2 "long")
>                  DW_AT_external  (true)
>                  DW_AT_decl_file ("...")
>                  DW_AT_decl_line (25)
>                  DW_AT_location  (DW_OP_addrx 0x0, DW_OP_plus_uconst 0x8)
>
> For more detailed analysis and kernel config for reproducing the issue,
> please refer to the Closes link. Thanks for your time and review.
>
>   dwarf_loader.c | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/dwarf_loader.c b/dwarf_loader.c
> index 79be3f516a26..635015676389 100644
> --- a/dwarf_loader.c
> +++ b/dwarf_loader.c
> @@ -708,6 +708,11 @@ static enum vscope dwarf__location(Dwarf_Die *die, uint64_t *addr, struct locati
>   		case DW_OP_addrx:
>   			scope = VSCOPE_GLOBAL;
>   			*addr = expr[0].number;
> +
> +			if (location->exprlen == 2 &&
> +			    expr[1].atom == DW_OP_plus_uconst)
> +				addr += expr[1].number;

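This does not work. 'addr' is a pointer parameter, so 'addr += expr[1].number'
only advances the local copy of the pointer. The value it points to is unchanged
and nothing is passed back to the caller, so the new lines are effectively a no-op.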
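
To illustrate the pointer-versus-pointee difference, here is a minimal standalone
sketch. The helper names bump_pointer() and bump_value() are made up for this
example and are not pahole code; the caller is assumed to pass a pointer into an
array large enough that the pointer arithmetic stays in bounds.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the patch's 'addr += n'.
 * Only the local copy of the pointer moves; the caller's
 * variable and the value it points to are untouched.
 */
static void bump_pointer(uint64_t *addr, uint64_t n)
{
	addr += n;	/* pointer arithmetic on the local copy */
	(void)addr;	/* the moved pointer is never used again */
}

/*
 * Hypothetical helper that writes through the pointer,
 * so the caller does see the change.
 */
static void bump_value(uint64_t *addr, uint64_t n)
{
	*addr += n;
}
```

Note that writing through the pointer would still add the offset to the
.debug_addr index rather than to the resolved address, which is why a separate
output parameter is suggested.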

I think we need to add an additional parameter to pass the 'expr[1].number' back
to the caller, e.g.,

static enum vscope dwarf__location(Dwarf_Die *die, uint64_t *addr, uint32_t *offset, struct location *location) { ... }

and, in the hunk above,

    *offset = expr[1].number;

Now the caller has the following information:
   . *addr, the index into .debug_addr
   . the offset to add to the address stored at that .debug_addr entry
and the final address will be debug_addr[*addr] + offset.
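
A rough standalone sketch of that flow follows. It is not pahole code:
'struct op', decode_location() and resolve_address() are hypothetical stand-ins
for libdw's expression type and the loader's logic, and only the two opcode
values are taken from DWARF 5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode values as defined by DWARF 5. */
enum { OP_PLUS_UCONST = 0x23, OP_ADDRX = 0xa1 };

/* Hypothetical stand-in for one operation of a location expression. */
struct op {
	uint8_t atom;
	uint64_t number;
};

/*
 * Returns 1 when the expression starts with OP_ADDRX, 0 otherwise.
 * The .debug_addr index and the optional OP_PLUS_UCONST operand are
 * reported through separate output parameters.
 */
static int decode_location(const struct op *expr, size_t exprlen,
			   uint64_t *addr_index, uint32_t *offset)
{
	assert(addr_index != NULL && offset != NULL);

	*offset = 0;
	if (exprlen == 0 || expr[0].atom != OP_ADDRX)
		return 0;

	*addr_index = expr[0].number;
	if (exprlen == 2 && expr[1].atom == OP_PLUS_UCONST)
		*offset = (uint32_t)expr[1].number;
	return 1;
}

/* Caller side: look up the base address, then apply the offset. */
static uint64_t resolve_address(const uint64_t *debug_addr,
				uint64_t addr_index, uint32_t offset)
{
	return debug_addr[addr_index] + offset;
}
```

With a one-entry table whose entry 0 holds 0x1000, the expression
(DW_OP_addrx 0x0) resolves to 0x1000 and
(DW_OP_addrx 0x0, DW_OP_plus_uconst 0x8) resolves to 0x1008, so the two merged
variables no longer collide.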


> +
>   			break;
>   		case DW_OP_reg1 ... DW_OP_reg31:
>   		case DW_OP_breg0 ... DW_OP_breg31:

