From: Yonghong Song <yonghong.song@linux.dev>
To: Yao Zi <ziyao@disroot.org>, Alan Maguire <alan.maguire@oracle.com>
Cc: dwarves@vger.kernel.org, bpf@vger.kernel.org, q66 <me@q66.moe>
Subject: Re: [PATCH dwarves] dwarf_loader: Handle DW_AT_location attrs containing DW_OP_plus_uconst
Date: Wed, 3 Dec 2025 16:46:20 -0800
Message-ID: <a3f82302-09d2-45e1-a30a-38a32ddbf947@linux.dev>
In-Reply-To: <20251130032113.4938-2-ziyao@disroot.org>



On 11/29/25 7:21 PM, Yao Zi wrote:
> LLVM has a GlobalMerge pass, which tries to group multiple global
> variables together and address them through a single register with
> offsets encoded in instructions, to reduce register pressure. The address
> of a symbol transformed by the pass may be represented by a DWARF
> expression consisting of DW_OP_addrx and DW_OP_plus_uconst, which
> naturally matches the way a merged variable is addressed.
>
> However, our dwarf_loader currently ignores everything but the first
> operation in the location expression, including the DW_OP_plus_uconst
> atom, which appears as the second operation in this case. This can result
> in broken BTF information produced by pahole, where several merged
> symbols are given the same offset even though they don't overlap.
>
> LLVM has enabled the GlobalMerge pass for PowerPC[1] and RISC-V[2] by
> default since version 20. Let's handle DW_OP_plus_uconst operations in
> DW_AT_location attributes correctly so that correct BTF can be
> produced for LLVM-built kernels.
>
> Fixes: a6ea527aab91 ("variable: Add ->addr member")
> Reported-by: q66 <me@q66.moe>
> Closes: https://github.com/ClangBuiltLinux/linux/issues/2089
> Link: https://github.com/llvm/llvm-project/commit/aaa37d6755e6 # [1]
> Link: https://github.com/llvm/llvm-project/commit/9d02264b03ea # [2]
> Signed-off-by: Yao Zi <ziyao@disroot.org>
> ---
>
> The problem was found by several distros building the Linux kernel with
> LLVM and BTF enabled. After upgrading to LLVM 20 or later, kernels built
> for RISC-V and PowerPC issue errors like
>
> [    1.296358] BPF:      type_id=4457 offset=4224 size=8
> [    1.296767] BPF:
> [    1.296919] BPF: Invalid offset
>
> on startup, and loading any modules fails with -EINVAL unless
> CONFIG_MODULE_ALLOW_BTF_MISMATCH is turned on,
>
> # insmod tun.ko
> [   12.892421] failed to validate module [tun] BTF: -22
> [   12.936971] failed to validate module [tun] BTF: -22
> insmod: can't insert 'tun.ko': Invalid argument
>
> Comparing the DWARF dump with the BTF dump shows that BTF contains
> symbols with the same offset,
>
> type_id=4148 offset=4208 size=8 (VAR 'vector_misaligned_access')
> type_id=4147 offset=4208 size=8 (VAR 'misaligned_access_speed')
>
> while the same symbols are described with different DW_AT_location
> attributes,
>
> 0x0011ade7:   DW_TAG_variable
>                  DW_AT_name      ("misaligned_access_speed")
>                  DW_AT_type      (0x0011adf2 "long")
>                  DW_AT_decl_file ("...")
>                  DW_AT_external  (true)
>                  DW_AT_decl_line (24)
>                  DW_AT_location  (DW_OP_addrx 0x0)
>
> ...
>
> 0x0011adf6:   DW_TAG_variable
>                  DW_AT_name      ("vector_misaligned_access")
>                  DW_AT_type      (0x0011adf2 "long")
>                  DW_AT_external  (true)
>                  DW_AT_decl_file ("...")
>                  DW_AT_decl_line (25)
>                  DW_AT_location  (DW_OP_addrx 0x0, DW_OP_plus_uconst 0x8)
>
> For more detailed analysis and kernel config for reproducing the issue,
> please refer to the Closes link. Thanks for your time and review.
>
>   dwarf_loader.c | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/dwarf_loader.c b/dwarf_loader.c
> index 79be3f516a26..635015676389 100644
> --- a/dwarf_loader.c
> +++ b/dwarf_loader.c
> @@ -708,6 +708,11 @@ static enum vscope dwarf__location(Dwarf_Die *die, uint64_t *addr, struct locati
>   		case DW_OP_addrx:
>   			scope = VSCOPE_GLOBAL;
>   			*addr = expr[0].number;
> +
> +			if (location->exprlen == 2 &&
> +			    expr[1].atom == DW_OP_plus_uconst)
> +				addr += expr[1].number;

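This does not work. 'addr' is a pointer parameter, so 'addr += expr[1].number'
only advances the function's local copy of the pointer. Neither the caller's
pointer nor the value in '*addr' changes, so the new code is effectively a no-op.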
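
To make the point concrete, here is a small standalone sketch. The function
names are hypothetical and nothing here comes from dwarf_loader.c. It only
shows that adding to a pointer parameter leaves the caller's data alone, while
adding through the pointer does not. Note that simply switching to
'*addr += ...' would not be the right fix here either, since '*addr' holds an
index into .debug_addr rather than an address.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the shape of the patch: only the local pointer copy moves. */
static void add_offset_to_pointer(uint64_t *addr, uint64_t off)
{
	addr += off;	/* caller's pointer and the pointed-to value are untouched */
	(void)addr;	/* silence "set but not used" warnings */
}

/* Writes through the pointer, so the caller sees the new value. */
static void add_offset_to_value(uint64_t *addr, uint64_t off)
{
	*addr += off;
}

/* Hypothetical self-check; the array keeps the pointer arithmetic in bounds. */
static void demo_pointer_vs_value(void)
{
	uint64_t slots[16] = { 0x1000 };

	add_offset_to_pointer(&slots[0], 8);
	assert(slots[0] == 0x1000);	/* unchanged: the update was lost */

	add_offset_to_value(&slots[0], 8);
	assert(slots[0] == 0x1008);	/* changed: the update reached the caller */
}
```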

I think we need an additional parameter to pass 'expr[1].number' back to the
caller, e.g.,

static enum vscope dwarf__location(Dwarf_Die *die, uint64_t *addr, uint32_t *offset, struct location *location) { ... }

and, in the hunk above,

    *offset = expr[1].number;

Now the caller has the following information:
   . '*addr' holds the index into .debug_addr
   . '*offset' holds the offset to add to the address stored at that index
and the final address will be debug_addr[*addr] + *offset.
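
A rough, self-contained sketch of that idea follows. It deliberately avoids
libdw: 'struct sketch_op', 'sketch_location()', 'sketch_resolve()' and the
'debug_addr' array are hypothetical stand-ins for illustration, and the
addresses are made up. Only the two opcode values are taken from the DWARF 5
specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode values as assigned by the DWARF 5 specification. */
enum { SKETCH_OP_plus_uconst = 0x23, SKETCH_OP_addrx = 0xa1 };

/* Simplified stand-in for one decoded location operation. */
struct sketch_op {
	uint8_t  atom;
	uint64_t number;
};

/*
 * Returns 1 and fills *index and *offset when the expression is
 * DW_OP_addrx, optionally followed by DW_OP_plus_uconst; returns 0 otherwise.
 */
static int sketch_location(const struct sketch_op *expr, size_t exprlen,
			   uint64_t *index, uint32_t *offset)
{
	if (exprlen == 0 || expr[0].atom != SKETCH_OP_addrx)
		return 0;

	*index = expr[0].number;
	*offset = 0;
	if (exprlen == 2 && expr[1].atom == SKETCH_OP_plus_uconst)
		*offset = (uint32_t)expr[1].number;
	return 1;
}

/* Final address: the .debug_addr entry plus the constant offset. */
static uint64_t sketch_resolve(const uint64_t *debug_addr, uint64_t index,
			       uint32_t offset)
{
	return debug_addr[index] + offset;
}

/* Two merged globals sharing .debug_addr entry 0 (made-up base address). */
static void demo_merged_globals(void)
{
	const uint64_t debug_addr[] = { 0x1000 };
	const struct sketch_op first[]  = { { SKETCH_OP_addrx, 0 } };
	const struct sketch_op second[] = { { SKETCH_OP_addrx, 0 },
					    { SKETCH_OP_plus_uconst, 8 } };
	uint64_t index;
	uint32_t offset;

	assert(sketch_location(first, 1, &index, &offset));
	assert(sketch_resolve(debug_addr, index, offset) == 0x1000);

	assert(sketch_location(second, 2, &index, &offset));
	assert(sketch_resolve(debug_addr, index, offset) == 0x1008);
}
```

Keeping the index and the offset separate means the caller can still look up
the .debug_addr entry first and apply the offset afterwards, so the two merged
variables no longer resolve to the same address.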


> +
>   			break;
>   		case DW_OP_reg1 ... DW_OP_reg31:
>   		case DW_OP_breg0 ... DW_OP_breg31:


Thread overview: 4+ messages
2025-11-30  3:21 [PATCH dwarves] dwarf_loader: Handle DW_AT_location attrs containing DW_OP_plus_uconst Yao Zi
2025-12-04  0:46 ` Yonghong Song [this message]
2025-12-13  8:20   ` Yao Zi
2025-12-17  4:12     ` Yonghong Song
