From: Yonghong Song <yonghong.song@linux.dev>
To: Yao Zi <ziyao@disroot.org>, Alan Maguire <alan.maguire@oracle.com>
Cc: dwarves@vger.kernel.org, bpf@vger.kernel.org, q66 <me@q66.moe>
Subject: Re: [PATCH dwarves] dwarf_loader: Handle DW_AT_location attrs containing DW_OP_plus_uconst
Date: Wed, 3 Dec 2025 16:46:20 -0800
Message-ID: <a3f82302-09d2-45e1-a30a-38a32ddbf947@linux.dev>
In-Reply-To: <20251130032113.4938-2-ziyao@disroot.org>

On 11/29/25 7:21 PM, Yao Zi wrote:
> LLVM has a GlobalMerge pass, which tries to group multiple global
> variables together and address them through a single register with
> offsets coded in instructions, to reduce register pressure. The address
> of a symbol transformed by the pass may be represented by a DWARF
> expression consisting of DW_OP_addrx and DW_OP_plus_uconst, which
> naturally matches the way a merged variable is addressed.
>
> However, our dwarf_loader currently ignores anything but the first
> operation in the location expression, including the DW_OP_plus_uconst
> atom, which appears as the second operation in this case. This could
> result in broken BTF information produced by pahole, where several
> merged symbols are given the same offset even though they don't overlap.
>
> LLVM has enabled the GlobalMerge pass for PowerPC[1] and RISC-V[2] by
> default since version 20. Let's handle DW_OP_plus_uconst operations in
> DW_AT_location attributes correctly to ensure correct BTF can be
> produced for LLVM-built kernels.
>
> Fixes: a6ea527aab91 ("variable: Add ->addr member")
> Reported-by: q66 <me@q66.moe>
> Closes: https://github.com/ClangBuiltLinux/linux/issues/2089
> Link: https://github.com/llvm/llvm-project/commit/aaa37d6755e6 # [1]
> Link: https://github.com/llvm/llvm-project/commit/9d02264b03ea # [2]
> Signed-off-by: Yao Zi <ziyao@disroot.org>
> ---
>
> The problem was found by several distros building the Linux kernel with
> LLVM and BTF enabled. After upgrading to LLVM 20 or later, kernels built
> for RISC-V and PowerPC issue errors like
>
> [ 1.296358] BPF: type_id=4457 offset=4224 size=8
> [ 1.296767] BPF:
> [ 1.296919] BPF: Invalid offset
>
> on startup, and loading any modules fails with -EINVAL unless
> CONFIG_MODULE_ALLOW_BTF_MISMATCH is turned on,
>
> # insmod tun.ko
> [ 12.892421] failed to validate module [tun] BTF: -22
> [ 12.936971] failed to validate module [tun] BTF: -22
> insmod: can't insert 'tun.ko': Invalid argument
>
> By comparing DWARF dump and BTF dump, it's found BTF contains symbols
> with the same offset,
>
> type_id=4148 offset=4208 size=8 (VAR 'vector_misaligned_access')
> type_id=4147 offset=4208 size=8 (VAR 'misaligned_access_speed')
>
> while the same symbols are described with different DW_AT_location
> attributes,
>
> 0x0011ade7:   DW_TAG_variable
>                 DW_AT_name      ("misaligned_access_speed")
>                 DW_AT_type      (0x0011adf2 "long")
>                 DW_AT_decl_file ("...")
>                 DW_AT_external  (true)
>                 DW_AT_decl_line (24)
>                 DW_AT_location  (DW_OP_addrx 0x0)
>
> ...
>
> 0x0011adf6:   DW_TAG_variable
>                 DW_AT_name      ("vector_misaligned_access")
>                 DW_AT_type      (0x0011adf2 "long")
>                 DW_AT_external  (true)
>                 DW_AT_decl_file ("...")
>                 DW_AT_decl_line (25)
>                 DW_AT_location  (DW_OP_addrx 0x0, DW_OP_plus_uconst 0x8)
>
> For more detailed analysis and kernel config for reproducing the issue,
> please refer to the Closes link. Thanks for your time and review.
>
> dwarf_loader.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/dwarf_loader.c b/dwarf_loader.c
> index 79be3f516a26..635015676389 100644
> --- a/dwarf_loader.c
> +++ b/dwarf_loader.c
> @@ -708,6 +708,11 @@ static enum vscope dwarf__location(Dwarf_Die *die, uint64_t *addr, struct locati
>  		case DW_OP_addrx:
>  			scope = VSCOPE_GLOBAL;
>  			*addr = expr[0].number;
> +
> +			if (location->exprlen == 2 &&
> +			    expr[1].atom == DW_OP_plus_uconst)
> +				addr += expr[1].number;

This does not work. 'addr' is a pointer parameter, so 'addr += expr[1].number'
only advances the local copy of the pointer. The new value is never passed back
to the caller, so the change is effectively a no-op.
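
To illustrate the difference, here is a minimal standalone sketch. The helper
names bump_pointer() and bump_value() are made up for this example and do not
exist in dwarves. Note that writing through the pointer would not be the right
fix for the patch either, since *addr holds an index into .debug_addr rather
than an address.

```c
#include <stdint.h>

/*
 * Mirrors the statement in the patch: only the local copy of the pointer
 * is advanced, so the caller's variable is left untouched. (Stepping a
 * pointer more than one element past its object is also undefined
 * behavior in C, even if it is never dereferenced.)
 */
static void bump_pointer(uint64_t *addr, uint64_t delta)
{
	addr += delta;	/* pointer arithmetic on the parameter itself */
	(void)addr;	/* the adjusted pointer is simply discarded */
}

/* Writes through the pointer, so the caller's variable does change. */
static void bump_value(uint64_t *addr, uint64_t delta)
{
	*addr += delta;
}
```

Calling bump_pointer(&v, 1) leaves v as it was, while bump_value(&v, 1)
increments it.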

I think we need an additional parameter to pass 'expr[1].number' back
to the caller, e.g.,

  static enum vscope dwarf__location(Dwarf_Die *die, uint64_t *addr,
                                     uint32_t *offset, struct location *location)
  { ... }

and, in the hunk above,

  *offset = expr[1].number;

Now the caller has the following information:
  . The dereference of *addr stores the index into .debug_addr
  . The offset to add to the address found in .debug_addr
and the final address will be debug_addr[*addr] + offset.
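
A rough sketch of that split, under stated assumptions: 'struct sketch_op' is a
stand-in for elfutils' Dwarf_Op reduced to the two fields used here, and
sketch_location() / sketch_resolve() are hypothetical helpers, not the real
dwarf__location() or its caller. The opcode values are the ones assigned by the
DWARF specification; the .debug_addr table is passed in as a plain array.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode values as assigned by the DWARF specification. */
#define DW_OP_plus_uconst 0x23
#define DW_OP_addrx       0xa1

/* Hypothetical stand-in for Dwarf_Op: opcode plus its first operand. */
struct sketch_op {
	uint8_t  atom;
	uint64_t number;
};

/*
 * Hypothetical helper: reports the .debug_addr index through *addr and
 * the DW_OP_plus_uconst operand through *offset. Returns 0 on success
 * and -1 for expressions this sketch does not handle.
 */
static int sketch_location(const struct sketch_op *expr, size_t exprlen,
			   uint64_t *addr, uint32_t *offset)
{
	if (exprlen == 0 || expr[0].atom != DW_OP_addrx)
		return -1;

	*addr = expr[0].number;	/* an index into .debug_addr, not an address */
	*offset = 0;

	if (exprlen == 1)
		return 0;

	if (exprlen == 2 && expr[1].atom == DW_OP_plus_uconst) {
		*offset = (uint32_t)expr[1].number; /* written through the pointer */
		return 0;
	}

	return -1;
}

/* Caller side: combine the looked-up address with the offset. */
static uint64_t sketch_resolve(const uint64_t *debug_addr, uint64_t index,
			       uint32_t offset)
{
	return debug_addr[index] + offset;
}
```

For the 'vector_misaligned_access' expression quoted above (DW_OP_addrx 0x0,
DW_OP_plus_uconst 0x8), this would report index 0 and offset 8, and the caller
would add 8 to whatever address sits in slot 0 of .debug_addr.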
> +
>  			break;
>  		case DW_OP_reg1 ... DW_OP_reg31:
>  		case DW_OP_breg0 ... DW_OP_breg31: