From: Yonghong Song <yonghong.song@linux.dev>
To: Alan Maguire <alan.maguire@oracle.com>,
Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>,
dwarves@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
bpf@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF
Date: Mon, 15 Jun 2026 21:06:48 -0700
Message-ID: <bfa1e3a0-515f-4c08-ae53-469003d5509c@linux.dev>
In-Reply-To: <84234995-4664-4f03-81fe-38b22482a319@oracle.com>
On 6/15/26 10:17 AM, Alan Maguire wrote:
> On 23/05/2026 17:57, Yonghong Song wrote:
>> The current vmlinux BTF encoding is based on source-level signatures,
>> but the compiler may optimize a function and change its signature.
>> A user who relies on the source-level signature may get wrong results
>> from an initial implementation, and then has to find the problem and
>> work around it, e.g. with a kprobe, since kprobes do not need
>> vmlinux BTF.
>>
>> The majority of changed signatures are due to dead argument elimination.
>> The following is a more complex case. The original source signature:
>>   typedef struct {
>>           union {
>>                   void *kernel;
>>                   void __user *user;
>>           };
>>           bool is_kernel : 1;
>>   } sockptr_t;
>>   typedef sockptr_t bpfptr_t;
>>   static int map_create(union bpf_attr *attr, bpfptr_t uattr) { ... }
>> After compiler optimization, the signature becomes:
>>   static int map_create(union bpf_attr *attr, bool uattr__is_kernel) { ... }
>> In the above, uattr__is_kernel corresponds to the 'is_kernel' field of sockptr_t.
>> This naming makes it easier for developers to understand what changed.
>>
>> The new signature must follow the ABI specification, as judged from the
>> DWARF parameter locations. Otherwise, the signature should be discarded. For example,
>>
>>   0x0242f1f7: DW_TAG_subprogram
>>                 DW_AT_name ("memblock_find_in_range")
>>                 DW_AT_calling_convention (DW_CC_nocall)
>>                 DW_AT_type (0x0242decc "phys_addr_t")
>>                 ...
>>   0x0242f22e:   DW_TAG_formal_parameter
>>                   DW_AT_location (indexed (0x14a) loclist = 0x005595bc:
>>                     [0xffffffff87a000f9, 0xffffffff87a00178): DW_OP_reg5 RDI
>>                     [0xffffffff87a00178, 0xffffffff87a001be): DW_OP_reg14 R14
>>                     [0xffffffff87a001be, 0xffffffff87a001c7): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
>>                     [0xffffffff87a001c7, 0xffffffff87a00214): DW_OP_reg14 R14)
>>                   DW_AT_name ("start")
>>                   DW_AT_type (0x0242decc "phys_addr_t")
>>                   ...
>>   0x0242f239:   DW_TAG_formal_parameter
>>                   DW_AT_location (indexed (0x14b) loclist = 0x005595e6:
>>                     [0xffffffff87a000f9, 0xffffffff87a00175): DW_OP_reg4 RSI
>>                     [0xffffffff87a00175, 0xffffffff87a001b8): DW_OP_reg3 RBX
>>                     [0xffffffff87a001b8, 0xffffffff87a001c7): DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value
>>                     [0xffffffff87a001c7, 0xffffffff87a00214): DW_OP_reg3 RBX)
>>                   DW_AT_name ("end")
>>                   DW_AT_type (0x0242decc "phys_addr_t")
>>                   ...
>>   0x0242f245:   DW_TAG_formal_parameter
>>                   DW_AT_location (indexed (0x14c) loclist = 0x00559610:
>>                     [0xffffffff87a001e3, 0xffffffff87a001ef): DW_OP_breg4 RSI+0)
>>                   DW_AT_name ("size")
>>                   DW_AT_type (0x0242decc "phys_addr_t")
>>                   ...
>>   0x0242f250:   DW_TAG_formal_parameter
>>                   DW_AT_const_value (4096)
>>                   DW_AT_name ("align")
>>                   DW_AT_type (0x0242decc "phys_addr_t")
>>                   ...
>>
>> The third argument should be in RDX on x86_64, but the location says that
>> the parameter value is stored at the address 'RSI + 0'. It is not clear whether
>> the parameter value is also in RDX. So we have to discard this function from
>> vmlinux BTF to avoid an incorrect true signature.
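
To make the register check above concrete, here is a minimal sketch of
the idea, not the pahole code. The struct param_loc summary and the
helper param_in_expected_reg() are hypothetical names made up for this
illustration; the only facts used are the x86_64 SysV argument order
(RDI, RSI, RDX, RCX, R8, R9) and the matching DWARF register numbers
(5, 4, 1, 2, 8, 9).

```c
#include <assert.h>
#include <stdbool.h>

/* DWARF register numbers of the x86_64 SysV integer argument
 * registers, in argument order: RDI, RSI, RDX, RCX, R8, R9. */
static const int x86_64_arg_regs[] = { 5, 4, 1, 2, 8, 9 };

/* Hypothetical summary of a parameter's first location-list entry. */
struct param_loc {
	bool is_plain_reg;	/* true for DW_OP_regN, false for e.g. DW_OP_bregN */
	int dwarf_reg;		/* N in DW_OP_regN or DW_OP_bregN */
};

/* Return true if parameter idx (0-based) is described as living
 * directly in the register the ABI assigns to that position. */
static bool param_in_expected_reg(int idx, const struct param_loc *loc)
{
	assert(loc);
	if (idx < 0 || idx >= 6)
		return false;	/* stack-passed arguments are out of scope here */
	return loc->is_plain_reg && loc->dwarf_reg == x86_64_arg_regs[idx];
}
```

With the dump above, "start" ({ true, 5 } at index 0) and "end"
({ true, 4 } at index 1) pass, while "size" ({ false, 4 } at index 2,
i.e. DW_OP_breg4 RSI+0) fails, which is why the function is discarded.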
>>
>> For llvm, any function with
>>   DW_AT_calling_convention (DW_CC_nocall)
>> in its DWARF DW_TAG_subprogram has a changed signature.
>> I experimented with the latest bpf-next. For x86_64, there are 69103 kernel functions,
>> of which 875 have a changed signature. This patch series is intended
>> to ensure true signatures are properly represented. In the end, only 18 functions
>> cannot have true signatures due to their locations.
>>
>> For arm64, 863 kernel functions have a changed signature, and
>> 70 functions cannot have true signatures due to their locations. I checked those
>> functions, and it looks like the llvm arm64 backend is more relaxed about how it
>> computes parameter values.
>>
>> For full testing, I enabled true signature support in the kernel's scripts/Makefile.btf as follows:
>> -pahole-flags-$(call test-ge, $(pahole-ver), 130) += --btf_features=attributes
>> +pahole-flags-$(call test-ge, $(pahole-ver), 130) += --btf_features=attributes --btf_features=+true_signature
>>
>> In this patch set, Patch 1 introduces the use of DW_AT_calling_convention, which
>> precisely identifies the functions whose signature changed. This filters out the
>> majority of functions, whose signatures do not change. Patch 2 prescans
>> the parameter registers to accommodate cases where the optimization could
>> have happened but didn't. Patches 3 to 9 try to find functions with true signatures.
>> Patch 10 enables the btf encoder to properly generate BTF.
>> Patch 11 includes a few tests.
>>
>> Changelog:
>> v4 -> v5:
>> - v4: https://lore.kernel.org/bpf/20260326013144.2901265-1-yonghong.song@linux.dev/
>> - Check info.signature_changed only under clang.
>> - Fix an uninitialized variable issue (var reg_dix) for gcc.
>> v3 -> v4:
>> - v3: https://lore.kernel.org/bpf/20260320190917.1970524-1-yonghong.song@linux.dev/
>> - Add a simple prescan of parameter registers in order to get true signatures
>> for those functions where the optimization could happen but the compiler didn't do it.
>> - Do not create a new name (e.g. "uattr__is_kernel") with malloc at the parameter_reg()
>> stage. Instead, remember both "uattr" and "is_kernel" and later generate the
>> name "uattr__is_kernel" in the btf encoder.
>> - Add comments to explain how to handle parameters which may take two registers.
>> - Fix some test failures on aarch64.
>> v2 -> v3:
>> - v2: https://lore.kernel.org/bpf/20260309153215.1917033-1-yonghong.song@linux.dev/
>> - Change the tests to use the newly added test_lib.sh.
>> - Simplify how the bool variable producer_clang is obtained.
>> - Try to keep producer_clang out of dwarf_loader.c, so that the code there is
>> not split into separate clang and gcc paths.
>> v1 -> v2:
>> - v1: https://lore.kernel.org/bpf/20260305225455.1151066-1-yonghong.song@linux.dev/
>> - Add producer_clang guarding in btf_encoder. Otherwise, a gcc kernel build
>> will crash pahole.
>> - Fix an early return in parameter__reg() that skipped pthread_mutex_unlock()
>> and caused a deadlock on arm64.
>> - Add a few more places to guard with producer_clang and conf->true_signature
>> to maintain the previous behavior if not clang or if conf->true_signature is false.
>>
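
On the v3 -> v4 changelog item about deferring the name generation:
the point is to keep the two pieces ("uattr" and "is_kernel") and only
join them when the BTF parameter is emitted. A minimal sketch of that
join follows. The helper name true_sig_param_name() and the
caller-provided buffer are assumptions for illustration only, not what
the btf encoder actually does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "<param>__<field>" into buf, or just
 * "<param>" when the parameter was not reduced to a single field.
 * Returns 0 on success, -1 if the name does not fit in buf. */
static int true_sig_param_name(char *buf, size_t size,
			       const char *param, const char *field)
{
	int n;

	assert(buf && param);
	if (!field)
		n = snprintf(buf, size, "%s", param);
	else
		n = snprintf(buf, size, "%s__%s", param, field);
	/* snprintf() returns the length it wanted to write. */
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Called with ("uattr", "is_kernel") this yields "uattr__is_kernel", and
no allocation is needed while the DWARF parameters are being loaded.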
> In order to be a bit more concrete about a proposed way forward, I'm thinking something
> along the lines of the attached patch (which should apply on top of this whole series);
> rather than doing prescans etc, we record param info as we go as we do today, and once done
> compute true signature info. This saves some complexity around prescan of params etc, so
> is a bit more consistent with what's there today. Ideally we'd be able to enhance DWARF
> processing for both cases (you have some great improvements in that area in this series),
> and unify the representation of modified signatures where feasible. Let me know what you think.
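
To check my understanding of "record as we go, then compute once
done", here is a rough sketch of that two-phase shape. This is only my
reading, not the attached patch: struct parm_rec,
compute_true_signature() and PARM_DROPPED are hypothetical names, and
the sketch assumes x86_64 and that the surviving parameters are passed
in the usual argument registers in order.

```c
#include <assert.h>
#include <stddef.h>

#define PARM_DROPPED (-1)	/* constant value or no usable entry location */

/* DWARF numbers for RDI, RSI, RDX, RCX, R8, R9 (x86_64 SysV order). */
static const int arg_regs[] = { 5, 4, 1, 2, 8, 9 };
#define NR_ARG_REGS (sizeof(arg_regs) / sizeof(arg_regs[0]))

/* Phase 1 fills one record per source-level parameter. */
struct parm_rec {
	const char *name;
	int entry_reg;	/* DWARF register at function entry, or PARM_DROPPED */
	int kept;	/* set in phase 2: 1 if part of the true signature */
};

/* Phase 2: once all parameters are recorded, decide which ones stay.
 * Returns the number of kept parameters, or -1 if a surviving
 * parameter is not in the next expected argument register, in which
 * case the function should be left out. */
static int compute_true_signature(struct parm_rec *parms, size_t nr)
{
	size_t next = 0;

	assert(parms || nr == 0);
	for (size_t i = 0; i < nr; i++) {
		parms[i].kept = 0;
		if (parms[i].entry_reg == PARM_DROPPED)
			continue;	/* dead argument, skip it */
		if (next >= NR_ARG_REGS || parms[i].entry_reg != arg_regs[next])
			return -1;
		parms[i].kept = 1;
		next++;
	}
	return (int)next;
}
```

The nice property is that phase 1 stays the same as what the loader
does today, and all the true signature decisions are in one place.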
Thanks Alan! I agree that we should have as much common code as possible for clang and gcc.
I will check and try to understand your new patch. If everything is fine, I will incorporate
this patch into the patch series.
Yonghong
>
> Thanks!
>
> Alan
Thread overview:
2026-05-23 16:57 [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Yonghong Song
2026-05-23 16:57 ` [PATCH dwarves v5 01/11] dwarf_loader: Reduce parameter checking with clang DW_AT_calling_convention attr Yonghong Song
2026-06-11 9:15 ` Alan Maguire
2026-05-23 16:57 ` [PATCH dwarves v5 02/11] dwarf_loader: Prescan all parameters with expected registers Yonghong Song
2026-05-23 16:57 ` [PATCH dwarves v5 03/11] dwarf_loader: Handle signatures with dead arguments Yonghong Song
2026-05-23 16:57 ` [PATCH dwarves v5 04/11] dwarf_loader: Refactor initial ret -1 to be macro PARM_DEFAULT_FAIL Yonghong Song
2026-05-23 16:57 ` [PATCH dwarves v5 05/11] dwarf_laoder: Handle locations with DW_OP_fbreg Yonghong Song
2026-05-23 16:57 ` [PATCH dwarves v5 06/11] dwarf_loader: Change exprlen checking condition in parameter__reg() Yonghong Song
2026-05-23 16:57 ` [PATCH dwarves v5 07/11] dwarf_loader: Detect optimized parameters with locations having constant values Yonghong Song
2026-05-23 16:57 ` [PATCH dwarves v5 08/11] dwarf_loader: Check whether two-reg parameter actually use two regs or not Yonghong Song
2026-05-23 16:58 ` [PATCH dwarves v5 09/11] dwarf_loader: Handle expression lists Yonghong Song
2026-05-23 16:58 ` [PATCH dwarves v5 10/11] btf_encoder: Handle optimized parameter properly Yonghong Song
2026-06-11 9:08 ` Alan Maguire
2026-05-23 16:58 ` [PATCH dwarves v5 11/11] tests: Add a few clang true signature tests Yonghong Song
2026-06-15 17:17 ` [PATCH dwarves v5 00/11] pahole: Encode true signatures in kernel BTF Alan Maguire
2026-06-16 4:06 ` Yonghong Song [this message]