public inbox for bpf@vger.kernel.org
From: Yonghong Song <yonghong.song@linux.dev>
To: Alan Maguire <alan.maguire@oracle.com>,
	Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>,
	dwarves@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	bpf@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH dwarves v4 00/11] pahole: Encode true signatures in kernel BTF
Date: Fri, 27 Mar 2026 12:38:27 -0700
Message-ID: <8424ac77-a469-4064-bfca-542d671d5ea6@linux.dev>
In-Reply-To: <c2143369-7b31-45d2-9c34-ae311582e074@oracle.com>



On 3/27/26 9:02 AM, Alan Maguire wrote:
> On 26/03/2026 01:31, Yonghong Song wrote:
>> Current vmlinux BTF encoding is based on source-level signatures.
>> But the compiler may optimize a function and change its signature.
>> If users rely on the source-level signature, their initial implementation
>> may produce wrong results, and they then need to find out what the
>> problem is and work around it, e.g. through kprobe, since kprobe does not
>> need vmlinux BTF.
>>
>> The majority of changed signatures are due to dead argument elimination.
>> The following is a more complex one. The original source signature:
>>    typedef struct {
>>          union {
>>                  void            *kernel;
>>                  void __user     *user;
>>          };
>>          bool            is_kernel : 1;
>>    } sockptr_t;
>>    typedef sockptr_t bpfptr_t;
>>    static int map_create(union bpf_attr *attr, bpfptr_t uattr) { ... }
>> After compiler optimization, the signature becomes:
>>    static int map_create(union bpf_attr *attr, bool uattr__is_kernel) { ... }
>> In the above, uattr__is_kernel corresponds to 'is_kernel' field in sockptr_t.
>> This makes it easier for developers to understand what changed.
>>
>> The new signature needs to properly follow ABI specification based on
>> locations. Otherwise, that signature should be discarded. For example,
>>
>>      0x0242f1f7:   DW_TAG_subprogram
>>                      DW_AT_name      ("memblock_find_in_range")
>>                      DW_AT_calling_convention        (DW_CC_nocall)
>>                      DW_AT_type      (0x0242decc "phys_addr_t")
>>                      ...
>>      0x0242f22e:     DW_TAG_formal_parameter
>>                        DW_AT_location        (indexed (0x14a) loclist = 0x005595bc:
>>                           [0xffffffff87a000f9, 0xffffffff87a00178): DW_OP_reg5 RDI
>>                           [0xffffffff87a00178, 0xffffffff87a001be): DW_OP_reg14 R14
>>                           [0xffffffff87a001be, 0xffffffff87a001c7): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
>>                           [0xffffffff87a001c7, 0xffffffff87a00214): DW_OP_reg14 R14)
>>                        DW_AT_name    ("start")
>>                        DW_AT_type    (0x0242decc "phys_addr_t")
>>                        ...
>>      0x0242f239:     DW_TAG_formal_parameter
>>                        DW_AT_location        (indexed (0x14b) loclist = 0x005595e6:
>>                           [0xffffffff87a000f9, 0xffffffff87a00175): DW_OP_reg4 RSI
>>                           [0xffffffff87a00175, 0xffffffff87a001b8): DW_OP_reg3 RBX
>>                           [0xffffffff87a001b8, 0xffffffff87a001c7): DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value
>>                           [0xffffffff87a001c7, 0xffffffff87a00214): DW_OP_reg3 RBX)
>>                        DW_AT_name    ("end")
>>                        DW_AT_type    (0x0242decc "phys_addr_t")
>>                        ...
>>      0x0242f245:     DW_TAG_formal_parameter
>>                        DW_AT_location        (indexed (0x14c) loclist = 0x00559610:
>>                           [0xffffffff87a001e3, 0xffffffff87a001ef): DW_OP_breg4 RSI+0)
>>                        DW_AT_name    ("size")
>>                        DW_AT_type    (0x0242decc "phys_addr_t")
>>                        ...
>>      0x0242f250:     DW_TAG_formal_parameter
>>                        DW_AT_const_value     (4096)
>>                        DW_AT_name    ("align")
>>                        DW_AT_type    (0x0242decc "phys_addr_t")
>>                        ...
>>
>> The third argument should correspond to RDX on x86_64. But the location suggests that
>> the parameter value is stored at the address 'RSI + 0'. It is not clear whether
>> the parameter value is stored in RDX or not. So we have to discard this function in
>> vmlinux BTF to avoid incorrect true signatures.
>>
>> For llvm, any function having
>>    DW_AT_calling_convention        (DW_CC_nocall)
>> in its dwarf DW_TAG_subprogram has a changed signature.
>> I experimented with the latest bpf-next. For x86_64, there are 69103 kernel functions
>> and 875 kernel functions have a changed signature. This series of patches is intended
>> to ensure true signatures are properly represented. In the end, only 18 functions
>> cannot have true signatures because of their locations.
>>
>> For arm64, there are 863 kernel functions with a changed signature, and
>> 70 functions cannot have true signatures because of their locations. I checked those
>> functions, and it looks like the llvm arm64 backend is more relaxed in how it computes
>> parameter values.
>>
>> For full testing, I enabled true signature support in kernel scripts/Makefile.btf like below:
>>    -pahole-flags-$(call test-ge, $(pahole-ver), 130) += --btf_features=attributes
>>    +pahole-flags-$(call test-ge, $(pahole-ver), 130) += --btf_features=attributes --btf_features=+true_signature
>>
>> For the patch set, Patch 1 introduces usage of DW_AT_calling_convention, which
>> can precisely identify which functions have a changed signature. This filters out
>> the majority of functions, whose signatures won't change. Patch 2 does a prescan
>> of parameter registers to accommodate some cases where the optimization could
>> happen but didn't. Patches 3 to 9 try to find functions with true signatures.
>> Patch 10 enables the btf encoder to properly generate BTF.
>> Patch 11 includes a few tests.
>>
> I'm looking through this now, but FYI the test run with clang kernel build
> is passing now, great work! The log [1] shows 23 additional functions for
> clang kernel builds (see the "Compare functions generated" step). Interestingly
> it shows a few additional functions for x86_64 too, I suspect a side effect of
> better handling of parameter location info, but I need to confirm:
>
> ### Compare vmlinux BTF functions generated with this change vs baseline (none means no differences).
> 1345a1346
>> void __do_notify(struct mqueue_inode_info * info);

It would be great if both this change and the baseline were shown. Take this __do_notify() function, for example.
The kernel source:
   static void __do_notify(struct mqueue_inode_info *info) { ... }

The DWARF generated by the clang23 build:

0x04442f84:   DW_TAG_subprogram
                 DW_AT_low_pc    (0xffffffff82409720)
                 DW_AT_high_pc   (0xffffffff82409db7)
                 DW_AT_frame_base        (DW_OP_call_frame_cfa, DW_OP_consts -128, DW_OP_plus)
                 DW_AT_call_all_calls    (true)
                 DW_AT_name      ("__do_notify")
                 DW_AT_decl_file ("/home/yhs/work/bpf-next/ipc/mqueue.c")
                 DW_AT_decl_line (777)
                 DW_AT_prototyped        (true)
                     
0x04442f98:     DW_TAG_formal_parameter
                   DW_AT_location        (indexed (0x19d) loclist = 0x00c7000c:
                      [0xffffffff82409725, 0xffffffff82409763): DW_OP_reg5 RDI
                      [0xffffffff82409763, 0xffffffff82409ac6): DW_OP_reg3 RBX
                      [0xffffffff82409ac6, 0xffffffff82409b01): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
                      [0xffffffff82409b01, 0xffffffff82409cdf): DW_OP_reg3 RBX
                      [0xffffffff82409cdf, 0xffffffff82409ce4): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
                      [0xffffffff82409ce4, 0xffffffff82409db7): DW_OP_reg3 RBX)
                   DW_AT_name    ("info")
                   DW_AT_decl_file       ("/home/yhs/work/bpf-next/ipc/mqueue.c")
                   DW_AT_decl_line       (777)
                   DW_AT_type    (0x0443d068 "mqueue_inode_info *")
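
For reference, the first location-list entry above (DW_OP_reg5 RDI) is the register
expected for the first integer argument under the x86_64 System V calling convention.
Below is a minimal sketch of that kind of check. The helper names are hypothetical and
are not taken from pahole, and the sketch assumes every parameter occupies exactly one
integer register (no two-register structs, no floating-point arguments).

```c
#include <stdbool.h>

/*
 * DWARF register numbers of the six x86_64 System V integer argument
 * registers, in argument order: RDI, RSI, RDX, RCX, R8, R9.
 */
static const int x86_64_arg_dwarf_regs[] = { 5, 4, 1, 2, 8, 9 };

/* Hypothetical helper: DWARF register expected for parameter 'param_idx'
 * (0-based), or -1 if the parameter would be passed on the stack. */
static int expected_dwarf_reg(unsigned int param_idx)
{
	if (param_idx >= sizeof(x86_64_arg_dwarf_regs) / sizeof(x86_64_arg_dwarf_regs[0]))
		return -1;
	return x86_64_arg_dwarf_regs[param_idx];
}

/* Hypothetical helper: does the register named by the location at function
 * entry match the register the ABI assigns to this parameter? */
static bool param_in_expected_reg(unsigned int param_idx, int entry_reg)
{
	int expected = expected_dwarf_reg(param_idx);

	return expected >= 0 && expected == entry_reg;
}
```

With this, 'info' (parameter 0, DW_OP_reg5) passes the check, while a third parameter
whose only location refers to RSI (DWARF register 4) does not.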

The function __do_notify() does not have the nocall attribute, so the signature should
be preserved as in the original source.
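
To spell out the rule being applied here, a minimal sketch follows. The struct and
function names are hypothetical simplifications, not pahole's actual data structures.
The only fact taken from DWARF itself is the value of DW_CC_nocall (0x03), which is
defined locally so that the sketch does not depend on dwarf.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Calling-convention code as defined by the DWARF standard. */
#define DW_CC_nocall 0x03

/* Hypothetical, simplified view of a DW_TAG_subprogram DIE. */
struct subprogram_info {
	const char *name;
	bool has_calling_convention;	/* is DW_AT_calling_convention present? */
	uint8_t calling_convention;	/* valid only if the flag above is set */
};

/*
 * For clang builds, only functions marked DW_CC_nocall are candidates for
 * a changed signature. Everything else keeps its source-level signature.
 */
static bool signature_may_have_changed(const struct subprogram_info *sp)
{
	return sp->has_calling_convention &&
	       sp->calling_convention == DW_CC_nocall;
}
```

For the __do_notify() DIE above, DW_AT_calling_convention is absent, so this check
returns false and the source signature is the expected BTF output.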

> 4994a4996
>> int __vxlan_fdb_delete(struct vxlan_dev * vxlan, const unsigned char  * addr, union vxlan_addr ip, __be16 port, __be32 src_vni, __be32 vni, u32 ifindex, bool swdev_notify);

__vxlan_fdb_delete is a global function. See the signature:

int __vxlan_fdb_delete(struct vxlan_dev *vxlan,
                        const unsigned char *addr, union vxlan_addr ip,
                        __be16 port, __be32 src_vni, __be32 vni,
                        u32 ifindex, bool swdev_notify)
{ ... }

and the DWARF:

0x06c44a41:   DW_TAG_subprogram
                 DW_AT_low_pc    (0xffffffff82fa0aa0)
                 DW_AT_high_pc   (0xffffffff82fa0f9a)
                 DW_AT_frame_base        (DW_OP_reg7 RSP)
                 DW_AT_call_all_calls    (true)
                 DW_AT_name      ("__vxlan_fdb_delete")
                 DW_AT_decl_file ("/home/yhs/work/bpf-next/drivers/net/vxlan/vxlan_core.c")
                 DW_AT_decl_line (1273)
                 DW_AT_prototyped        (true)
                 DW_AT_type      (0x06c21271 "int")
                 DW_AT_external  (true)

So there should be no difference. Again, it would be great to show both the baseline and this patch set.
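
One way to report both sides is to look up each function in the two generated
prototype lists and print the pair. This is only a rough sketch: the function names
and the enum are hypothetical, and it assumes each list is an array of one-line
prototypes in the format shown in the CI output.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum proto_status { PROTO_ABSENT, PROTO_SAME, PROTO_ADDED, PROTO_REMOVED, PROTO_CHANGED };

/* Return the prototype in 'protos' that declares function 'name', or NULL. */
static const char *find_proto(const char *const *protos, size_t n, const char *name)
{
	size_t len = strlen(name);

	for (size_t i = 0; i < n; i++) {
		const char *p = protos[i];

		while ((p = strstr(p, name)) != NULL) {
			/* The match must start an identifier and be followed by '('. */
			int starts_ident = p == protos[i] ||
				!(isalnum((unsigned char)p[-1]) || p[-1] == '_');

			if (starts_ident && p[len] == '(')
				return protos[i];
			p++;
		}
	}
	return NULL;
}

/* Look up 'name' in both lists and hand back both prototypes for reporting. */
static enum proto_status compare_proto(const char *const *base, size_t nbase,
				       const char *const *patched, size_t npatched,
				       const char *name,
				       const char **before, const char **after)
{
	*before = find_proto(base, nbase, name);
	*after = find_proto(patched, npatched, name);

	if (!*before && !*after)
		return PROTO_ABSENT;
	if (!*before)
		return PROTO_ADDED;
	if (!*after)
		return PROTO_REMOVED;
	return strcmp(*before, *after) == 0 ? PROTO_SAME : PROTO_CHANGED;
}
```

Printing 'before' and 'after' side by side for every entry that is not PROTO_SAME
would make it clear whether a function is newly emitted or only changed.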

> 15185a15188
>> int devlink_nl_param_value_put(struct sk_buff * msg, enum devlink_param_type type, int nla_type, union devlink_param_value val, bool flag_as_u8);
> 18862a18866
>> struct cpio_data find_microcode_in_initrd(const char  * path);
> 48240a48245
>> struct dst_entry * xfrm6_dst_lookup(const struct xfrm_dst_lookup_params  * params);
> I'll push the CI change to enable clang builds so we have it by default from now on.
>
> [1] https://github.com/alan-maguire/dwarves/actions/runs/23638670169
>   

Sounds good. Thanks!


Thread overview: 17+ messages
2026-03-26  1:31 [PATCH dwarves v4 00/11] pahole: Encode true signatures in kernel BTF Yonghong Song
2026-03-26  1:31 ` [PATCH dwarves v4 01/11] dwarf_loader: Reduce parameter checking with clang DW_AT_calling_convention attr Yonghong Song
2026-03-30  8:31   ` Alan Maguire
2026-03-26  1:31 ` [PATCH dwarves v4 02/11] dwarf_loader: Prescan all parameters with expected registers Yonghong Song
2026-03-26  1:31 ` [PATCH dwarves v4 03/11] dwarf_loader: Handle signatures with dead arguments Yonghong Song
2026-03-30 10:13   ` Alan Maguire
2026-03-26  1:32 ` [PATCH dwarves v4 04/11] dwarf_loader: Refactor initial ret -1 to be macro PARM_DEFAULT_FAIL Yonghong Song
2026-03-26  1:32 ` [PATCH dwarves v4 05/11] dwarf_laoder: Handle locations with DW_OP_fbreg Yonghong Song
2026-03-26  1:32 ` [PATCH dwarves v4 06/11] dwarf_loader: Change exprlen checking condition in parameter__reg() Yonghong Song
2026-03-26  1:32 ` [PATCH dwarves v4 07/11] dwarf_loader: Detect optimized parameters with locations having constant values Yonghong Song
2026-03-26  1:32 ` [PATCH dwarves v4 08/11] dwarf_loader: Check whether two-reg parameter actually use two regs or not Yonghong Song
2026-03-26  1:32 ` [PATCH dwarves v4 09/11] dwarf_loader: Handle expression lists Yonghong Song
2026-03-26  1:33 ` [PATCH dwarves v4 10/11] btf_encoder: Handle optimized parameter properly Yonghong Song
2026-03-26  1:33 ` [PATCH dwarves v4 11/11] tests: Add a few clang true signature tests Yonghong Song
2026-03-27 16:02 ` [PATCH dwarves v4 00/11] pahole: Encode true signatures in kernel BTF Alan Maguire
2026-03-27 19:38   ` Yonghong Song [this message]
2026-03-30  9:56     ` Alan Maguire
