From: Yonghong Song <yonghong.song@linux.dev>
To: Alan Maguire <alan.maguire@oracle.com>, Jiri Olsa <olsajiri@gmail.com>
Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>,
	dwarves@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	bpf@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH dwarves v7 0/5] pahole: Encode true signatures in kernel BTF
Date: Tue, 23 Jun 2026 09:49:35 -0700
Message-ID: <60d2b741-b520-4f63-99ea-5a5e9825fcc5@linux.dev>
In-Reply-To: <b1b73cef-f25a-443d-8557-4cfe92938b46@oracle.com>



On 6/23/26 9:02 AM, Alan Maguire wrote:
> On 23/06/2026 14:11, Alan Maguire wrote:
>> On 23/06/2026 13:28, Jiri Olsa wrote:
>>> On Mon, Jun 22, 2026 at 09:07:04PM -0700, Yonghong Song wrote:
>>>> The current vmlinux BTF encoding is based on source-level signatures,
>>>> but the compiler may optimize a function and change its signature.
>>>> A user who relies on the source-level signature may get wrong results
>>>> at first, and then needs to work out what the problem is and work
>>>> around it, e.g. through kprobe, since kprobe does not need vmlinux BTF.
>>>>
>>>> Majority of changed signatures are due to dead argument elimination.
>>>> The following is a more complex one. The original source signature:
>>>>    typedef struct {
>>>>          union {
>>>>                  void            *kernel;
>>>>                  void __user     *user;
>>>>          };
>>>>          bool            is_kernel : 1;
>>>>    } sockptr_t;
>>>>    typedef sockptr_t bpfptr_t;
>>>>    static int map_create(union bpf_attr *attr, bpfptr_t uattr) { ... }
>>>> After compiler optimization, the signature becomes:
>>>>    static int map_create(union bpf_attr *attr, bool uattr__is_kernel) { ... }
>>>> In the above, uattr__is_kernel corresponds to 'is_kernel' field in sockptr_t.
>>>> This makes it easier for developers to understand what changed.
>>>>
>>>> The new signature needs to properly follow ABI specification based on
>>>> locations. Otherwise, that signature should be discarded. For example,
>>>>
>>>>      0x0242f1f7:   DW_TAG_subprogram
>>>>                      DW_AT_name      ("memblock_find_in_range")
>>>>                      DW_AT_calling_convention        (DW_CC_nocall)
>>>>                      DW_AT_type      (0x0242decc "phys_addr_t")
>>>>                      ...
>>>>      0x0242f22e:     DW_TAG_formal_parameter
>>>>                        DW_AT_location        (indexed (0x14a) loclist = 0x005595bc:
>>>>                           [0xffffffff87a000f9, 0xffffffff87a00178): DW_OP_reg5 RDI
>>>>                           [0xffffffff87a00178, 0xffffffff87a001be): DW_OP_reg14 R14
>>>>                           [0xffffffff87a001be, 0xffffffff87a001c7): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
>>>>                           [0xffffffff87a001c7, 0xffffffff87a00214): DW_OP_reg14 R14)
>>>>                        DW_AT_name    ("start")
>>>>                        DW_AT_type    (0x0242decc "phys_addr_t")
>>>>                        ...
>>>>      0x0242f239:     DW_TAG_formal_parameter
>>>>                        DW_AT_location        (indexed (0x14b) loclist = 0x005595e6:
>>>>                           [0xffffffff87a000f9, 0xffffffff87a00175): DW_OP_reg4 RSI
>>>>                           [0xffffffff87a00175, 0xffffffff87a001b8): DW_OP_reg3 RBX
>>>>                           [0xffffffff87a001b8, 0xffffffff87a001c7): DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value
>>>>                           [0xffffffff87a001c7, 0xffffffff87a00214): DW_OP_reg3 RBX)
>>>>                        DW_AT_name    ("end")
>>>>                        DW_AT_type    (0x0242decc "phys_addr_t")
>>>>                        ...
>>>>      0x0242f245:     DW_TAG_formal_parameter
>>>>                        DW_AT_location        (indexed (0x14c) loclist = 0x00559610:
>>>>                           [0xffffffff87a001e3, 0xffffffff87a001ef): DW_OP_breg4 RSI+0)
>>>>                        DW_AT_name    ("size")
>>>>                        DW_AT_type    (0x0242decc "phys_addr_t")
>>>>                        ...
>>>>      0x0242f250:     DW_TAG_formal_parameter
>>>>                        DW_AT_const_value     (4096)
>>>>                        DW_AT_name    ("align")
>>>>                        DW_AT_type    (0x0242decc "phys_addr_t")
>>>>                        ...
>>>>
>>>> The third argument should correspond to RDX on x86_64, but the location suggests that
>>>> the parameter value is stored at the address 'RSI + 0'. It is not clear whether
>>>> the parameter value is stored in RDX or not, so we have to discard this function in
>>>> vmlinux BTF to avoid incorrect true signatures.
>>>>
>>>> For llvm, any function having
>>>>    DW_AT_calling_convention        (DW_CC_nocall)
>>>> in its DWARF DW_TAG_subprogram has had its signature changed.
>>>> I experimented with the latest bpf-next. For x86_64, there are 69103 kernel functions,
>>>> of which 875 have a changed signature. This series of patches is intended
>>>> to ensure true signatures are properly represented. In the end, only 20 functions
>>>> cannot have true signatures due to their locations.
>>> hi,
>>> I tried to get the numbers from my setup and noticed that some new
>>> functions were included in BTF compared to the current version
>>> (functions diff attached below)
>>>
>>> like for "arp_process" function the current pahole gives me:
>>>
>>>    arp_process : skipping BTF encoding of function due to unexpected register usage for parameter
>>>
>>> but it's included in BTF generated with the new pahole.
>>>
>>> in addition to your explanation above also one of the commit says:
>>>
>>>       - a parameter with no location, a constant value, or (for non-clang) no
>>>         register found is marked optimized out
>>>
>>> please check below, it seems like 2nd argument of arp_process has no location,
>>> so iiuc it should not be included in BTF, right?
>>>
>>> thanks,
>>> jirka
>>>
>>>
>> thanks for catching this; it looks like we return a bit early before detecting
>> missing locations in the non-true-signature code. If you get a chance, would you
>> mind trying the attached patch to see if it fixes the problem?
>>
>> If the fix works and Yonghong is happy with it we can add it as a followup
>> and land the true signature series to save another round.
> actually sorry that patch leaked true signature partial names for gcc; updated
> patch attached.

Thanks, Alan. I think we should preserve the previous change in dwarf_loader.c
(the patch you sent to Jiri).

The following is a DWARF example from a clang-built kernel:

0x053c4bb4:   DW_TAG_subprogram
                 DW_AT_low_pc    (0xffffffff828bce90)
                 DW_AT_high_pc   (0xffffffff828bde6d)
                 DW_AT_frame_base        (DW_OP_reg6 RBP)
                 DW_AT_call_all_calls    (true)
                 DW_AT_name      ("ZSTD_buildSequencesStatistics")
                 DW_AT_decl_file ("/home/yhs/work/bpf-next/lib/zstd/compress/zstd_compress.c")
                 DW_AT_decl_line (2677)
                 DW_AT_prototyped        (true)
                 DW_AT_type      (0x053c46b5 "ZSTD_symbolEncodingTypeStats_t")
                     
0x053c4bc6:     DW_TAG_formal_parameter
                   DW_AT_location        (indexed (0x8e8) loclist = 0x00e1597f:
                      [0xffffffff828bce99, 0xffffffff828bcf3d): DW_OP_reg4 RSI
                      [0xffffffff828bcf3d, 0xffffffff828bd045): DW_OP_reg12 R12
                      [0xffffffff828bd045, 0xffffffff828bd562): DW_OP_breg7 RSP+40
                      [0xffffffff828bd5bd, 0xffffffff828bdb0e): DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value
                      [0xffffffff828bdb0e, 0xffffffff828bdbdc): DW_OP_breg7 RSP+40
                      [0xffffffff828bdbdc, 0xffffffff828bdbff): DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value
                      [0xffffffff828bdbff, 0xffffffff828bdcfd): DW_OP_breg7 RSP+40
                      [0xffffffff828bdcfd, 0xffffffff828bde6d): DW_OP_entry_value(DW_OP_reg4 RSI), DW_OP_stack_value)
                   DW_AT_name    ("seqStorePtr")
                   DW_AT_decl_file       ("/home/yhs/work/bpf-next/lib/zstd/compress/zstd_compress.c")
                   DW_AT_decl_line       (2678)
                   DW_AT_type    (0x053c4697 "const SeqStore_t *")
                 
0x053c4bd2:     DW_TAG_formal_parameter
                   DW_AT_location        (indexed (0x8e9) loclist = 0x00e159d4:
                      [0xffffffff828bce99, 0xffffffff828bcec1): DW_OP_reg1 RDX
                      [0xffffffff828bcec1, 0xffffffff828bd58b): DW_OP_breg7 RSP+72
                      [0xffffffff828bd5bd, 0xffffffff828bd5e8): DW_OP_breg7 RSP+72
                      [0xffffffff828bd61b, 0xffffffff828bd72d): DW_OP_breg7 RSP+72
                      [0xffffffff828bd76d, 0xffffffff828bd799): DW_OP_breg7 RSP+72
                      [0xffffffff828bd7cb, 0xffffffff828bd8d8): DW_OP_breg7 RSP+72
                      [0xffffffff828bd91b, 0xffffffff828bd943): DW_OP_breg7 RSP+72
                      [0xffffffff828bd974, 0xffffffff828bdafe): DW_OP_breg7 RSP+72
                      [0xffffffff828bdb0e, 0xffffffff828bde6d): DW_OP_breg7 RSP+72)
                   DW_AT_name    ("nbSeq")
                   DW_AT_decl_file       ("/home/yhs/work/bpf-next/lib/zstd/compress/zstd_compress.c")
                   DW_AT_decl_line       (2678)
                   DW_AT_type    (0x053be0a9 "size_t")
0x053c4bde:     DW_TAG_formal_parameter
                   DW_AT_location        (indexed (0x8ea) loclist = 0x00e15a36:
                      [0xffffffff828bce99, 0xffffffff828bcf0d): DW_OP_reg2 RCX
                      [0xffffffff828bcf0d, 0xffffffff828bd037): DW_OP_reg15 R15
                      [0xffffffff828bd037, 0xffffffff828bd58b): DW_OP_breg7 RSP+88
                      [0xffffffff828bd5bd, 0xffffffff828bd5e8): DW_OP_breg7 RSP+88
                      [0xffffffff828bd61b, 0xffffffff828bd72d): DW_OP_breg7 RSP+88
                      [0xffffffff828bd76d, 0xffffffff828bd799): DW_OP_breg7 RSP+88
                      [0xffffffff828bd7cb, 0xffffffff828bd8d8): DW_OP_breg7 RSP+88
                      [0xffffffff828bd91b, 0xffffffff828bd943): DW_OP_breg7 RSP+88
                      [0xffffffff828bd974, 0xffffffff828bdafe): DW_OP_breg7 RSP+88
                      [0xffffffff828bdb0e, 0xffffffff828bde6d): DW_OP_breg7 RSP+88)
                   DW_AT_name    ("prevEntropy")
                   DW_AT_decl_file       ("/home/yhs/work/bpf-next/lib/zstd/compress/zstd_compress.c")
                   DW_AT_decl_line       (2679)
                   DW_AT_type    (0x053c46a1 "const ZSTD_fseCTables_t *")

0x053c4bea:     DW_TAG_formal_parameter
                   DW_AT_location        (indexed (0x8eb) loclist = 0x00e15aa1:
                      [0xffffffff828bce99, 0xffffffff828bceb9): DW_OP_reg8 R8
                      [0xffffffff828bceb9, 0xffffffff828bd58b): DW_OP_breg7 RSP+48
                      [0xffffffff828bd5bd, 0xffffffff828bd5e8): DW_OP_breg7 RSP+48
                      [0xffffffff828bd61b, 0xffffffff828bd72d): DW_OP_breg7 RSP+48
                      [0xffffffff828bd76d, 0xffffffff828bd799): DW_OP_breg7 RSP+48
                      [0xffffffff828bd7cb, 0xffffffff828bd8aa): DW_OP_breg7 RSP+48
                      [0xffffffff828bd91b, 0xffffffff828bda06): DW_OP_entry_value(DW_OP_reg8 R8), DW_OP_stack_value
                      [0xffffffff828bda06, 0xffffffff828bda86): DW_OP_breg7 RSP+48
                      [0xffffffff828bda86, 0xffffffff828bdb0e): DW_OP_entry_value(DW_OP_reg8 R8), DW_OP_stack_value
                      [0xffffffff828bdb0e, 0xffffffff828bdbfa): DW_OP_breg7 RSP+48
                      [0xffffffff828bdbfa, 0xffffffff828bdbff): DW_OP_entry_value(DW_OP_reg8 R8), DW_OP_stack_value
                      [0xffffffff828bdbff, 0xffffffff828bde15): DW_OP_breg7 RSP+48
                      [0xffffffff828bde15, 0xffffffff828bde6d): DW_OP_entry_value(DW_OP_reg8 R8), DW_OP_stack_value)
                   DW_AT_name    ("nextEntropy")
                   DW_AT_decl_file       ("/home/yhs/work/bpf-next/lib/zstd/compress/zstd_compress.c")
                   DW_AT_decl_line       (2679)
                   DW_AT_type    (0x053c46ab "ZSTD_fseCTables_t *")

...

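In this example, the signature is not changed by compiler optimization, so there is no
'nocall' (DW_CC_nocall) attribute. But the entry registers do not conform to the ABI: the
first parameter, seqStorePtr, starts in RSI rather than RDI, and each of the following
parameters shown is likewise shifted by one register. So we have to reject such functions in BTF.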
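
To make the check concrete, here is a rough Python sketch of the comparison. It is
not pahole code: the names abi_mismatches, ABI_ARG_REGS and zstd_params are made up
for illustration, and it assumes every parameter is a scalar or pointer that takes
exactly one integer register. It compares the register in the first location-list
entry of each parameter against the register that the x86_64 System V ABI assigns
to that argument position, using the four parameters shown in the dump above.

```python
# Illustrative sketch only; this is not pahole's implementation.
# DWARF register numbers of the x86_64 System V integer argument registers,
# in argument order: RDI, RSI, RDX, RCX, R8, R9.
ABI_ARG_REGS = [5, 4, 1, 2, 8, 9]
REG_NAMES = {1: "RDX", 2: "RCX", 4: "RSI", 5: "RDI", 8: "R8", 9: "R9"}


def abi_mismatches(params):
    """Return (name, got, expected) for each parameter whose entry register
    differs from the register the ABI assigns to its position.

    params: list of (name, dwarf_reg) in declaration order, where dwarf_reg
    is the N of the DW_OP_regN in the first location-list entry.
    Assumption: each parameter occupies exactly one integer register.
    """
    mismatches = []
    for index, (name, reg) in enumerate(params):
        if index >= len(ABI_ARG_REGS):
            break  # later arguments are passed on the stack
        expected = ABI_ARG_REGS[index]
        if reg != expected:
            got = REG_NAMES.get(reg, f"reg{reg}")
            mismatches.append((name, got, REG_NAMES[expected]))
    return mismatches


# Entry registers of the four parameters shown in the dump above.
zstd_params = [
    ("seqStorePtr", 4),   # DW_OP_reg4 RSI
    ("nbSeq", 1),         # DW_OP_reg1 RDX
    ("prevEntropy", 2),   # DW_OP_reg2 RCX
    ("nextEntropy", 8),   # DW_OP_reg8 R8
]

for name, got, expected in abi_mismatches(zstd_params):
    print(f"{name}: found {got}, ABI expects {expected}")
```

With this input, all four parameters are reported as mismatched, which is the
condition under which the function would be left out of BTF. A function whose
parameters start in RDI, RSI, ... in order yields an empty list.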


Thread overview:
2026-06-23  4:07 [PATCH dwarves v7 0/5] pahole: Encode true signatures in kernel BTF Yonghong Song
2026-06-23  4:07 ` [PATCH dwarves v7 1/5] dwarf_loader: Detect aggregate ABI register usage and signature changes Yonghong Song
2026-06-23  4:07 ` [PATCH dwarves v7 2/5] dwarf_loader: Collect per-parameter information Yonghong Song
2026-06-23  4:07 ` [PATCH dwarves v7 3/5] dwarf_loader: Analyze per-parameter information for true signatures Yonghong Song
2026-06-23  4:07 ` [PATCH dwarves v7 4/5] btf_encoder: Emit true function signatures Yonghong Song
2026-06-23  4:07 ` [PATCH dwarves v7 5/5] tests: Add BTF true_signature encoding tests Yonghong Song
2026-06-23 12:28 ` [PATCH dwarves v7 0/5] pahole: Encode true signatures in kernel BTF Jiri Olsa
2026-06-23 13:11   ` Alan Maguire
2026-06-23 16:02     ` Alan Maguire
2026-06-23 16:49       ` Yonghong Song [this message]
2026-06-23 16:58       ` Yonghong Song
