public inbox for dwarves@vger.kernel.org
 help / color / mirror / Atom feed
From: Alan Maguire <alan.maguire@oracle.com>
To: Yonghong Song <yonghong.song@linux.dev>,
	Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>,
	dwarves@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	bpf@vger.kernel.org, David Faust <david.faust@oracle.com>,
	"Jose E . Marchesi" <jose.marchesi@oracle.com>,
	kernel-team@fb.com
Subject: Re: [PATCH dwarves 0/3] pahole: Replace or add functions with true signatures in btf
Date: Thu, 13 Nov 2025 16:45:24 +0000	[thread overview]
Message-ID: <a9ebf236-78c8-439a-b427-cb817efe23ae@oracle.com> (raw)
In-Reply-To: <20251111170424.286892-1-yonghong.song@linux.dev>

On 11/11/2025 17:04, Yonghong Song wrote:
> Current vmlinux BTF encoding is based on the source level signatures.
> But the compiler may do some optimization and changed the signature.
> If the user tried with source level signature, their initial implementation
> may have wrong results and then the user need to check what is the
> problem and work around it, e.g. through kprobe since kprobe does not
> need vmlinux BTF. 
> 
> The following is a concrete example for [1].
> The original source signature:
>   typedef struct {
>         union {
>                 void            *kernel;
>                 void __user     *user;
>         };
>         bool            is_kernel : 1;
>   } sockptr_t;
>   typedef sockptr_t bpfptr_t;
>   static int map_create(union bpf_attr *attr, bpfptr_t uattr) { ... }
> After compiler optimization, the signature becomes:
>   static int map_create(union bpf_attr *attr, bool uattr__coerce1) { ... }
>   
> In the above, uattr__coerce1 corresponds to 'is_kernel' field in sockptr_t.
> Here, the suffix '__coerce1' refers to the second 64bit value in
> sockptr_t. The first 64bit value will be '__coerce0' if that value
> is used instead.
> 
> To do proper tracing, it would be good for the users to know the
> changed signature. With the actual signature, both kprobe and fentry
> should work as usual. This can avoid user surprise and improve
> developer productivity.
> 
> The llvm compiler patch [1] collects true signature and encoded those
> functions in dwarf. pahole will process these functions and
> replace old signtures with true signatures. Additionally,
> new functions (e.g., foo.llvm.<hash>) can be encoded in
> vmlinux BTF as well.
> 
> Patches 1/2 are refactor patches. Patch 3 has the detailed explanation
> in commit message and implements the logic to encode replaced or new
> signatures to vmlinux BTF. Please see Patch 3 for details.
>


Thanks for sending the series Yonghong! I think the thing we need to
discuss at a high level is this; what is the proposed relationship
between source code and BTF function encoding? The approach we have
taken thus far is to use source level as the basis for encoding, and as
part of that we attempt to identify cases where the source-level
expectations are violated by the compiled (optimized) code. We currently
do not encode those cases as in the case of optimized-out parameters,
source-level expectations of parameter position could lead to bad
behaviour. There are probably cases we miss in this, but that is the
intent at least.

There are however cases where .isra-suffixed functions retain the
expected parameter representations; in such cases we encode with the
prefix name ("foo" not "foo.isra.0") as DWARF does.

So in moving away from that, I think we need to make a clear decision
and have handling in place. My practical worry is that users trying to
write BPF progs cannot easily predict if a parameter is optimized out
and so on, so it's hard to write stable BPF programs for such
signatures. Less of a problem if using a high-level tracer I suppose.

The approach I had been thinking about was to utilize BTF location
information for such cases, but the RFC [1] didn't get around to
implementing the support. So the idea would be have location info with
parameter types and locations, but because we don't encode a function
fentry can't be used (but kprobes still could as for inline sites). So
under that scheme the foo.llvm.hash functions could still be called
"foo" since we have address information for the sites we can match foo
to foo.llvm.hash.

Anyway I'd appreciate other perspectives here. We have implicitly tied
BTF function encoding thus for to source-level representation for
reasons of fentry safety, but we could potentially move away from that.
Doing so though would I think at a minimum require machinery for fentry
safety to preserved, but we could find other ways to flag this in the
BTF function representation potentially. Thanks!

Alan


[1]
https://lore.kernel.org/bpf/20251024073328.370457-1-alan.maguire@oracle.com/

>   [1] https://github.com/llvm/llvm-project/pull/165310
> 
> Yonghong Song (3):
>   btf_encoder: Refactor elf_functions__new() with struct btf_encoder as
>     argument
>   bpf_encoder: Refactor a helper elf_function__check_and_push_sym()
>   pahole: Replace or add functions with true signatures in btf
> 
>  btf_encoder.c  | 79 +++++++++++++++++++++++++++++++++++---------
>  dwarf_loader.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  dwarves.h      |  4 +++
>  pahole.c       |  9 +++++
>  4 files changed, 165 insertions(+), 16 deletions(-)
> 


  parent reply	other threads:[~2025-11-13 16:45 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-11 17:04 [PATCH dwarves 0/3] pahole: Replace or add functions with true signatures in btf Yonghong Song
2025-11-11 17:04 ` [PATCH dwarves 1/3] btf_encoder: Refactor elf_functions__new() with struct btf_encoder as argument Yonghong Song
2025-11-11 17:04 ` [PATCH dwarves 2/3] bpf_encoder: Refactor a helper elf_function__check_and_push_sym() Yonghong Song
2025-11-11 17:04 ` [PATCH dwarves 3/3] pahole: Replace or add functions with true signatures in btf Yonghong Song
2025-11-13 16:45 ` Alan Maguire [this message]
2025-11-13 17:36   ` [PATCH dwarves 0/3] " Alexei Starovoitov
2025-11-14 15:57     ` Alan Maguire
2025-11-14 20:11       ` Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a9ebf236-78c8-439a-b427-cb817efe23ae@oracle.com \
    --to=alan.maguire@oracle.com \
    --cc=andrii@kernel.org \
    --cc=arnaldo.melo@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=david.faust@oracle.com \
    --cc=dwarves@vger.kernel.org \
    --cc=jose.marchesi@oracle.com \
    --cc=kernel-team@fb.com \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox