linux-trace-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jiri Olsa <olsajiri@gmail.com>
To: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Florent Revest <revest@chromium.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	linux-trace-kernel@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	bpf <bpf@vger.kernel.org>, Sven Schnelle <svens@linux.ibm.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Alan Maguire <alan.maguire@oracle.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH v4 3/9] bpf/btf: Add a function to search a member of a struct/union
Date: Mon, 7 Aug 2023 22:48:29 +0200	[thread overview]
Message-ID: <ZNFYnbqlyOjo5dkH@krava> (raw)
In-Reply-To: <20230804004206.9fdfae0b9270b9acca2c096f@kernel.org>

On Fri, Aug 04, 2023 at 12:42:06AM +0900, Masami Hiramatsu wrote:

SNIP

> > 
> > On the other hand, untangling all code paths that come from
> > trampolines (with a light regs structure) from those that come from an
> > exception (with a pt_regs) could lead to a lot of duplicated code, and
> > converting between each subsystem's idea of a light regs structure
> > (what if perf introduces a perf_regs now ?) would be tedious and slow
> > (lots of copies ?).
> 
> This is one discussion point I think. Actually, using pt_regs in kretprobe
> (and rethook) is histrical accident. Originally, it had put a kprobe on
> the function return trampoline to hook it. So keep the API compatiblity
> I made the hand assembled code to save the pt_regs on the stack.
> 
> My another question is if we have the fprobe to trace (hook) the function
> return, why we still need the kretprobe itself. I think we can remove
> kretprobe and use fprobe exit handler, because "function" probing will
> be done by fprobe, not kprobe. And then, we can simplify the kprobe
> interface and clarify what it is -- "kprobe is a wrapper of software
> breakpoint". And we don't need to think about duplicated code anymore :)

1+ sounds like good idea

> 
> >  
> > > Otherwise, ftrace_regs() has support on arm64 for getting to the argument
> > > registers and the stack. Even live kernel patching now uses ftrace_regs().
> > >
> > > >
> > > > If you guys decide to convert fprobe to ftrace_regs please
> > > > make it depend on kconfig or something.
> > > > bpf side needs full pt_regs.
> > 
> > Some wild ideas that I brought up once in a BPF office hour: BPF
> > "multi_kprobe" could provide a fake pt_regs (either by constructing a
> > sparse one on the stack or by JIT-ing different offset accesses and/or
> > by having the verifier deny access to unpopulated fields) or break the
> > current API (is it conceivable to phase out BPF "multi_kprobe"
> > programs in favor of BPF "fprobe" programs that don't lie about their
> > API and guarantees and just provide a ftrace_regs ?)
> 
> +1 :)

so multi_kprobe link was created to allow fast attach of BPF kprobe-type
programs to multiple functions.. I don't think there's need for new fprobe
program

> 
> > 
> > > Then use kprobes. When I asked Masami what the difference between fprobes
> > > and kprobes was, he told me that it would be that it would no longer rely
> > > on the slower FTRACE_WITH_REGS. But currently, it still does.
> > 
> > Actually... Moving fprobe to ftrace_regs should get even more spicy!
> > :) Fprobe also wraps "rethook" which is basically the same thing as
> > kretprobe: a return trampoline that saves a pt_regs, to the point that
> > on x86 kretprobe's trampoline got dropped in favor of rethook's
> > trampoline. But for the same reasons that we don't want ftrace to save
> > pt_regs on arm64, rethook should probably also just save a ftrace_regs
> > ? (also, to keep the fprobe callback signatures consistent between
> > pre- and post- handlers). But if we want fprobe "post" callbacks to
> > save a ftrace_regs now, either we need to re-introduce the kretprobe
> > trampoline or also change the API of kretprobe (and break its symmetry
> > with kprobe and we'd have the same problem all over again with BPF
> > kretprobe program types...). All of this is "beautifully" entangled...
> > :)
> 
> As I said, I would like to phase out the kretprobe itself because it
> provides the same feature of fprobe, which is confusing. jprobe was
> removed a while ago, and now kretprobe is. But we can not phase out
> it at once. So I think we will keep current kretprobe trampoline on
> arm64 and just add new ftrace_regs based rethook. Then remove the
> API next release. (after all users including systemtap is moved) 
> 
> > 
> > > The reason I started the FTRACE_WITH_ARGS (which gave us ftrace_regs) in
> > > the first place, was because of the overhead you reported to me with
> > > ftrace_regs_caller and why you wanted to go the direct trampoline approach.
> > > That's when I realized I could use a subset because those registers were
> > > already being saved. The only reason FTRACE_WITH_REGS was created was it
> > > had to supply full pt_regs (including flags) and emulate a breakpoint for
> > > the kprobes interface. But in reality, nothing really needs all that.
> > >
> > > > It's not about access to args.
> > > > pt_regs is passed from bpf prog further into all kinds of perf event
> > > > functions including stack walking.
> > 
> > If all accesses are done in BPF bytecode, we could (theoretically)
> > have the verifier and JIT work together to deny accesses to
> > unpopulated fields, or relocate pt_regs accesses to ftrace_regs
> > accesses to keep backward compatibility with existing multi_kprobe BPF
> > programs.
> 
> Yeah, that is what I would like to suggest, and what my patch does.
> (let me update rethook too, it'll be a bit tricky since I don't want
> break anything) 
> 
> Thanks,
> 
> > 
> > Is there a risk that a "multi_kprobe" program could call into a BPF
> > helper or kfunc that reads this pt_regs pointer and expect certain
> > fields to be set ? I suppose we could also deny giving that "pt_regs"
> > pointer to a helper... :/

I think Alexei answered this earlier in the thread:

 >From bpf side we don't care that such pt_regs is 100% filled in or
 >only partial as long as this pt_regs pointer is valid for perf_event_output
 >and stack walking that consume pt_regs.
 >I believe that was and still is the case for both x86 and arm64.


jirka

  parent reply	other threads:[~2023-08-07 20:48 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-31  7:30 [PATCH v4 0/9] tracing: Improbe BTF support on probe events Masami Hiramatsu (Google)
2023-07-31  7:30 ` [PATCH v4 1/9] tracing/probes: Support BTF argument on module functions Masami Hiramatsu (Google)
2023-07-31  7:30 ` [PATCH v4 2/9] bpf/btf: tracing: Move finding func-proto API and getting func-param API to BTF Masami Hiramatsu (Google)
2023-07-31  7:30 ` [PATCH v4 3/9] bpf/btf: Add a function to search a member of a struct/union Masami Hiramatsu (Google)
2023-07-31 21:59   ` Alexei Starovoitov
2023-07-31 23:57     ` Masami Hiramatsu
2023-08-01  0:29       ` Alexei Starovoitov
2023-08-01 15:02         ` Masami Hiramatsu
2023-08-01 15:20           ` Steven Rostedt
2023-08-01 15:32             ` Steven Rostedt
2023-08-01 22:18               ` Alexei Starovoitov
2023-08-01 23:09                 ` Steven Rostedt
2023-08-01 23:44                   ` Alexei Starovoitov
2023-08-02  0:21                   ` Masami Hiramatsu
2023-08-02  0:40                     ` Steven Rostedt
2023-08-02  0:44                       ` Steven Rostedt
2023-08-02  2:22                         ` Alexei Starovoitov
2023-08-02  2:32                           ` Steven Rostedt
2023-08-02 14:07                           ` Masami Hiramatsu
2023-08-02 15:08                             ` Florent Revest
2023-08-02 13:56                       ` Masami Hiramatsu
2023-08-02 14:48                         ` Florent Revest
2023-08-02 15:47                         ` Florent Revest
2023-08-03  1:55                           ` Masami Hiramatsu
2023-08-02 18:24                         ` Alexei Starovoitov
2023-08-02 18:38                           ` Steven Rostedt
2023-08-02 19:48                             ` Alexei Starovoitov
2023-08-02 20:12                               ` Steven Rostedt
2023-08-02 21:28                                 ` Alexei Starovoitov
2023-08-02 14:44                   ` Florent Revest
2023-08-02 16:11                     ` Steven Rostedt
2023-08-03 15:42                     ` Masami Hiramatsu
2023-08-03 16:37                       ` Florent Revest
2023-08-07 20:48                       ` Jiri Olsa [this message]
2023-08-08 14:32                         ` Masami Hiramatsu
2023-08-01  1:15     ` Steven Rostedt
2023-08-01  2:24       ` Alexei Starovoitov
2023-08-01 13:35         ` Steven Rostedt
2023-08-01 15:18         ` Masami Hiramatsu
2023-08-01 22:21           ` Alexei Starovoitov
2023-08-01 23:17             ` Masami Hiramatsu
2023-07-31  7:30 ` [PATCH v4 4/9] tracing/probes: Support BTF based data structure field access Masami Hiramatsu (Google)
2023-07-31  7:30 ` [PATCH v4 5/9] tracing/probes: Support BTF field access from $retval Masami Hiramatsu (Google)
2023-07-31  7:31 ` [PATCH v4 6/9] tracing/probes: Add string type check with BTF Masami Hiramatsu (Google)
2023-07-31  7:31 ` [PATCH v4 7/9] tracing/fprobe-event: Assume fprobe is a return event by $retval Masami Hiramatsu (Google)
2023-07-31  7:31 ` [PATCH v4 8/9] selftests/ftrace: Add BTF fields access testcases Masami Hiramatsu (Google)
2023-07-31  7:31 ` [PATCH v4 9/9] Documentation: tracing: Update fprobe event example with BTF field Masami Hiramatsu (Google)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZNFYnbqlyOjo5dkH@krava \
    --to=olsajiri@gmail.com \
    --cc=acme@kernel.org \
    --cc=alan.maguire@oracle.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=martin.lau@linux.dev \
    --cc=mhiramat@kernel.org \
    --cc=peterz@infradead.org \
    --cc=revest@chromium.org \
    --cc=rostedt@goodmis.org \
    --cc=svens@linux.ibm.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).