From: Jiri Olsa <olsajiri@gmail.com>
To: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Florent Revest <revest@chromium.org>,
Steven Rostedt <rostedt@goodmis.org>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>,
linux-trace-kernel@vger.kernel.org,
LKML <linux-kernel@vger.kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
bpf <bpf@vger.kernel.org>, Sven Schnelle <svens@linux.ibm.com>,
Alexei Starovoitov <ast@kernel.org>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Alan Maguire <alan.maguire@oracle.com>,
Mark Rutland <mark.rutland@arm.com>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH v4 3/9] bpf/btf: Add a function to search a member of a struct/union
Date: Mon, 7 Aug 2023 22:48:29 +0200 [thread overview]
Message-ID: <ZNFYnbqlyOjo5dkH@krava> (raw)
In-Reply-To: <20230804004206.9fdfae0b9270b9acca2c096f@kernel.org>
On Fri, Aug 04, 2023 at 12:42:06AM +0900, Masami Hiramatsu wrote:
SNIP
> >
> > On the other hand, untangling all code paths that come from
> > trampolines (with a light regs structure) from those that come from an
> > exception (with a pt_regs) could lead to a lot of duplicated code, and
> > converting between each subsystem's idea of a light regs structure
> > (what if perf introduces a perf_regs now ?) would be tedious and slow
> > (lots of copies ?).
>
> This is one discussion point I think. Actually, using pt_regs in kretprobe
> (and rethook) is histrical accident. Originally, it had put a kprobe on
> the function return trampoline to hook it. So keep the API compatiblity
> I made the hand assembled code to save the pt_regs on the stack.
>
> My another question is if we have the fprobe to trace (hook) the function
> return, why we still need the kretprobe itself. I think we can remove
> kretprobe and use fprobe exit handler, because "function" probing will
> be done by fprobe, not kprobe. And then, we can simplify the kprobe
> interface and clarify what it is -- "kprobe is a wrapper of software
> breakpoint". And we don't need to think about duplicated code anymore :)
1+ sounds like good idea
>
> >
> > > Otherwise, ftrace_regs() has support on arm64 for getting to the argument
> > > registers and the stack. Even live kernel patching now uses ftrace_regs().
> > >
> > > >
> > > > If you guys decide to convert fprobe to ftrace_regs please
> > > > make it depend on kconfig or something.
> > > > bpf side needs full pt_regs.
> >
> > Some wild ideas that I brought up once in a BPF office hour: BPF
> > "multi_kprobe" could provide a fake pt_regs (either by constructing a
> > sparse one on the stack or by JIT-ing different offset accesses and/or
> > by having the verifier deny access to unpopulated fields) or break the
> > current API (is it conceivable to phase out BPF "multi_kprobe"
> > programs in favor of BPF "fprobe" programs that don't lie about their
> > API and guarantees and just provide a ftrace_regs ?)
>
> +1 :)
so multi_kprobe link was created to allow fast attach of BPF kprobe-type
programs to multiple functions.. I don't think there's need for new fprobe
program
>
> >
> > > Then use kprobes. When I asked Masami what the difference between fprobes
> > > and kprobes was, he told me that it would be that it would no longer rely
> > > on the slower FTRACE_WITH_REGS. But currently, it still does.
> >
> > Actually... Moving fprobe to ftrace_regs should get even more spicy!
> > :) Fprobe also wraps "rethook" which is basically the same thing as
> > kretprobe: a return trampoline that saves a pt_regs, to the point that
> > on x86 kretprobe's trampoline got dropped in favor of rethook's
> > trampoline. But for the same reasons that we don't want ftrace to save
> > pt_regs on arm64, rethook should probably also just save a ftrace_regs
> > ? (also, to keep the fprobe callback signatures consistent between
> > pre- and post- handlers). But if we want fprobe "post" callbacks to
> > save a ftrace_regs now, either we need to re-introduce the kretprobe
> > trampoline or also change the API of kretprobe (and break its symmetry
> > with kprobe and we'd have the same problem all over again with BPF
> > kretprobe program types...). All of this is "beautifully" entangled...
> > :)
>
> As I said, I would like to phase out the kretprobe itself because it
> provides the same feature of fprobe, which is confusing. jprobe was
> removed a while ago, and now kretprobe is. But we can not phase out
> it at once. So I think we will keep current kretprobe trampoline on
> arm64 and just add new ftrace_regs based rethook. Then remove the
> API next release. (after all users including systemtap is moved)
>
> >
> > > The reason I started the FTRACE_WITH_ARGS (which gave us ftrace_regs) in
> > > the first place, was because of the overhead you reported to me with
> > > ftrace_regs_caller and why you wanted to go the direct trampoline approach.
> > > That's when I realized I could use a subset because those registers were
> > > already being saved. The only reason FTRACE_WITH_REGS was created was it
> > > had to supply full pt_regs (including flags) and emulate a breakpoint for
> > > the kprobes interface. But in reality, nothing really needs all that.
> > >
> > > > It's not about access to args.
> > > > pt_regs is passed from bpf prog further into all kinds of perf event
> > > > functions including stack walking.
> >
> > If all accesses are done in BPF bytecode, we could (theoretically)
> > have the verifier and JIT work together to deny accesses to
> > unpopulated fields, or relocate pt_regs accesses to ftrace_regs
> > accesses to keep backward compatibility with existing multi_kprobe BPF
> > programs.
>
> Yeah, that is what I would like to suggest, and what my patch does.
> (let me update rethook too, it'll be a bit tricky since I don't want
> break anything)
>
> Thanks,
>
> >
> > Is there a risk that a "multi_kprobe" program could call into a BPF
> > helper or kfunc that reads this pt_regs pointer and expect certain
> > fields to be set ? I suppose we could also deny giving that "pt_regs"
> > pointer to a helper... :/
I think Alexei answered this earlier in the thread:
>From bpf side we don't care that such pt_regs is 100% filled in or
>only partial as long as this pt_regs pointer is valid for perf_event_output
>and stack walking that consume pt_regs.
>I believe that was and still is the case for both x86 and arm64.
jirka
next prev parent reply other threads:[~2023-08-07 20:48 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-31 7:30 [PATCH v4 0/9] tracing: Improbe BTF support on probe events Masami Hiramatsu (Google)
2023-07-31 7:30 ` [PATCH v4 1/9] tracing/probes: Support BTF argument on module functions Masami Hiramatsu (Google)
2023-07-31 7:30 ` [PATCH v4 2/9] bpf/btf: tracing: Move finding func-proto API and getting func-param API to BTF Masami Hiramatsu (Google)
2023-07-31 7:30 ` [PATCH v4 3/9] bpf/btf: Add a function to search a member of a struct/union Masami Hiramatsu (Google)
2023-07-31 21:59 ` Alexei Starovoitov
2023-07-31 23:57 ` Masami Hiramatsu
2023-08-01 0:29 ` Alexei Starovoitov
2023-08-01 15:02 ` Masami Hiramatsu
2023-08-01 15:20 ` Steven Rostedt
2023-08-01 15:32 ` Steven Rostedt
2023-08-01 22:18 ` Alexei Starovoitov
2023-08-01 23:09 ` Steven Rostedt
2023-08-01 23:44 ` Alexei Starovoitov
2023-08-02 0:21 ` Masami Hiramatsu
2023-08-02 0:40 ` Steven Rostedt
2023-08-02 0:44 ` Steven Rostedt
2023-08-02 2:22 ` Alexei Starovoitov
2023-08-02 2:32 ` Steven Rostedt
2023-08-02 14:07 ` Masami Hiramatsu
2023-08-02 15:08 ` Florent Revest
2023-08-02 13:56 ` Masami Hiramatsu
2023-08-02 14:48 ` Florent Revest
2023-08-02 15:47 ` Florent Revest
2023-08-03 1:55 ` Masami Hiramatsu
2023-08-02 18:24 ` Alexei Starovoitov
2023-08-02 18:38 ` Steven Rostedt
2023-08-02 19:48 ` Alexei Starovoitov
2023-08-02 20:12 ` Steven Rostedt
2023-08-02 21:28 ` Alexei Starovoitov
2023-08-02 14:44 ` Florent Revest
2023-08-02 16:11 ` Steven Rostedt
2023-08-03 15:42 ` Masami Hiramatsu
2023-08-03 16:37 ` Florent Revest
2023-08-07 20:48 ` Jiri Olsa [this message]
2023-08-08 14:32 ` Masami Hiramatsu
2023-08-01 1:15 ` Steven Rostedt
2023-08-01 2:24 ` Alexei Starovoitov
2023-08-01 13:35 ` Steven Rostedt
2023-08-01 15:18 ` Masami Hiramatsu
2023-08-01 22:21 ` Alexei Starovoitov
2023-08-01 23:17 ` Masami Hiramatsu
2023-07-31 7:30 ` [PATCH v4 4/9] tracing/probes: Support BTF based data structure field access Masami Hiramatsu (Google)
2023-07-31 7:30 ` [PATCH v4 5/9] tracing/probes: Support BTF field access from $retval Masami Hiramatsu (Google)
2023-07-31 7:31 ` [PATCH v4 6/9] tracing/probes: Add string type check with BTF Masami Hiramatsu (Google)
2023-07-31 7:31 ` [PATCH v4 7/9] tracing/fprobe-event: Assume fprobe is a return event by $retval Masami Hiramatsu (Google)
2023-07-31 7:31 ` [PATCH v4 8/9] selftests/ftrace: Add BTF fields access testcases Masami Hiramatsu (Google)
2023-07-31 7:31 ` [PATCH v4 9/9] Documentation: tracing: Update fprobe event example with BTF field Masami Hiramatsu (Google)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZNFYnbqlyOjo5dkH@krava \
--to=olsajiri@gmail.com \
--cc=acme@kernel.org \
--cc=alan.maguire@oracle.com \
--cc=alexei.starovoitov@gmail.com \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=martin.lau@linux.dev \
--cc=mhiramat@kernel.org \
--cc=peterz@infradead.org \
--cc=revest@chromium.org \
--cc=rostedt@goodmis.org \
--cc=svens@linux.ibm.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).