From: Jiri Olsa <jolsa@redhat.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: "Steven Rostedt" <rostedt@goodmis.org>,
"Masami Hiramatsu" <mhiramat@kernel.org>,
"Andrii Nakryiko" <andrii.nakryiko@gmail.com>,
"Jiri Olsa" <jolsa@kernel.org>,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Andrii Nakryiko" <andriin@fb.com>,
Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
"Martin KaFai Lau" <kafai@fb.com>,
"Song Liu" <songliubraving@fb.com>, "Yonghong Song" <yhs@fb.com>,
"John Fastabend" <john.fastabend@gmail.com>,
"KP Singh" <kpsingh@chromium.org>, "Daniel Xu" <dxu@dxuuu.xyz>,
"Jesper Brouer" <jbrouer@redhat.com>,
"Toke Høiland-Jørgensen" <toke@redhat.com>,
"Viktor Malik" <vmalik@redhat.com>
Subject: Re: [PATCHv2 RFC bpf-next 0/7] bpf: Add support for ftrace probe
Date: Wed, 21 Apr 2021 15:40:37 +0200 [thread overview]
Message-ID: <YIArVa6IE37vsazU@krava> (raw)
In-Reply-To: <CAADnVQLh3tCWi=TiWnJVaMrYhJ=j-xSrJ72+XnZDP8CMZM+1mQ@mail.gmail.com>
On Tue, Apr 20, 2021 at 04:38:45PM -0700, Alexei Starovoitov wrote:
SNIP
> > >
> > > I don't see how you can do it without BTF.
> > > The mass-attach feature should prepare generic 6 or so arguments
> > > from all functions it attached to.
> > > On x86-64 it's trivial because 6 regs are the same.
> > > On arm64 is now more challenging since return value regs overlaps with
> > > first argument, so bpf trampoline (when it's ready for arm64) will look
> > > a bit different than bpf trampoline on x86-64 to preserve arg0, arg1,
> > > ..arg6, ret
> > > 64-bit values that bpf prog expects to see.
> > > On x86-32 it's even more trickier, since the same 6 args need to be copied
> > > from a combination of regs and stack.
> > > This is not some hypothetical case. We already use BTF in x86-32 JIT
> > > and btf_func_model was introduced specifically to handle such cases.
> > > So I really don't see how ftrace can do that just yet. It has to understand BTF
> > > of all of the funcs it attaches to otherwise it's just saving all regs.
> > > That approach was a pain to deal with.
> >
> > ok, my idea was to get regs from the ftrace and have arch specific code
> > to prepare 6 (or less) args for ebpf program.. that part would be
> > already in bpf code
> >
> > so you'd like to see this functionality directly in ftrace, so we don't
> > save unneeded regs, is that right?
>
> What do you mean by "already in bpf code" ?
that it would not be part of ftrace code
>
> The main question is an api across layers.
> If ftrace doesn't use BTF it has to prepare all regs that could be used.
> Meaning on x86-64 that has to be 6 regs for args, 1 reg for return and
> stack pointer.
> That would be enough to discover input args and return value in fexit.
> On arm64 that has to be similar, but while x86-64 can do with single pt_regs
> where %rax is updated on fexit, arm64 cannot do so, since the same register
> is used as arg1 and as a return value.
> The most generic api between ftrace and bpf layers would be two sets of
> pt_regs. One on entry and one on exit, but that's going to be very expensive.
that's what I was going for and I think it's the only way if
we use ftrace graph_ops for mass attaching
> On x86-32 it would have to be 3 regs plus stack pointer and another 2 regs
> to cover all input args and return value.
> So there will be plenty of per-arch differences.
>
> Jiri, if you're thinking of a bpf helper like:
> u64 bpf_read_argN(pt_regs, ip, arg_num)
> that will do lookup of btf_id from ip, then it will parse btf_id and
> function proto,
> then it will translate that to btf_func_model and finally will extract the right
> argument value from a combination of stack and regs ?
> That's doable, but it's a lot of run-time overhead.
> It would be usable by bpf progs that don't care much about run-time perf
> and don't care that they're not usable 24/7 on production systems.
> Such tools exist and they're useful,
> but I'd like this mass-attach facility to be usable everywhere
> including the production and 24/7 tracing.
I did not think of this option, but yep, seems also expensive
> Hence I think it's better to do this per-arch translation during bpf
> prog attach.
> That's exactly what bpf trampoline is doing.
> Currently it's doing for single btf_id, single trampoline, and single bpf prog.
> To make the same logic work across N attach points the trampoline logic
> would need to iterate all btf_func_model-s of all btf_id-s and generate
> M trampolines (where M < N) for a combination of possible argument passing.
> On x86-64 the M will be equal to 1. On arm64 it will be equal to 1 as well.
> But on x86-32 it will depend on a set of btf_ids. It could be 1,2,..10.
> Since bpf doesn't allow to attach to struct-by-value it's only 32-bit and 64-bit
> integers to deal with and number of combinations of possible calling conventions
> is actually very small. I suspect it won't be more than 10.
> This way there will be no additional run-time overhead and bpf programs
> can be portable. They will work as-is on x86-64, x86-32, arm64.
> Just like fentry/fexit work today. Or rather they will be portable
> when bpf trampoline is supported on these archs.
> This portability is the key feature of bpf trampoline design. The bpf trampoline
> was implemented for x86-64 only so far. Arm64 patches are still wip.
> btf_func_model is used by both x86-64 and x86-32 JITs.
ok, I understand why this would be the best solution for calling
the program from multiple probes
I think it's the 'attach' layer which is the source of problems
currently there is ftrace's fgraph_ops support that allows fast mass
attach and calls callbacks for functions entry and exit:
https://lore.kernel.org/lkml/20190525031633.811342628@goodmis.org/
these callbacks get ip/parent_ip and can get pt_regs (that's not
implemented at the moment)
but that gets us to the situation of having full pt_regs on both
entry/exit callbacks that you described above and want to avoid,
but I think it's the price for having this on top of generic
tracing layer
the way ftrace's fgraph_ops is implemented, I'm not sure it can
be as fast as current bpf entry/exit trampoline
but to better understand the pain points I think I'll try to implement
the 'mass trampolines' call to the bpf program you described above and
attach it for now to fgraph_ops callbacks
perhaps this is a good topic to discuss in one of the Thursday's BPF mtg?
thanks,
jirka
next prev parent reply other threads:[~2021-04-21 13:41 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-13 12:15 [PATCHv2 RFC bpf-next 0/7] bpf: Add support for ftrace probe Jiri Olsa
2021-04-13 12:15 ` [PATCHv2 RFC bpf-next 1/7] bpf: Move bpf_prog_start/end functions to generic place Jiri Olsa
2021-04-13 12:15 ` [PATCHv2 RFC bpf-next 2/7] bpf: Add bpf_functions object Jiri Olsa
2021-04-13 12:15 ` [PATCHv2 RFC bpf-next 3/7] bpf: Add support to attach program to ftrace probe Jiri Olsa
2021-04-13 12:15 ` [PATCHv2 RFC bpf-next 4/7] libbpf: Add btf__find_by_pattern_kind function Jiri Olsa
2021-04-13 12:15 ` [PATCHv2 RFC bpf-next 5/7] libbpf: Add support to load and attach ftrace probe Jiri Olsa
2021-04-13 12:15 ` [PATCHv2 RFC bpf-next 6/7] selftests/bpf: Add ftrace probe to fentry test Jiri Olsa
2021-04-13 12:15 ` [PATCHv2 RFC bpf-next 7/7] selftests/bpf: Add ftrace probe test Jiri Olsa
2021-04-14 1:04 ` [PATCHv2 RFC bpf-next 0/7] bpf: Add support for ftrace probe Andrii Nakryiko
2021-04-14 12:19 ` Jiri Olsa
2021-04-14 22:46 ` Andrii Nakryiko
2021-04-15 14:00 ` Jiri Olsa
2021-04-15 15:10 ` Steven Rostedt
2021-04-15 17:39 ` Jiri Olsa
2021-04-15 18:18 ` Steven Rostedt
2021-04-15 18:21 ` Steven Rostedt
2021-04-15 21:49 ` Jiri Olsa
2021-04-15 23:30 ` Steven Rostedt
2021-04-19 20:51 ` Jiri Olsa
2021-04-19 22:00 ` Steven Rostedt
2021-04-15 18:31 ` Yonghong Song
2021-04-15 20:45 ` Andrii Nakryiko
2021-04-15 21:00 ` Steven Rostedt
2021-04-16 15:03 ` Masami Hiramatsu
2021-04-16 16:48 ` Steven Rostedt
2021-04-19 14:29 ` Masami Hiramatsu
2021-04-20 12:51 ` Jiri Olsa
2021-04-20 15:33 ` Alexei Starovoitov
2021-04-20 16:33 ` Steven Rostedt
2021-04-20 16:52 ` Jiri Olsa
2021-04-20 23:38 ` Alexei Starovoitov
2021-04-21 13:40 ` Jiri Olsa [this message]
2021-04-21 14:05 ` Steven Rostedt
2021-04-21 18:52 ` Andrii Nakryiko
2021-04-21 19:18 ` Jiri Olsa
2021-04-22 14:24 ` Steven Rostedt
2021-04-21 21:37 ` Jiri Olsa
2021-04-22 1:17 ` Steven Rostedt
2021-04-20 4:51 ` Andrii Nakryiko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YIArVa6IE37vsazU@krava \
--to=jolsa@redhat.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii.nakryiko@gmail.com \
--cc=andriin@fb.com \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=dxu@dxuuu.xyz \
--cc=jbrouer@redhat.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kafai@fb.com \
--cc=kpsingh@chromium.org \
--cc=mhiramat@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=rostedt@goodmis.org \
--cc=songliubraving@fb.com \
--cc=toke@redhat.com \
--cc=vmalik@redhat.com \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).