From: Jiri Olsa <olsajiri@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
Yonghong Song <yhs@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@chromium.org>
Subject: Re: [RFC bpf-next] bpf: Use prog->active instead of bpf_prog_active for kprobe_multi
Date: Mon, 13 Jun 2022 14:36:38 +0200 [thread overview]
Message-ID: <YqcvVpcJR6R8MH35@krava> (raw)
In-Reply-To: <20220611205326.7ladtowtvt3ap6z3@macbook-pro-3.dhcp.thefacebook.com>
On Sat, Jun 11, 2022 at 01:53:26PM -0700, Alexei Starovoitov wrote:
> On Fri, Jun 10, 2022 at 10:58:50AM -0700, Andrii Nakryiko wrote:
> > On Thu, Jun 9, 2022 at 3:03 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Thu, Jun 9, 2022 at 11:27 AM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Tue, Jun 7, 2022 at 9:29 PM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > On Tue, May 31, 2022 at 4:24 PM Andrii Nakryiko
> > > > > <andrii.nakryiko@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, May 25, 2022 at 4:40 AM Jiri Olsa <jolsa@kernel.org> wrote:
> > > > > > >
> > > > > > > hi,
> > > > > > > Alexei suggested to use prog->active instead global bpf_prog_active
> > > > > > > for programs attached with kprobe multi [1].
> > > > > > >
> > > > > > > AFAICS this will bypass bpf_disable_instrumentation, which seems to be
> > > > > > > ok for some places like hash map update, but I'm not sure about other
> > > > > > > places, hence this is RFC post.
> > > > > > >
> > > > > > > I'm not sure how are kprobes different to trampolines in this regard,
> > > > > > > because trampolines use prog->active and it's not a problem.
> > > > > > >
> > > > > > > thoughts?
> > > > > > >
> > > > > >
> > > > > > Let's say we have two kernel functions A and B? B can be called from
> > > > > > BPF program though some BPF helper, ok? Now let's say I have two BPF
> > > > > > programs kprobeX and kretprobeX, both are attached to A and B. With
> > > > > > using prog->active instead of per-cpu bpf_prog_active, what would be
> > > > > > the behavior when A is called somewhere in the kernel.
> > > > > >
> > > > > > 1. A is called
> > > > > > 2. kprobeX is activated for A, calls some helper which eventually calls B
> > > > > > 3. kprobeX is attempted to be called for B, but is skipped due to prog->active
> > > > > > 4. B runs
> > > > > > 5. kretprobeX is activated for B, calls some helper which eventually calls B
> > > > > > 6. kprobeX is ignored (prog->active > 0)
> > > > > > 7. B runs
> > > > > > 8. kretprobeX is ignored (prog->active > 0)
> > > > > > 9. kretprobeX is activated for A, calls helper which calls B
> > > > > > 10. kprobeX is activated for B
> > > > > > 11. kprobeX is ignored (prog->active > 0)
> > > > >
> > > > > not correct. kprobeX actually runs.
> > > > > but the end result is correct.
> > > > >
> > > >
> > > > Right, it was a long sequence, but you got the idea :)
>
> The above analysis was actually incorrect.
> There are three kprobe flavors: int3, opt, ftrace.
> while multi-kprobe is based on fprobe.
> kretprobe can be traditional and rethook based.
> In all of these mechanisms there is at least ftrace_test_recursion_trylock()
> and for kprobes there is kprobe_running (per-cpu current_kprobe) filter
> that acts as bpf_prog_active.
>
> So this:
> 1. kprobeX for A
> 2. kretprobeX for B
> 3. kretprobeX for A
> 4. kprobeX for B
> doesn't seem possible.
> Unless there is reproducer of above behavior there is no point using above
> as a design argument.
yes, I just experimentally verified ;-) I have a selftest with new test
helper doing Andrii's scenario (with kprobes on ftrace) and kprobe_running
check will take care of the entry side:
if (kprobe_running()) {
kprobes_inc_nmissed_count(p);
and as a results kretprobe won't be installed as well
>
> > > > > It's awful. We have to fix it.
> > > >
> > > > You can call it "a fix" if you'd like, but it's changing a very
> > > > user-visible behavior and guarantees on which users relied for a
> > > > while. So even if we switch to per-prog protection it will have to be
> > > > some sort of opt-in (flag, new program type, whatever).
> > >
> > > No opt-in allowed for fixes and it's a bug fix.
> > > No one should rely on broken kernel behavior.
> > > If retsnoop rely on that it's really retsnoop's fault.
> >
> > No point in arguing if we can't even agree on whether this is a bug or
> > not, sorry.
> >
> > Getting kretprobe invocation out of the blue without getting
> > corresponding kprobe invocation first (both of which were successfully
> > attached) seems like more of a bug to me. But perhaps that's a matter
> > of subjective opinion.
>
> The issue of kprobe/kretprobe mismatch was known for long time.
> First maxactive was an issue. It should be solved by rethook now.
> Then kprobe/kretprobe attach is not atomic.
> bpf prog attaching kprobe and kretprobe to the same func cannot assume
> that they will always pair. bcc scripts had to deal with this.
>
> Say, kprobe/kretprobe will become fentry/fexit like with prog->active only.
> If retsnoop wants to do its own per-cpu prog_active counter it will
> prevent out-of-order fentry/fexit for the case when the same prog
> is attached to before-bpf-func and during-bpf-func. Only retsnoop's progs
> will miss during-bpf-func events. Such policy decisions is localized to one tool.
> All other users will see the events they care about.
> kprobe/kretprobe/fprobe run handlers with preemption disabled which makes
> these mechanisms unfriendly to RT. Their design shows that they're not suitable
> for always-on running. When bpf+kprobe was introduced 7 years ago it wasn't
> meant to be 24-7 either. bpf_prog_active is modeled like current_kprobe.
> It was addressing the deadlock issue with spinlocks in maps.
> Recursion was not an issue.
> Sadly kprobe/kretprobe/fprobe look unfixable in this form. Too much work
> needs to be done to enable something like:
> user A attaches prog A to func X. X runs, prog A runs with migration disabled.
> Preemption. Something else starts on this cpu. Another user B attaching prog B
> to func Y should see its prog being executed.
> With kprobes it looks impossible. While fentry was designed with this use case
> in mind. Note it's not about sleepable progs. Normal bpf progs can be preempted.
>
> Back to Jiri's question whether we can remove bpf_prog_active from
> trace_call_bpf. Yes. We can and we should. It will allow bperf to collect
> stack traces that include bpf progs. It's an important fix. Incorrect retsnoop
> assumptions about kprobes will not be affected.
which bperf tool are you talking about (I found 2)?
and given that the kprobe layer is effectively doing the bpf_prog_active check,
what's the benefit of the change then?
thanks,
jirka
next prev parent reply other threads:[~2022-06-13 15:19 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-05-25 11:40 [RFC bpf-next] bpf: Use prog->active instead of bpf_prog_active for kprobe_multi Jiri Olsa
2022-05-26 16:23 ` Yonghong Song
2022-05-31 23:24 ` Andrii Nakryiko
2022-06-08 4:29 ` Alexei Starovoitov
2022-06-08 12:32 ` Jiri Olsa
2022-06-09 18:26 ` Andrii Nakryiko
2022-06-09 22:03 ` Alexei Starovoitov
2022-06-10 17:58 ` Andrii Nakryiko
2022-06-11 20:53 ` Alexei Starovoitov
2022-06-13 12:36 ` Jiri Olsa [this message]
2022-06-13 16:32 ` Andrii Nakryiko
2022-06-13 16:38 ` Andrii Nakryiko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YqcvVpcJR6R8MH35@krava \
--to=olsajiri@gmail.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=john.fastabend@gmail.com \
--cc=kafai@fb.com \
--cc=kpsingh@chromium.org \
--cc=netdev@vger.kernel.org \
--cc=songliubraving@fb.com \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.