From: Kris Van Hees <kris.van.hees@oracle.com>
To: Alan Maguire <alan.maguire@oracle.com>
Cc: Kris Van Hees <kris.van.hees@oracle.com>,
dtrace@lists.linux.dev, dtrace-devel@oss.oracle.com
Subject: Re: [PATCH v3 dtrace 0/4] kprobe support for .isra.0, sched fix
Date: Fri, 18 Oct 2024 14:38:17 -0400 [thread overview]
Message-ID: <ZxKrGURKZm+ARD8F@oracle.com> (raw)
In-Reply-To: <d8aeeabb-fc53-44a6-9f8c-c5b3a1d8ae40@oracle.com>
On Fri, Oct 18, 2024 at 05:15:59PM +0100, Alan Maguire wrote:
> On 18/10/2024 16:41, Kris Van Hees wrote:
> > Before I review further, I have a question... Do we need to consider the
> > <func>.<suffix> symbols as separate probes from <func> (at a user level),
> > or can we group them together? I am hoping that grouping them together
> > would be the preference if only because the suffix versions result from
> > compiler optimizations and it is therefore likely that a user would want to
> > be able to probe <func> and expect it to work even if the compiler decided
> > to do something under the covers that results in a suffix-variant to also
> > be created.
> >
>
> Right, that's one of the changes in this patch series versus v2. We
> expose the probe as <func>, even if it the underlying function has
> become <func>.<suffix>. This provides more stability to the user and we
> are guaranteed it still represents a function boundary because of its
> presence in available_filter_functions (this implies it has an
> associated mcount-tagged fentry site).
Good - then I am on the right track. Continuing review - focusing mostly on
whether we currently properly handle multiple FBT probes with the same fully
qualified probe name.
> > On Wed, Oct 16, 2024 at 04:54:05PM +0100, Alan Maguire wrote:
> >> This series is focused on solving a few issues with fprobe-based
> >> attachment which prevent us being able to attach to functions
> >> like finish_task_switch.isra.0. Such functions are present in
> >> available_filter_functions, and represent real function boundaries
> >> (since they correspond to the mcount function boundary sites)
> >> but because they either lack BTF representations, or because
> >> those BTF representations are named without the .isra suffix, attach
> >> via fentry/fexit is currently impossible. Falling back to the
> >> kprobe implementation is the best solution here.
> >>
> >> However, for stability, it is best to represent the probes for
> >> these functions without the ".isra" suffix, so we need to store
> >> the full function name (with suffix) in the tracepoint data when
> >> the probe is populated. Patch 1 supports this.
> >>
> >> Patch 2 ensures that we use kprobe implementation for any "."-suffixed
> >> functions. An additional fbt provider with kprobe implementation is
> >> created to support this (so as not to disturb existing fprobes for other
> >> functions). At kprobe attach we use the full function name stored
> >> as tp event data to carry out attach.
> >>
> >> Next we need to ensure we do not end up with a mix of kprobes and
> >> fprobes. Ideally we would do this in a more fine-grained manner, but
> >> for now just ensure we do not have an fprobe/kprobe mix program-wide.
> >> When fprobes are active, we will only use kprobes for "."-suffixed
> >> functions that are used, so in practice such mixes will be relatively
> >> rare.
> >>
> >> As Kris pointed out [1] at compilation time, trampolines have not yet been
> >> set up, so we can replace the provider underlying fbt at that time.
> >> The probe_info() callbacks are used to check for a mix of kprobe and
> >> fprobe implementations; we check for multiple fbt providers which
> >> have a count of used probes > 0; if this occurs, switch the fbt provider
> >> using fprobe to use the kprobe implementation and reset any event
> >> ids associated with fprobes from the BTF id used in fprobes to 0.
> >>
> >> Finally we can then use fbt::finish_task_switch:return as the
> >> dependent probe for sched:::on-cpu, as we now can probe it even
> >> if it becomes finish_task_switch.isra.0.
> >>
> >> So to recap:
> >>
> >> Patch 1 supports storing/freeing event data with tp events.
> >> Patch 2 allows tracing of "."-suffixed functions like
> >> finish_task_switch.isra.0 via a kprobe-backed fbt implementation.
> >> Patch 3 ensures we do not end up with a kprobe/fprobe mix.
> >> Patch 4 then uses the fact we can now trace "."-suffixed functions
> >> (with kprobe fallback) by using fbt:vmlinux:finish_task_switch:return
> >> as the kprobe dependent event for sched:::on-cpu . This function is
> >> often optimized to become finish_task_switch.isra.0.
> >>
> >> Tested on upstream, 5.15 and 5.4 kernels.
> >>
> >> Changes since v2:
> >>
> >> - probe function name exposed drops the suffix (Kris, patches 1, 2)
> >> - restrict kprobe use to "."-suffixed functions; this makes their use
> >> less likely in the fprobe environment. Do this instead of creating
> >> a "fake" fprobe probe with kprobe backing (Kris, patch 2)
> >> - modify fallback logic to handle kprobe/fprobe mix (patch 3)
> >> - modify sched:::on-cpu to use fbt::finish_task_switch:return ; no
> >> wildcard needed now that probe function name is unsuffixed.
> >>
> >> Changes since v1:
> >>
> >> - simplified approach by just swapping out probe impl when BTF lookup fails
> >> (Kris, patch 2)
> >>
> >> [1] https://lore.kernel.org/dtrace/20241009140236.883884-1-alan.maguire@oracle.com/
> >>
> >> Alan Maguire (4):
> >> dt_provider_tp: add optional event data, freed on tp free
> >> fbt: support "."-suffixed functions for kprobes
> >> fbt: avoid mix of kprobe, fprobe implementations for used probes
> >> sched: fix on-cpu firing for kernels < 5.16
> >>
> >> libdtrace/dt_prov_fbt.c | 138 ++++++++++++++++++++++++++++++++-----
> >> libdtrace/dt_prov_sched.c | 23 +------
> >> libdtrace/dt_provider_tp.c | 27 ++++++++
> >> libdtrace/dt_provider_tp.h | 8 +++
> >> 4 files changed, 158 insertions(+), 38 deletions(-)
> >>
> >> --
> >> 2.43.5
> >>
prev parent reply other threads:[~2024-10-18 18:38 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-16 15:54 [PATCH v3 dtrace 0/4] kprobe support for .isra.0, sched fix Alan Maguire
2024-10-16 15:54 ` [PATCH v3 dtrace 1/4] dt_provider_tp: add optional event data, freed on tp free Alan Maguire
2024-11-14 5:26 ` [DTrace-devel] " Kris Van Hees
2024-11-14 16:45 ` Alan Maguire
2024-10-16 15:54 ` [PATCH v3 dtrace 2/4] fbt: support "."-suffixed functions for kprobes Alan Maguire
2024-11-28 2:22 ` Kris Van Hees
2024-11-28 9:58 ` Alan Maguire
2024-10-16 15:54 ` [PATCH v3 dtrace 3/4] fbt: avoid mix of kprobe, fprobe implementations for used probes Alan Maguire
2024-10-16 15:54 ` [PATCH v3 dtrace 4/4] sched: fix on-cpu firing for kernels < 5.16 Alan Maguire
2024-10-18 15:41 ` [PATCH v3 dtrace 0/4] kprobe support for .isra.0, sched fix Kris Van Hees
2024-10-18 16:15 ` Alan Maguire
2024-10-18 18:38 ` Kris Van Hees [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZxKrGURKZm+ARD8F@oracle.com \
--to=kris.van.hees@oracle.com \
--cc=alan.maguire@oracle.com \
--cc=dtrace-devel@oss.oracle.com \
--cc=dtrace@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.