public inbox for dtrace@lists.linux.dev
 help / color / mirror / Atom feed
From: Alan Maguire <alan.maguire@oracle.com>
To: dtrace@lists.linux.dev
Cc: dtrace-devel@oss.oracle.com, Alan Maguire <alan.maguire@oracle.com>
Subject: [PATCH v3 dtrace 0/4] kprobe support for .isra.0, sched fix
Date: Wed, 16 Oct 2024 16:54:05 +0100	[thread overview]
Message-ID: <20241016155409.4038017-1-alan.maguire@oracle.com> (raw)

This series is focused on solving a few issues with fprobe-based
attachment which prevent us being able to attach to functions
like finish_task_switch.isra.0.  Such functions are present in
available_filter_functions, and represent real function boundaries
(since they correspond to the mcount function boundary sites)
but because they either lack BTF representations, or because
those BTF representations are named without the .isra suffix, attach
via fentry/fexit is currently impossible.  Falling back to the
kprobe implementation is the best solution here.

However, for stability, it is best to represent the probes for
these functions without the ".isra" suffix, so we need to store
the full function name (with suffix) in the tracepoint data when
the probe is populated.  Patch 1 supports this.

Patch 2 ensures that we use kprobe implementation for any "."-suffixed
functions.  An additional fbt provider with kprobe implementation is
created to support this (so as not to  disturb existing fprobes for other
functions).  At kprobe attach we use the full function name stored
as tp event data to carry out attach.

Next we need to ensure we do not end up with a mix of kprobes and
fprobes.  Ideally we would do this in a more fine-grained manner, but
for now just ensure we do not have an fprobe/kprobe mix program-wide.
When fprobes are active, we will only use kprobes for "."-suffixed
functions that are used, so in practice such mixes will be relatively
rare.

As Kris pointed out [1] at compilation time, trampolines have not yet been
set up, so we can replace the provider underlying fbt at that time.
The probe_info() callbacks are used to check for a mix of kprobe and
fprobe implementations; we check for multiple fbt providers which
have a count of used probes > 0; if this occurs, switch the fbt provider
using fprobe to use the kprobe implementation and reset any event
ids associated with fprobes from the BTF id used in fprobes to 0.

Finally we can then use fbt::finish_task_switch:return as the
dependent probe for sched:::on-cpu, as we now can probe it even
if it becomes finish_task_switch.isra.0.

So to recap:

Patch 1 supports storing/freeing event data with tp events.
Patch 2 allows tracing of "."-suffixed functions like
finish_task_switch.isra.0 via a kprobe-backed fbt implementation.
Patch 3 ensures we do not end up with a kprobe/fprobe mix.
Patch 4 then uses the fact we can now trace "."-suffixed functions
(with kprobe fallback) by using fbt:vmlinux:finish_task_switch:return
as the kprobe dependent event for sched:::on-cpu  . This function is
often optimized to become finish_task_switch.isra.0.

Tested on upstream, 5.15 and 5.4 kernels.

Changes since v2:

- probe function name exposed drops the suffix (Kris, patches 1, 2)
- restrict kprobe use to "."-suffixed functions; this makes their use
  less likely in the fprobe environment.  Do this instead of creating
  a "fake" fprobe probe with kprobe backing (Kris, patch 2)
- modify fallback logic to handle kprobe/fprobe mix (patch 3)
- modify sched:::on-cpu to use fbt::finish_task_switch:return ; no
  wildcard needed now that probe function name is unsuffixed.

Changes since v1:

- simplified approach by just swapping out probe impl when BTF lookup fails
  (Kris, patch 2)

[1] https://lore.kernel.org/dtrace/20241009140236.883884-1-alan.maguire@oracle.com/

Alan Maguire (4):
  dt_provider_tp: add optional event data, freed on tp free
  fbt: support "."-suffixed functions for kprobes
  fbt: avoid mix of kprobe, fprobe implementations for used probes
  sched: fix on-cpu firing for kernels < 5.16

 libdtrace/dt_prov_fbt.c    | 138 ++++++++++++++++++++++++++++++++-----
 libdtrace/dt_prov_sched.c  |  23 +------
 libdtrace/dt_provider_tp.c |  27 ++++++++
 libdtrace/dt_provider_tp.h |   8 +++
 4 files changed, 158 insertions(+), 38 deletions(-)

-- 
2.43.5


             reply	other threads:[~2024-10-16 15:54 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-16 15:54 Alan Maguire [this message]
2024-10-16 15:54 ` [PATCH v3 dtrace 1/4] dt_provider_tp: add optional event data, freed on tp free Alan Maguire
2024-11-14  5:26   ` [DTrace-devel] " Kris Van Hees
2024-11-14 16:45     ` Alan Maguire
2024-10-16 15:54 ` [PATCH v3 dtrace 2/4] fbt: support "."-suffixed functions for kprobes Alan Maguire
2024-11-28  2:22   ` Kris Van Hees
2024-11-28  9:58     ` Alan Maguire
2024-10-16 15:54 ` [PATCH v3 dtrace 3/4] fbt: avoid mix of kprobe, fprobe implementations for used probes Alan Maguire
2024-10-16 15:54 ` [PATCH v3 dtrace 4/4] sched: fix on-cpu firing for kernels < 5.16 Alan Maguire
2024-10-18 15:41 ` [PATCH v3 dtrace 0/4] kprobe support for .isra.0, sched fix Kris Van Hees
2024-10-18 16:15   ` Alan Maguire
2024-10-18 18:38     ` Kris Van Hees

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241016155409.4038017-1-alan.maguire@oracle.com \
    --to=alan.maguire@oracle.com \
    --cc=dtrace-devel@oss.oracle.com \
    --cc=dtrace@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox