public inbox for dtrace@lists.linux.dev
 help / color / mirror / Atom feed
* [PATCH v3 dtrace 0/4] kprobe support for .isra.0, sched fix
@ 2024-10-16 15:54 Alan Maguire
  2024-10-16 15:54 ` [PATCH v3 dtrace 1/4] dt_provider_tp: add optional event data, freed on tp free Alan Maguire
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Alan Maguire @ 2024-10-16 15:54 UTC (permalink / raw)
  To: dtrace; +Cc: dtrace-devel, Alan Maguire

This series is focused on solving a few issues with fprobe-based
attachment which prevent us being able to attach to functions
like finish_task_switch.isra.0.  Such functions are present in
available_filter_functions, and represent real function boundaries
(since they correspond to the mcount function boundary sites)
but because they either lack BTF representations, or because
those BTF representations are named without the .isra suffix, attach
via fentry/fexit is currently impossible.  Falling back to the
kprobe implementation is the best solution here.

However, for stability, it is best to represent the probes for
these functions without the ".isra" suffix, so we need to store
the full function name (with suffix) in the tracepoint data when
the probe is populated.  Patch 1 supports this.

Patch 2 ensures that we use kprobe implementation for any "."-suffixed
functions.  An additional fbt provider with kprobe implementation is
created to support this (so as not to  disturb existing fprobes for other
functions).  At kprobe attach we use the full function name stored
as tp event data to carry out attach.

Next we need to ensure we do not end up with a mix of kprobes and
fprobes.  Ideally we would do this in a more fine-grained manner, but
for now just ensure we do not have an fprobe/kprobe mix program-wide.
When fprobes are active, we will only use kprobes for "."-suffixed
functions that are used, so in practice such mixes will be relatively
rare.

As Kris pointed out [1] at compilation time, trampolines have not yet been
set up, so we can replace the provider underlying fbt at that time.
The probe_info() callbacks are used to check for a mix of kprobe and
fprobe implementations; we check for multiple fbt providers which
have a count of used probes > 0; if this occurs, switch the fbt provider
using fprobe to use the kprobe implementation and reset any event
ids associated with fprobes from the BTF id used in fprobes to 0.

Finally we can then use fbt::finish_task_switch:return as the
dependent probe for sched:::on-cpu, as we now can probe it even
if it becomes finish_task_switch.isra.0.

So to recap:

Patch 1 supports storing/freeing event data with tp events.
Patch 2 allows tracing of "."-suffixed functions like
finish_task_switch.isra.0 via a kprobe-backed fbt implementation.
Patch 3 ensures we do not end up with a kprobe/fprobe mix.
Patch 4 then uses the fact we can now trace "."-suffixed functions
(with kprobe fallback) by using fbt:vmlinux:finish_task_switch:return
as the kprobe dependent event for sched:::on-cpu  . This function is
often optimized to become finish_task_switch.isra.0.

Tested on upstream, 5.15 and 5.4 kernels.

Changes since v2:

- probe function name exposed drops the suffix (Kris, patches 1, 2)
- restrict kprobe use to "."-suffixed functions; this makes their use
  less likely in the fprobe environment.  Do this instead of creating
  a "fake" fprobe probe with kprobe backing (Kris, patch 2)
- modify fallback logic to handle kprobe/fprobe mix (patch 3)
- modify sched:::on-cpu to use fbt::finish_task_switch:return ; no
  wildcard needed now that probe function name is unsuffixed.

Changes since v1:

- simplified approach by just swapping out probe impl when BTF lookup fails
  (Kris, patch 2)

[1] https://lore.kernel.org/dtrace/20241009140236.883884-1-alan.maguire@oracle.com/

Alan Maguire (4):
  dt_provider_tp: add optional event data, freed on tp free
  fbt: support "."-suffixed functions for kprobes
  fbt: avoid mix of kprobe, fprobe implementations for used probes
  sched: fix on-cpu firing for kernels < 5.16

 libdtrace/dt_prov_fbt.c    | 138 ++++++++++++++++++++++++++++++++-----
 libdtrace/dt_prov_sched.c  |  23 +------
 libdtrace/dt_provider_tp.c |  27 ++++++++
 libdtrace/dt_provider_tp.h |   8 +++
 4 files changed, 158 insertions(+), 38 deletions(-)

-- 
2.43.5


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-11-28  9:58 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-16 15:54 [PATCH v3 dtrace 0/4] kprobe support for .isra.0, sched fix Alan Maguire
2024-10-16 15:54 ` [PATCH v3 dtrace 1/4] dt_provider_tp: add optional event data, freed on tp free Alan Maguire
2024-11-14  5:26   ` [DTrace-devel] " Kris Van Hees
2024-11-14 16:45     ` Alan Maguire
2024-10-16 15:54 ` [PATCH v3 dtrace 2/4] fbt: support "."-suffixed functions for kprobes Alan Maguire
2024-11-28  2:22   ` Kris Van Hees
2024-11-28  9:58     ` Alan Maguire
2024-10-16 15:54 ` [PATCH v3 dtrace 3/4] fbt: avoid mix of kprobe, fprobe implementations for used probes Alan Maguire
2024-10-16 15:54 ` [PATCH v3 dtrace 4/4] sched: fix on-cpu firing for kernels < 5.16 Alan Maguire
2024-10-18 15:41 ` [PATCH v3 dtrace 0/4] kprobe support for .isra.0, sched fix Kris Van Hees
2024-10-18 16:15   ` Alan Maguire
2024-10-18 18:38     ` Kris Van Hees

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox