* [PATCH 2/8] sched: clean up unnecessary includes and functions
@ 2025-03-07 21:34 Kris Van Hees
2025-03-07 21:34 ` [PATCH 3/8] rawfbt: perform lookup on true symbol names Kris Van Hees
` (5 more replies)
0 siblings, 6 replies; 14+ messages in thread
From: Kris Van Hees @ 2025-03-07 21:34 UTC (permalink / raw)
To: dtrace, dtrace-devel; +Cc: Kris Van Hees
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
---
libdtrace/dt_prov_sched.c | 30 ++----------------------------
1 file changed, 2 insertions(+), 28 deletions(-)
diff --git a/libdtrace/dt_prov_sched.c b/libdtrace/dt_prov_sched.c
index e05ef246..125d5891 100644
--- a/libdtrace/dt_prov_sched.c
+++ b/libdtrace/dt_prov_sched.c
@@ -1,6 +1,6 @@
/*
* Oracle Linux DTrace.
- * Copyright (c) 2023, 2024, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 2023, 2025, Oracle and/or its affiliates. All rights reserved.
* Licensed under the Universal Permissive License v 1.0 as shown at
* http://oss.oracle.com/licenses/upl.
*
@@ -9,9 +9,6 @@
#include <assert.h>
#include <errno.h>
-#include <linux/perf_event.h>
-#include <perfmon/pfmlib_perf_event.h>
-
#include "dt_dctx.h"
#include "dt_cg.h"
#include "dt_provider_sdt.h"
@@ -146,36 +143,13 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
return 0;
}
-/*
- * We need a custom enabling for on-cpu probes to ensure that the fbt function
- * __perf_event_task_sched_in is called. __perf_event_task_sched_in will not
- * be called unless context switch perf events have been enabled, so we do that
- * here by opening a context switch count perf event but not attaching anything
- * to it to minimize overhead. The alternative - attaching to
- * cpc:::context_switches-all-1 and weeding out on- versus off-cpu events via a
- * trampoline is too expensive. This approach works stably across kernels
- * because __perf_event_task_sched_in() is not static, so not potentially
- * subject to inlining or other optimizations.
- */
-static void enable(dtrace_hdl_t *dtp, dt_probe_t *prp)
-{
- return dt_sdt_enable(dtp, prp);
-}
-
-static void detach(dtrace_hdl_t *dtp, const dt_probe_t *prp)
-{
- if (prp->prv_data)
- close((int)(long)prp->prv_data);
-}
-
dt_provimpl_t dt_sched = {
.name = prvname,
.prog_type = BPF_PROG_TYPE_UNSPEC,
.populate = &populate,
- .enable = &enable,
+ .enable = &dt_sdt_enable,
.load_prog = &dt_bpf_prog_load,
.trampoline = &trampoline,
.probe_info = &dt_sdt_probe_info,
- .detach = &detach,
.destroy = &dt_sdt_destroy,
};
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/8] rawfbt: perform lookup on true symbol names
2025-03-07 21:34 [PATCH 2/8] sched: clean up unnecessary includes and functions Kris Van Hees
@ 2025-03-07 21:34 ` Kris Van Hees
2025-03-10 22:03 ` Eugene Loh
2025-03-07 21:34 ` [PATCH 4/8] ksyms: make symbol name filters less picky Kris Van Hees
` (4 subsequent siblings)
5 siblings, 1 reply; 14+ messages in thread
From: Kris Van Hees @ 2025-03-07 21:34 UTC (permalink / raw)
To: dtrace, dtrace-devel; +Cc: Kris Van Hees
When encountering a <sym>.<suffix> symbol, a symbol lookup was done for
<sym> instead of <sym>.<suffix> under the assumption that names with .
in them were not listed in kallsyms. But that is not true.
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
---
libdtrace/dt_prov_rawfbt.c | 18 ------------------
1 file changed, 18 deletions(-)
diff --git a/libdtrace/dt_prov_rawfbt.c b/libdtrace/dt_prov_rawfbt.c
index 4c8e8130..62f2f4f0 100644
--- a/libdtrace/dt_prov_rawfbt.c
+++ b/libdtrace/dt_prov_rawfbt.c
@@ -122,27 +122,9 @@ static int populate(dtrace_hdl_t *dtp)
* try to determine the module name.
*/
if (!p) {
- char *q;
-
- /*
- * For synthetic symbol names (those containing '.'),
- * we need to use the base name (before the '.') for
- * module name lookup, because the synthetic forms are
- * not recorded in kallsyms information.
- *
- * We replace the first '.' with a 0 to terminate the
- * string, and after the lookup, we put it back.
- */
- q = strchr(buf, '.');
- if (q != NULL)
- *q = '\0';
-
if (dtrace_lookup_by_name(dtp, DTRACE_OBJ_KMODS, buf,
NULL, &sip) == 0)
mod = sip.object;
-
- if (q != NULL)
- *q = '.';
} else
mod = p;
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/8] ksyms: make symbol name filters less picky
2025-03-07 21:34 [PATCH 2/8] sched: clean up unnecessary includes and functions Kris Van Hees
2025-03-07 21:34 ` [PATCH 3/8] rawfbt: perform lookup on true symbol names Kris Van Hees
@ 2025-03-07 21:34 ` Kris Van Hees
2025-03-10 22:04 ` Eugene Loh
2025-03-07 21:34 ` [PATCH 5/8] symtab: add support for 'traceable' flag Kris Van Hees
` (3 subsequent siblings)
5 siblings, 1 reply; 14+ messages in thread
From: Kris Van Hees @ 2025-03-07 21:34 UTC (permalink / raw)
To: dtrace, dtrace-devel; +Cc: Kris Van Hees
Some symbols were being filtered out even though they represent symbols
that can actually be probed.
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
---
libdtrace/dt_module.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/libdtrace/dt_module.c b/libdtrace/dt_module.c
index dc00aa88..2e915e2f 100644
--- a/libdtrace/dt_module.c
+++ b/libdtrace/dt_module.c
@@ -1215,7 +1215,7 @@ dt_modsym_addsym(dtrace_hdl_t *dtp, dt_module_t *dmp, dt_kallsym_t *sym,
(strstarts(sym->name, "__syscall_meta__")) ||
(strstarts(sym->name, "__p_syscall_meta__")) ||
(strstarts(sym->name, "__event_")) ||
- (strstarts(sym->name, "event_")) ||
+ (strstarts(sym->name, "event_") && sym->type == 'd') ||
(strstarts(sym->name, "ftrace_event_")) ||
(strstarts(sym->name, "types__")) ||
(strstarts(sym->name, "args__")) ||
@@ -1223,7 +1223,6 @@ dt_modsym_addsym(dtrace_hdl_t *dtp, dt_module_t *dmp, dt_kallsym_t *sym,
(strstarts(sym->name, "__tpstrtab_")) ||
(strstarts(sym->name, "__tpstrtab__")) ||
(strstarts(sym->name, "__initcall_")) ||
- (strstarts(sym->name, "__setup_")) ||
(strstarts(sym->name, "__pci_fixup_")))
skip = 1;
#undef strstarts
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 5/8] symtab: add support for 'traceable' flag
2025-03-07 21:34 [PATCH 2/8] sched: clean up unnecessary includes and functions Kris Van Hees
2025-03-07 21:34 ` [PATCH 3/8] rawfbt: perform lookup on true symbol names Kris Van Hees
2025-03-07 21:34 ` [PATCH 4/8] ksyms: make symbol name filters less picky Kris Van Hees
@ 2025-03-07 21:34 ` Kris Van Hees
2025-03-18 5:26 ` [DTrace-devel] " Eugene Loh
2025-03-07 21:34 ` [PATCH 6/8] fbt: performance improvements Kris Van Hees
` (2 subsequent siblings)
5 siblings, 1 reply; 14+ messages in thread
From: Kris Van Hees @ 2025-03-07 21:34 UTC (permalink / raw)
To: dtrace, dtrace-devel; +Cc: Kris Van Hees
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
---
libdtrace/dt_symtab.c | 53 +++++++++++++++++++++++++++++++++++++++----
libdtrace/dt_symtab.h | 6 +++++
2 files changed, 55 insertions(+), 4 deletions(-)
diff --git a/libdtrace/dt_symtab.c b/libdtrace/dt_symtab.c
index db63cc88..4e46f280 100644
--- a/libdtrace/dt_symtab.c
+++ b/libdtrace/dt_symtab.c
@@ -23,9 +23,12 @@
#include <dt_string.h>
#include <unistd.h>
-#define DT_ST_SORTED 0x01 /* Sorted, ready for searching. */
-#define DT_ST_PACKED 0x02 /* Symbol table packed
+#define DT_ST_SORTED 0x01 /* Sorted, ready for searching. */
+#define DT_ST_PACKED 0x02 /* Symbol table packed
* (necessarily sorted too) */
+#define DT_ST_TRACEABLE 0x04 /* Symbols have traceable flag */
+
+#define DT_STB_TRACE 8 /* traceable symbol */
struct dt_symbol {
dt_list_t dts_list; /* list forward/back pointers */
@@ -275,6 +278,12 @@ dt_symbol_by_name(dtrace_hdl_t *dtp, const char *name)
return dt_htab_lookup(dtp->dt_kernsyms, &tmpl);
}
+dt_symbol_t *
+dt_symbol_by_name_next(const dt_symbol_t *symbol)
+{
+ return symbol ? (dt_symbol_t *)symbol->dts_he.next : NULL;
+}
+
/* Find a symbol in a given module. */
dt_symbol_t *
dt_module_symbol_by_name(dtrace_hdl_t *dtp, dt_module_t *dmp, const char *name)
@@ -548,7 +557,7 @@ dt_symbol_name(const dt_symbol_t *symbol)
void
dt_symbol_to_elfsym64(dtrace_hdl_t *dtp, dt_symbol_t *symbol, Elf64_Sym *elf_symp)
{
- elf_symp->st_info = symbol->dts_info;
+ elf_symp->st_info = symbol->dts_info & ~GELF_ST_INFO(DT_STB_TRACE, 0);
elf_symp->st_value = symbol->dts_addr;
elf_symp->st_size = symbol->dts_size;
elf_symp->st_shndx = 1; /* 'not SHN_UNDEF' is all we guarantee */
@@ -557,7 +566,7 @@ dt_symbol_to_elfsym64(dtrace_hdl_t *dtp, dt_symbol_t *symbol, Elf64_Sym *elf_sym
void
dt_symbol_to_elfsym32(dtrace_hdl_t *dtp, dt_symbol_t *symbol, Elf32_Sym *elf_symp)
{
- elf_symp->st_info = symbol->dts_info;
+ elf_symp->st_info = symbol->dts_info & ~GELF_ST_INFO(DT_STB_TRACE, 0);
elf_symp->st_value = symbol->dts_addr;
elf_symp->st_size = symbol->dts_size;
elf_symp->st_shndx = 1; /* 'not SHN_UNDEF' is all we guarantee */
@@ -581,3 +590,39 @@ dt_symbol_module(dt_symbol_t *symbol)
{
return symbol->dts_dmp;
}
+
+/*
+ * Mark the symtab annotated with traceable flags on symbols.
+ */
+void
+dt_symtab_set_traceable(dt_symtab_t *symtab)
+{
+ symtab->dtst_flags |= DT_ST_TRACEABLE;
+}
+
+/*
+ * Return whether symbols have traceable flags.
+ */
+int
+dt_symtab_traceable(const dt_symtab_t *symtab)
+{
+ return symtab->dtst_flags & DT_ST_TRACEABLE;
+}
+
+/*
+ * Mark a symbol as traceable.
+ */
+void
+dt_symbol_set_traceable(dt_symbol_t *symbol)
+{
+ symbol->dts_info |= GELF_ST_INFO(DT_STB_TRACE, 0);
+}
+
+/*
+ * Return true if the symbol can be traced.
+ */
+int
+dt_symbol_traceable(const dt_symbol_t *symbol)
+{
+ return GELF_ST_BIND(symbol->dts_info) & DT_STB_TRACE;
+}
diff --git a/libdtrace/dt_symtab.h b/libdtrace/dt_symtab.h
index 8d396c46..9ee60c38 100644
--- a/libdtrace/dt_symtab.h
+++ b/libdtrace/dt_symtab.h
@@ -39,6 +39,7 @@ extern dt_symbol_t *dt_symbol_insert(dtrace_hdl_t *dtp, dt_symtab_t *symtab,
struct dt_module *dmp, const char *name, GElf_Addr addr, GElf_Xword size,
unsigned char info);
extern dt_symbol_t *dt_symbol_by_name(dtrace_hdl_t *dtp, const char *name);
+extern dt_symbol_t *dt_symbol_by_name_next(const dt_symbol_t *symbol);
extern dt_symbol_t *dt_module_symbol_by_name(dtrace_hdl_t *dtp,
struct dt_module *dmp, const char *name);
extern dt_symbol_t *dt_symbol_by_addr(dt_symtab_t *symtab, GElf_Addr dts_addr);
@@ -51,6 +52,11 @@ extern void dt_symbol_to_elfsym(dtrace_hdl_t *dtp, dt_symbol_t *symbol,
GElf_Sym *elf_symp);
extern struct dt_module *dt_symbol_module(dt_symbol_t *symbol);
+extern void dt_symtab_set_traceable(dt_symtab_t *symtab);
+extern int dt_symtab_traceable(const dt_symtab_t *symtab);
+extern void dt_symbol_set_traceable(dt_symbol_t *symbol);
+extern int dt_symbol_traceable(const dt_symbol_t *symbol);
+
#ifdef __cplusplus
}
#endif
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 6/8] fbt: performance improvements
2025-03-07 21:34 [PATCH 2/8] sched: clean up unnecessary includes and functions Kris Van Hees
` (2 preceding siblings ...)
2025-03-07 21:34 ` [PATCH 5/8] symtab: add support for 'traceable' flag Kris Van Hees
@ 2025-03-07 21:34 ` Kris Van Hees
2025-03-12 5:17 ` Eugene Loh
2025-03-07 21:34 ` [PATCH 7/8] rawfbt: " Kris Van Hees
2025-03-07 21:34 ` [PATCH 8/8] fbt, rawfbt: consolidate code to avoid duplication Kris Van Hees
5 siblings, 1 reply; 14+ messages in thread
From: Kris Van Hees @ 2025-03-07 21:34 UTC (permalink / raw)
To: dtrace, dtrace-devel; +Cc: Kris Van Hees
Up until now, FBT probes were registered for every symbol that was
listed as traceable. Most tracing session do not use most or even
any of these, and the process of registering them all was quite
slow.
Going forward, FBT probes are registered on demand.
If any FBT probes are to be registered, the first will incur the
cost of reading the entire list of traceable symbols. Any further
FBT probe registration will be able to be satisfied based on that
initial processing. The performance improvement is therefore quite
significant for tracing sessions that do not trigger any FBT probe
registration, and if FBT probes are used, the improvement is still
quite noticable because only the probes that are actually needed
get registered.
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
---
libdtrace/dt_module.c | 78 +++++++++++++++
libdtrace/dt_module.h | 2 +
libdtrace/dt_prov_fbt.c | 217 +++++++++++++++++++++++++++-------------
3 files changed, 228 insertions(+), 69 deletions(-)
diff --git a/libdtrace/dt_module.c b/libdtrace/dt_module.c
index 2e915e2f..e7553a07 100644
--- a/libdtrace/dt_module.c
+++ b/libdtrace/dt_module.c
@@ -22,6 +22,7 @@
#include <port.h>
#include <zlib.h>
+#include <tracefs.h>
#include <dt_kernel_module.h>
#include <dt_module.h>
@@ -1044,6 +1045,83 @@ dt_kern_module_find_ctf(dtrace_hdl_t *dtp, dt_module_t *dmp)
}
}
+#define PROBE_LIST TRACEFS "available_filter_functions"
+
+/*
+ * Determine which kernel functions are traceable and mark them.
+ */
+void
+dt_modsym_mark_traceable(dtrace_hdl_t *dtp)
+{
+ FILE *f;
+ char *buf = NULL;
+ size_t len = 0;
+
+ if (dt_symtab_traceable(dtp->dt_exec->dm_kernsyms))
+ return;
+
+ f = fopen(PROBE_LIST, "r");
+ if (f == NULL)
+ return;
+
+ while (getline(&buf, &len, f) >= 0) {
+ char *p;
+ dt_symbol_t *sym = NULL;
+
+ /*
+ * Here buf is either "funcname\n" or "funcname [modname]\n".
+ * The last line may not have a linefeed.
+ */
+ p = strchr(buf, '\n');
+ if (p) {
+ *p = '\0';
+ if (p > buf && *(--p) == ']')
+ *p = '\0';
+ }
+
+ /*
+ * Now buf is either "funcname" or "funcname [modname". If
+ * there is no module name provided, we will use the default.
+ */
+ p = strchr(buf, ' ');
+ if (p) {
+ *p++ = '\0';
+ if (*p == '[')
+ p++;
+ }
+
+#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0)
+ /* Weed out __ftrace_invalid_address___* entries. */
+ if (strstarts(buf, "__ftrace_invalid_address__") ||
+ strstarts(buf, "__probestub_") ||
+ strstarts(buf, "__traceiter_"))
+ continue;
+#undef strstarts
+
+ /*
+ * If we have a module name, look for the symbol in that
+ * module.
+ * If not, perform a general symbol lookup to find its first
+ * instance.
+ */
+ if (p) {
+ dt_module_t *dmp = dt_module_lookup_by_name(dtp, p);
+
+ if (dmp)
+ sym = dt_module_symbol_by_name(dtp, dmp, buf);
+ } else
+ sym = dt_symbol_by_name(dtp, buf);
+
+ if (sym)
+ dt_symbol_set_traceable(sym);
+ }
+
+ free(buf);
+ fclose(f);
+
+ dt_symtab_set_traceable(dtp->dt_exec->dm_kernsyms);
+}
+
/*
* Symbol data can be collected in three ways:
* - kallmodsyms
diff --git a/libdtrace/dt_module.h b/libdtrace/dt_module.h
index 56df17a6..dd3ad17c 100644
--- a/libdtrace/dt_module.h
+++ b/libdtrace/dt_module.h
@@ -25,6 +25,8 @@ extern dt_ident_t *dt_module_extern(dtrace_hdl_t *, dt_module_t *,
extern const char *dt_module_modelname(dt_module_t *);
+extern void dt_modsym_mark_traceable(dtrace_hdl_t *);
+
#ifdef __cplusplus
}
#endif
diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c
index eef93879..d837e14d 100644
--- a/libdtrace/dt_prov_fbt.c
+++ b/libdtrace/dt_prov_fbt.c
@@ -41,10 +41,8 @@
#include "dt_pt_regs.h"
static const char prvname[] = "fbt";
-static const char modname[] = "vmlinux";
#define KPROBE_EVENTS TRACEFS "kprobe_events"
-#define PROBE_LIST TRACEFS "available_filter_functions"
#define FBT_GROUP_FMT GROUP_FMT "_%s"
#define FBT_GROUP_DATA GROUP_DATA, prp->desc->prb
@@ -61,19 +59,11 @@ dt_provimpl_t dt_fbt_fprobe;
dt_provimpl_t dt_fbt_kprobe;
/*
- * Scan the PROBE_LIST file and add entry and return probes for every function
- * that is listed.
+ * Create the fbt provider.
*/
static int populate(dtrace_hdl_t *dtp)
{
dt_provider_t *prv;
- FILE *f;
- char *buf = NULL;
- char *p;
- const char *mod = modname;
- size_t n;
- dtrace_syminfo_t sip;
- dtrace_probedesc_t pd;
dt_fbt = BPF_HAS(dtp, BPF_FEAT_FENTRY) ? dt_fbt_fprobe : dt_fbt_kprobe;
@@ -81,79 +71,166 @@ static int populate(dtrace_hdl_t *dtp)
if (prv == NULL)
return -1; /* errno already set */
- f = fopen(PROBE_LIST, "r");
- if (f == NULL)
+ return 0;
+}
+
+/* Create a probe (if it does not exist yet). */
+static int provide_probe(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp)
+{
+ dt_provider_t *prv = dt_provider_lookup(dtp, pdp->prv);
+
+ if (prv == NULL)
+ return 0;
+ if (dt_probe_lookup(dtp, pdp) != NULL)
return 0;
+ if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb))
+ return 1;
- while (getline(&buf, &n, f) >= 0) {
- /*
- * Here buf is either "funcname\n" or "funcname [modname]\n".
- * The last line may not have a linefeed.
- */
- p = strchr(buf, '\n');
- if (p) {
- *p = '\0';
- if (p > buf && *(--p) == ']')
- *p = '\0';
+ return 0;
+}
+
+/*
+ * Try to provide probes for the given probe description. The caller ensures
+ * that the provider name in probe desxcription (if any) is a match for this
+ * provider. When this is called, we already know that this provider matches
+ * the provider component of the probe specification.
+ */
+#define FBT_ENTRY 1
+#define FBT_RETURN 2
+
+static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp)
+{
+ int n = 0;
+ int prb = 0;
+ dt_module_t *dmp = NULL;
+ dt_symbol_t *sym = NULL;
+ dt_htab_next_t *it = NULL;
+ dtrace_probedesc_t pd;
+
+ dt_modsym_mark_traceable(dtp);
+
+ /*
+ * Nothing to do if a probe name is specified and cannot match 'entry'
+ * or 'return'.
+ */
+ if (dt_gmatch("entry", pdp->prb))
+ prb |= FBT_ENTRY;
+ if (dt_gmatch("return", pdp->prb))
+ prb |= FBT_RETURN;
+ if (prb == 0)
+ return 0;
+
+ /* Synthetic function names are not supported for FBT. */
+ if (strchr(pdp->fun, '.'))
+ return 0;
+
+ /*
+ * If we have an explicit module name, check it. If not found, we can
+ * ignore this request.
+ */
+ if (pdp->mod[0] != '\0' && strchr(pdp->mod, '*') == NULL) {
+ dmp = dt_module_lookup_by_name(dtp, pdp->mod);
+ if (dmp == NULL)
+ return 0;
+ }
+
+ /*
+ * If we have an explicit function name, we start with a basic symbol
+ * name lookup.
+ */
+ if (pdp->fun[0] != '\0' && strchr(pdp->fun, '*') == NULL) {
+ /* If we have a module, use it. */
+ if (dmp != NULL) {
+ sym = dt_module_symbol_by_name(dtp, dmp, pdp->fun);
+ if (sym == NULL)
+ return 0;
+ if (!dt_symbol_traceable(sym))
+ return 0;
+
+ pd.id = DTRACE_IDNONE;
+ pd.prv = pdp->prv;
+ pd.mod = dmp->dm_name;
+ pd.fun = pdp->fun;
+
+ if (prb & FBT_ENTRY) {
+ pd.prb = "entry";
+ n += provide_probe(dtp, &pd);
+ }
+ if (prb & FBT_RETURN) {
+ pd.prb = "return";
+ n += provide_probe(dtp, &pd);
+ }
+
+ return n;
}
- /*
- * Now buf is either "funcname" or "funcname [modname". If
- * there is no module name provided, we will use the default.
- */
- p = strchr(buf, ' ');
- if (p) {
- *p++ = '\0';
- if (*p == '[')
- p++;
+ sym = dt_symbol_by_name(dtp, pdp->fun);
+ while (sym != NULL) {
+ const char *mod = dt_symbol_module(sym)->dm_name;
+
+ if (dt_symbol_traceable(sym) &&
+ dt_gmatch(mod, pdp->mod)) {
+ pd.id = DTRACE_IDNONE;
+ pd.prv = pdp->prv;
+ pd.mod = mod;
+ pd.fun = pdp->fun;
+
+ if (prb & FBT_ENTRY) {
+ pd.prb = "entry";
+ n += provide_probe(dtp, &pd);
+ }
+ if (prb & FBT_RETURN) {
+ pd.prb = "return";
+ n += provide_probe(dtp, &pd);
+ }
+
+ }
+ sym = dt_symbol_by_name_next(sym);
}
- /* Weed out synthetic symbol names (that are invalid). */
- if (strchr(buf, '.') != NULL)
+ return n;
+ }
+
+ /*
+ * No explicit function name. We need to go through all possible
+ * symbol names and see if they match.
+ */
+ while ((sym = dt_htab_next(dtp->dt_kernsyms, &it)) != NULL) {
+ dt_module_t *smp;
+ const char *fun;
+
+ /* Ensure the symbol can be traced. */
+ if (!dt_symbol_traceable(sym))
continue;
-#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0)
- /* Weed out __ftrace_invalid_address___* entries. */
- if (strstarts(buf, "__ftrace_invalid_address__") ||
- strstarts(buf, "__probestub_") ||
- strstarts(buf, "__traceiter_"))
+ /* Match the function name. */
+ fun = dt_symbol_name(sym);
+ if (!dt_gmatch(fun, pdp->fun))
continue;
-#undef strstarts
- /*
- * If we did not see a module name, perform a symbol lookup to
- * try to determine the module name.
- */
- if (!p) {
- if (dtrace_lookup_by_name(dtp, DTRACE_OBJ_KMODS, buf,
- NULL, &sip) == 0)
- mod = sip.object;
- } else
- mod = p;
+ /* Validate the module name. */
+ smp = dt_symbol_module(sym);
+ if (dmp) {
+ if (smp != dmp)
+ continue;
+ } else if (!dt_gmatch(smp->dm_name, pdp->mod))
+ continue;
- /*
- * Due to the lack of module names in
- * TRACEFS/available_filter_functions, there are some duplicate
- * function names. We need to make sure that we do not create
- * duplicate probes for these.
- */
pd.id = DTRACE_IDNONE;
- pd.prv = prvname;
- pd.mod = mod;
- pd.fun = buf;
- pd.prb = "entry";
- if (dt_probe_lookup(dtp, &pd) != NULL)
- continue;
+ pd.prv = pdp->prv;
+ pd.mod = smp->dm_name;
+ pd.fun = fun;
- if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "entry"))
- n++;
- if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "return"))
- n++;
+ if (prb & FBT_ENTRY) {
+ pd.prb = "entry";
+ n += provide_probe(dtp, &pd);
+ }
+ if (prb & FBT_RETURN) {
+ pd.prb = "return";
+ n += provide_probe(dtp, &pd);
+ }
}
- free(buf);
- fclose(f);
-
return n;
}
@@ -447,6 +524,7 @@ dt_provimpl_t dt_fbt_fprobe = {
.prog_type = BPF_PROG_TYPE_TRACING,
.stack_skip = 4,
.populate = &populate,
+ .provide = &provide,
.load_prog = &fprobe_prog_load,
.trampoline = &fprobe_trampoline,
.attach = &dt_tp_probe_attach_raw,
@@ -459,6 +537,7 @@ dt_provimpl_t dt_fbt_kprobe = {
.name = prvname,
.prog_type = BPF_PROG_TYPE_KPROBE,
.populate = &populate,
+ .provide = &provide,
.load_prog = &dt_bpf_prog_load,
.trampoline = &kprobe_trampoline,
.attach = &kprobe_attach,
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 7/8] rawfbt: performance improvements
2025-03-07 21:34 [PATCH 2/8] sched: clean up unnecessary includes and functions Kris Van Hees
` (3 preceding siblings ...)
2025-03-07 21:34 ` [PATCH 6/8] fbt: performance improvements Kris Van Hees
@ 2025-03-07 21:34 ` Kris Van Hees
2025-03-07 21:34 ` [PATCH 8/8] fbt, rawfbt: consolidate code to avoid duplication Kris Van Hees
5 siblings, 0 replies; 14+ messages in thread
From: Kris Van Hees @ 2025-03-07 21:34 UTC (permalink / raw)
To: dtrace, dtrace-devel; +Cc: Kris Van Hees
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
---
libdtrace/dt_prov_rawfbt.c | 223 +++++++++++++++++++++++++------------
1 file changed, 151 insertions(+), 72 deletions(-)
diff --git a/libdtrace/dt_prov_rawfbt.c b/libdtrace/dt_prov_rawfbt.c
index 62f2f4f0..52152655 100644
--- a/libdtrace/dt_prov_rawfbt.c
+++ b/libdtrace/dt_prov_rawfbt.c
@@ -44,10 +44,8 @@
#include "dt_pt_regs.h"
static const char prvname[] = "rawfbt";
-static const char modname[] = "vmlinux";
#define KPROBE_EVENTS TRACEFS "kprobe_events"
-#define PROBE_LIST TRACEFS "available_filter_functions"
#define FBT_GROUP_FMT GROUP_FMT "_%s"
#define FBT_GROUP_DATA GROUP_DATA, prp->desc->prb
@@ -61,98 +59,178 @@ static const dtrace_pattr_t pattr = {
};
/*
- * Scan the PROBE_LIST file and add entry and return probes for every function
- * that is listed.
+ * Create the rawfbt provider.
*/
static int populate(dtrace_hdl_t *dtp)
{
dt_provider_t *prv;
- FILE *f;
- char *buf = NULL;
- size_t len = 0;
- size_t n = 0;
- dtrace_syminfo_t sip;
- dtrace_probedesc_t pd;
prv = dt_provider_create(dtp, prvname, &dt_rawfbt, &pattr, NULL);
if (prv == NULL)
return -1; /* errno already set */
- f = fopen(PROBE_LIST, "r");
- if (f == NULL)
+ return 0;
+}
+
+/* Create a probe (if it does not exist yet). */
+static int provide_probe(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp)
+{
+ dt_provider_t *prv = dt_provider_lookup(dtp, pdp->prv);
+
+ if (prv == NULL)
return 0;
+ if (dt_probe_lookup(dtp, pdp) != NULL)
+ return 0;
+#ifdef DEBUG_FBT
+ if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) {
+ fprintf(stderr, "%s(..., PROVIDE %s:%s:%s:%s) - ...\n", __func__, pdp->prv, pdp->mod, pdp->fun, pdp->prb);
+ return 1;
+ }
+#else
+ if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb))
+ return 1;
+#endif
- while (getline(&buf, &len, f) >= 0) {
- char *p;
- const char *mod = modname;
- dt_probe_t *prp;
+ return 0;
+}
- /*
- * Here buf is either "funcname\n" or "funcname [modname]\n".
- * The last line may not have a linefeed.
- */
- p = strchr(buf, '\n');
- if (p) {
- *p = '\0';
- if (p > buf && *(--p) == ']')
- *p = '\0';
+/*
+ * Try to provide probes for the given probe description. The caller ensures
+ * that the provider name in probe desxcription (if any) is a match for this
+ * provider. When this is called, we already know that this provider matches
+ * the provider component of the probe specification.
+ */
+#define FBT_ENTRY 1
+#define FBT_RETURN 2
+
+static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp)
+{
+ int n = 0;
+ int prb = 0;
+ dt_module_t *dmp = NULL;
+ dt_symbol_t *sym = NULL;
+ dt_htab_next_t *it = NULL;
+ dtrace_probedesc_t pd;
+
+ dt_modsym_mark_traceable(dtp);
+
+ /*
+ * Nothing to do if a probe name is specified and cannot match 'entry'
+ * or 'return'.
+ */
+ if (dt_gmatch("entry", pdp->prb))
+ prb |= FBT_ENTRY;
+ if (dt_gmatch("return", pdp->prb))
+ prb |= FBT_RETURN;
+ if (prb == 0)
+ return 0;
+
+ /*
+ * If we have an explicit module name, check it. If not found, we can
+ * ignore this request.
+ */
+ if (pdp->mod[0] != '\0' && strchr(pdp->mod, '*') == NULL) {
+ dmp = dt_module_lookup_by_name(dtp, pdp->mod);
+ if (dmp == NULL)
+ return 0;
+ }
+
+ /*
+ * If we have an explicit function name, we start with a basic symbol
+ * name lookup.
+ */
+ if (pdp->fun[0] != '\0' && strchr(pdp->fun, '*') == NULL) {
+ /* If we have a module, use it. */
+ if (dmp != NULL) {
+ sym = dt_module_symbol_by_name(dtp, dmp, pdp->fun);
+ if (sym == NULL)
+ return 0;
+ if (!dt_symbol_traceable(sym))
+ return 0;
+
+ pd.id = DTRACE_IDNONE;
+ pd.prv = pdp->prv;
+ pd.mod = dmp->dm_name;
+ pd.fun = pdp->fun;
+
+ if (prb & FBT_ENTRY) {
+ pd.prb = "entry";
+ n += provide_probe(dtp, &pd);
+ }
+ if (prb & FBT_RETURN) {
+ pd.prb = "return";
+ n += provide_probe(dtp, &pd);
+ }
+
+ return n;
}
- /*
- * Now buf is either "funcname" or "funcname [modname". If
- * there is no module name provided, we will use the default.
- */
- p = strchr(buf, ' ');
- if (p) {
- *p++ = '\0';
- if (*p == '[')
- p++;
+ sym = dt_symbol_by_name(dtp, pdp->fun);
+ while (sym != NULL) {
+ const char *mod = dt_symbol_module(sym)->dm_name;
+
+ if (dt_symbol_traceable(sym) &&
+ dt_gmatch(mod, pdp->mod)) {
+ pd.id = DTRACE_IDNONE;
+ pd.prv = pdp->prv;
+ pd.mod = mod;
+ pd.fun = pdp->fun;
+
+ if (prb & FBT_ENTRY) {
+ pd.prb = "entry";
+ n += provide_probe(dtp, &pd);
+ }
+ if (prb & FBT_RETURN) {
+ pd.prb = "return";
+ n += provide_probe(dtp, &pd);
+ }
+
+ }
+ sym = dt_symbol_by_name_next(sym);
}
-#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0)
- /* Weed out __ftrace_invalid_address___* entries. */
- if (strstarts(buf, "__ftrace_invalid_address__") ||
- strstarts(buf, "__probestub_") ||
- strstarts(buf, "__traceiter_"))
+ return n;
+ }
+
+ /*
+ * No explicit function name. We need to go through all possible
+ * symbol names and see if they match.
+ */
+ while ((sym = dt_htab_next(dtp->dt_kernsyms, &it)) != NULL) {
+ dt_module_t *smp;
+ const char *fun;
+
+ /* Ensure the symbol can be traced. */
+ if (!dt_symbol_traceable(sym))
continue;
-#undef strstarts
- /*
- * If we did not see a module name, perform a symbol lookup to
- * try to determine the module name.
- */
- if (!p) {
- if (dtrace_lookup_by_name(dtp, DTRACE_OBJ_KMODS, buf,
- NULL, &sip) == 0)
- mod = sip.object;
- } else
- mod = p;
+ /* Match the function name. */
+ fun = dt_symbol_name(sym);
+ if (!dt_gmatch(fun, pdp->fun))
+ continue;
- /*
- * Due to the lack of module names in
- * TRACEFS/available_filter_functions, there are some duplicate
- * function names. The kernel does not let us trace functions
- * that have duplicates, so we need to remove the existing one.
- */
- pd.id = DTRACE_IDNONE;
- pd.prv = prvname;
- pd.mod = mod;
- pd.fun = buf;
- pd.prb = "entry";
- prp = dt_probe_lookup(dtp, &pd);
- if (prp != NULL) {
- dt_probe_destroy(prp);
+ /* Validate the module name. */
+ smp = dt_symbol_module(sym);
+ if (dmp) {
+ if (smp != dmp)
+ continue;
+ } else if (!dt_gmatch(smp->dm_name, pdp->mod))
continue;
- }
- if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "entry"))
- n++;
- if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "return"))
- n++;
- }
+ pd.id = DTRACE_IDNONE;
+ pd.prv = pdp->prv;
+ pd.mod = smp->dm_name;
+ pd.fun = fun;
- free(buf);
- fclose(f);
+ if (prb & FBT_ENTRY) {
+ pd.prb = "entry";
+ n += provide_probe(dtp, &pd);
+ }
+ if (prb & FBT_RETURN) {
+ pd.prb = "return";
+ n += provide_probe(dtp, &pd);
+ }
+ }
return n;
}
@@ -306,6 +384,7 @@ dt_provimpl_t dt_rawfbt = {
.name = prvname,
.prog_type = BPF_PROG_TYPE_KPROBE,
.populate = &populate,
+ .provide = &provide,
.load_prog = &dt_bpf_prog_load,
.trampoline = &trampoline,
.attach = &attach,
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 8/8] fbt, rawfbt: consolidate code to avoid duplication
2025-03-07 21:34 [PATCH 2/8] sched: clean up unnecessary includes and functions Kris Van Hees
` (4 preceding siblings ...)
2025-03-07 21:34 ` [PATCH 7/8] rawfbt: " Kris Van Hees
@ 2025-03-07 21:34 ` Kris Van Hees
5 siblings, 0 replies; 14+ messages in thread
From: Kris Van Hees @ 2025-03-07 21:34 UTC (permalink / raw)
To: dtrace, dtrace-devel; +Cc: Kris Van Hees
After optimizing both fbt and rawfbt providers, the resulting code has
a significant amount of duplication. The rawfbt provider can now be
defined in terms of the kprobe-based fbt provider functions.
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
---
libdtrace/Build | 1 -
libdtrace/dt_prov_fbt.c | 131 +++++++++----
libdtrace/dt_prov_rawfbt.c | 393 -------------------------------------
3 files changed, 96 insertions(+), 429 deletions(-)
delete mode 100644 libdtrace/dt_prov_rawfbt.c
diff --git a/libdtrace/Build b/libdtrace/Build
index 51e0f078..7e6e8a38 100644
--- a/libdtrace/Build
+++ b/libdtrace/Build
@@ -55,7 +55,6 @@ libdtrace-build_SOURCES = dt_aggregate.c \
dt_prov_lockstat.c \
dt_prov_proc.c \
dt_prov_profile.c \
- dt_prov_rawfbt.c \
dt_prov_rawtp.c \
dt_prov_sched.c \
dt_prov_sdt.c \
diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c
index d837e14d..93ed270e 100644
--- a/libdtrace/dt_prov_fbt.c
+++ b/libdtrace/dt_prov_fbt.c
@@ -6,17 +6,26 @@
*
* The Function Boundary Tracing (FBT) provider for DTrace.
*
- * FBT probes are exposed by the kernel as kprobes. They are listed in the
- * TRACEFS/available_filter_functions file. Some kprobes are associated with
- * a specific kernel module, while most are in the core kernel.
+ * Kernnel functions can be traced through fentry/fexit probes (when available)
+ * and kprobes. The FBT provider supports both implementations and will use
+ * fentry/fexit probes if the kernel supports them, and fallback to kprobes
+ * otherwise. The FBT provider does not support tracing synthetic functions
+ * (i.e. compiler-generated functions with a . in their name).
+ *
+ * The rawfbt provider implements a variant of the FBT provider and always uses
+ * kprobes. This provider allow tracing of synthetic function.
*
* Mapping from event name to DTrace probe name:
*
* <name> fbt:vmlinux:<name>:entry
* fbt:vmlinux:<name>:return
+ * rawfbt:vmlinux:<name>:entry
+ * rawfbt:vmlinux:<name>:return
* or
* <name> [<modname>] fbt:<modname>:<name>:entry
* fbt:<modname>:<name>:return
+ * rawfbt:<modname>:<name>:entry
+ * rawfbt:<modname>:<name>:return
*/
#include <assert.h>
#include <errno.h>
@@ -57,18 +66,19 @@ static const dtrace_pattr_t pattr = {
dt_provimpl_t dt_fbt_fprobe;
dt_provimpl_t dt_fbt_kprobe;
+dt_provimpl_t dt_rawfbt;
/*
- * Create the fbt provider.
+ * Create the fbt and rawfbt providers.
*/
static int populate(dtrace_hdl_t *dtp)
{
- dt_provider_t *prv;
-
dt_fbt = BPF_HAS(dtp, BPF_FEAT_FENTRY) ? dt_fbt_fprobe : dt_fbt_kprobe;
- prv = dt_provider_create(dtp, prvname, &dt_fbt, &pattr, NULL);
- if (prv == NULL)
+ if (dt_provider_create(dtp, dt_fbt.name, &dt_fbt, &pattr,
+ NULL) == NULL ||
+ dt_provider_create(dtp, dt_rawfbt.name, &dt_rawfbt, &pattr,
+ NULL) == NULL)
return -1; /* errno already set */
return 0;
@@ -107,8 +117,6 @@ static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp)
dt_htab_next_t *it = NULL;
dtrace_probedesc_t pd;
- dt_modsym_mark_traceable(dtp);
-
/*
* Nothing to do if a probe name is specified and cannot match 'entry'
* or 'return'.
@@ -120,8 +128,11 @@ static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp)
if (prb == 0)
return 0;
- /* Synthetic function names are not supported for FBT. */
- if (strchr(pdp->fun, '.'))
+ /*
+ * Unless we are dealing with a rawfbt probe, synthetic functions are
+ * not supported.
+ */
+ if (strcmp(pdp->prv, dt_rawfbt.name) != 0 && strchr(pdp->fun, '.'))
return 0;
/*
@@ -134,6 +145,14 @@ static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp)
return 0;
}
+ /*
+ * Ensure that kernel symbols that are FBT-traceable are marked as
+ * such. We don't do this earlier in this function so that the
+ * preceding tests have the greatest opportunity to avoid doing this
+ * unnecessarily.
+ */
+ dt_modsym_mark_traceable(dtp);
+
/*
* If we have an explicit function name, we start with a basic symbol
* name lookup.
@@ -396,12 +415,12 @@ static int fprobe_prog_load(dtrace_hdl_t *dtp, const dt_probe_t *prp,
\*******************************/
/*
- * Generate a BPF trampoline for a FBT probe.
+ * Generate a BPF trampoline for a FBT (or rawfbt) probe.
*
* The trampoline function is called when a FBT probe triggers, and it must
* satisfy the following prototype:
*
- * int dt_fbt(dt_pt_regs *regs)
+ * int dt_(raw)fbt(dt_pt_regs *regs)
*
* The trampoline will populate a dt_dctx_t struct and then call the function
* that implements the compiled D clause. It returns 0 to the caller.
@@ -422,7 +441,7 @@ static int kprobe_trampoline(dt_pcb_t *pcb, uint_t exitlbl)
dt_cg_tramp_copy_rval_from_regs(pcb);
/*
- * fbt:::return arg0 should be the function offset for
+ * (raw)fbt:::return arg0 should be the function offset for
* return instruction. Since we use kretprobes, however,
* which do not fire until the function has returned to
* its caller, information about the returning instruction
@@ -441,11 +460,28 @@ static int kprobe_trampoline(dt_pcb_t *pcb, uint_t exitlbl)
static int kprobe_attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd)
{
+ const char *fun = prp->desc->fun;
+ char *tpn = (char *)fun;
+ int rc = -1;
+
if (!dt_tp_probe_has_info(prp)) {
char *fn;
FILE *f;
- size_t len;
- int fd, rc = -1;
+ int fd;
+
+ /*
+ * For rawfbt probes, we need to apply a . -> _ conversion to
+ * ensure the tracepoint name is valid.
+ */
+ if (strcmp(prp->desc->prv, dt_rawfbt.name) == 0) {
+ char *p;
+
+ tpn = strdup(fun);
+ for (p = tpn; *p; p++) {
+ if (*p == '.')
+ *p = '_';
+ }
+ }
/*
* Register the kprobe with the tracing subsystem. This will
@@ -453,41 +489,42 @@ static int kprobe_attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd)
*/
fd = open(KPROBE_EVENTS, O_WRONLY | O_APPEND);
if (fd == -1)
- return -ENOENT;
+ goto out;
rc = dprintf(fd, "%c:" FBT_GROUP_FMT "/%s %s\n",
prp->desc->prb[0] == 'e' ? 'p' : 'r',
- FBT_GROUP_DATA, prp->desc->fun, prp->desc->fun);
+ FBT_GROUP_DATA, tpn, fun);
close(fd);
if (rc == -1)
- return -ENOENT;
+ goto out;
/* create format file name */
- len = snprintf(NULL, 0, "%s" FBT_GROUP_FMT "/%s/format",
- EVENTSFS, FBT_GROUP_DATA, prp->desc->fun) + 1;
- fn = dt_alloc(dtp, len);
- if (fn == NULL)
- return -ENOENT;
-
- snprintf(fn, len, "%s" FBT_GROUP_FMT "/%s/format", EVENTSFS,
- FBT_GROUP_DATA, prp->desc->fun);
+ if (asprintf(&fn, "%s" FBT_GROUP_FMT "/%s/format", EVENTSFS,
+ FBT_GROUP_DATA, tpn) == -1)
+ goto out;
/* open format file */
f = fopen(fn, "r");
- dt_free(dtp, fn);
+ free(fn);
if (f == NULL)
- return -ENOENT;
+ goto out;
/* read event id from format file */
rc = dt_tp_probe_info(dtp, f, 0, prp, NULL, NULL);
fclose(f);
if (rc < 0)
- return -ENOENT;
+ goto out;
}
/* attach BPF program to the probe */
- return dt_tp_probe_attach(dtp, prp, bpf_fd);
+ rc = dt_tp_probe_attach(dtp, prp, bpf_fd);
+
+out:
+ if (tpn != prp->desc->fun)
+ free(tpn);
+
+ return rc == -1 ? -ENOENT : rc;
}
/*
@@ -503,7 +540,8 @@ static int kprobe_attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd)
*/
static void kprobe_detach(dtrace_hdl_t *dtp, const dt_probe_t *prp)
{
- int fd;
+ int fd;
+ char *tpn = (char *)prp->desc->fun;
if (!dt_tp_probe_has_info(prp))
return;
@@ -514,9 +552,20 @@ static void kprobe_detach(dtrace_hdl_t *dtp, const dt_probe_t *prp)
if (fd == -1)
return;
- dprintf(fd, "-:" FBT_GROUP_FMT "/%s\n", FBT_GROUP_DATA,
- prp->desc->fun);
+ if (strcmp(prp->desc->prv, dt_rawfbt.name) == 0) {
+ char *p;
+
+ for (p = tpn; *p; p++) {
+ if (*p == '.')
+ *p = '_';
+ }
+ }
+
+ dprintf(fd, "-:" FBT_GROUP_FMT "/%s\n", FBT_GROUP_DATA, tpn);
close(fd);
+
+ if (tpn != prp->desc->fun)
+ free(tpn);
}
dt_provimpl_t dt_fbt_fprobe = {
@@ -549,3 +598,15 @@ dt_provimpl_t dt_fbt = {
.name = prvname,
.populate = &populate,
};
+
+dt_provimpl_t dt_rawfbt = {
+ .name = "rawfbt",
+ .prog_type = BPF_PROG_TYPE_KPROBE,
+ .populate = &populate,
+ .provide = &provide,
+ .load_prog = &dt_bpf_prog_load,
+ .trampoline = &kprobe_trampoline,
+ .attach = &kprobe_attach,
+ .detach = &kprobe_detach,
+ .probe_destroy = &dt_tp_probe_destroy,
+};
diff --git a/libdtrace/dt_prov_rawfbt.c b/libdtrace/dt_prov_rawfbt.c
deleted file mode 100644
index 52152655..00000000
--- a/libdtrace/dt_prov_rawfbt.c
+++ /dev/null
@@ -1,393 +0,0 @@
-/*
- * Oracle Linux DTrace.
- * Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved.
- * Licensed under the Universal Permissive License v 1.0 as shown at
- * http://oss.oracle.com/licenses/upl.
- *
- * The Raw Function Boundary Tracing provider for DTrace.
- *
- * The kernel provides kprobes to trace specific symbols. They are listed in
- * the TRACEFS/available_filter_functions file. Kprobes may be associated with
- * a symbol in the core kernel or with a symbol in a specific kernel module.
- * Whereas the fbt provider supports tracing regular symbols only, the rawfbt
- * provider also provides access to synthetic symbols, i.e. symbols created by
- * compiler optimizations.
- *
- * Mapping from event name to DTrace probe name:
- *
- * <name> rawfbt:vmlinux:<name>:entry
- * rawfbt:vmlinux:<name>:return
- * or
- * <name> [<modname>] rawfbt:<modname>:<name>:entry
- * rawfbt:<modname>:<name>:return
- */
-#include <assert.h>
-#include <errno.h>
-#include <fcntl.h>
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-#include <unistd.h>
-#include <linux/bpf.h>
-#include <linux/btf.h>
-#include <sys/stat.h>
-#include <sys/types.h>
-
-#include <bpf_asm.h>
-
-#include "dt_btf.h"
-#include "dt_dctx.h"
-#include "dt_cg.h"
-#include "dt_module.h"
-#include "dt_provider_tp.h"
-#include "dt_probe.h"
-#include "dt_pt_regs.h"
-
-static const char prvname[] = "rawfbt";
-
-#define KPROBE_EVENTS TRACEFS "kprobe_events"
-
-#define FBT_GROUP_FMT GROUP_FMT "_%s"
-#define FBT_GROUP_DATA GROUP_DATA, prp->desc->prb
-
-static const dtrace_pattr_t pattr = {
-{ DTRACE_STABILITY_EVOLVING, DTRACE_STABILITY_EVOLVING, DTRACE_CLASS_COMMON },
-{ DTRACE_STABILITY_PRIVATE, DTRACE_STABILITY_PRIVATE, DTRACE_CLASS_UNKNOWN },
-{ DTRACE_STABILITY_PRIVATE, DTRACE_STABILITY_PRIVATE, DTRACE_CLASS_ISA },
-{ DTRACE_STABILITY_EVOLVING, DTRACE_STABILITY_EVOLVING, DTRACE_CLASS_COMMON },
-{ DTRACE_STABILITY_PRIVATE, DTRACE_STABILITY_PRIVATE, DTRACE_CLASS_ISA },
-};
-
-/*
- * Create the rawfbt provider.
- */
-static int populate(dtrace_hdl_t *dtp)
-{
- dt_provider_t *prv;
-
- prv = dt_provider_create(dtp, prvname, &dt_rawfbt, &pattr, NULL);
- if (prv == NULL)
- return -1; /* errno already set */
-
- return 0;
-}
-
-/* Create a probe (if it does not exist yet). */
-static int provide_probe(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp)
-{
- dt_provider_t *prv = dt_provider_lookup(dtp, pdp->prv);
-
- if (prv == NULL)
- return 0;
- if (dt_probe_lookup(dtp, pdp) != NULL)
- return 0;
-#ifdef DEBUG_FBT
- if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) {
- fprintf(stderr, "%s(..., PROVIDE %s:%s:%s:%s) - ...\n", __func__, pdp->prv, pdp->mod, pdp->fun, pdp->prb);
- return 1;
- }
-#else
- if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb))
- return 1;
-#endif
-
- return 0;
-}
-
-/*
- * Try to provide probes for the given probe description. The caller ensures
- * that the provider name in probe desxcription (if any) is a match for this
- * provider. When this is called, we already know that this provider matches
- * the provider component of the probe specification.
- */
-#define FBT_ENTRY 1
-#define FBT_RETURN 2
-
-static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp)
-{
- int n = 0;
- int prb = 0;
- dt_module_t *dmp = NULL;
- dt_symbol_t *sym = NULL;
- dt_htab_next_t *it = NULL;
- dtrace_probedesc_t pd;
-
- dt_modsym_mark_traceable(dtp);
-
- /*
- * Nothing to do if a probe name is specified and cannot match 'entry'
- * or 'return'.
- */
- if (dt_gmatch("entry", pdp->prb))
- prb |= FBT_ENTRY;
- if (dt_gmatch("return", pdp->prb))
- prb |= FBT_RETURN;
- if (prb == 0)
- return 0;
-
- /*
- * If we have an explicit module name, check it. If not found, we can
- * ignore this request.
- */
- if (pdp->mod[0] != '\0' && strchr(pdp->mod, '*') == NULL) {
- dmp = dt_module_lookup_by_name(dtp, pdp->mod);
- if (dmp == NULL)
- return 0;
- }
-
- /*
- * If we have an explicit function name, we start with a basic symbol
- * name lookup.
- */
- if (pdp->fun[0] != '\0' && strchr(pdp->fun, '*') == NULL) {
- /* If we have a module, use it. */
- if (dmp != NULL) {
- sym = dt_module_symbol_by_name(dtp, dmp, pdp->fun);
- if (sym == NULL)
- return 0;
- if (!dt_symbol_traceable(sym))
- return 0;
-
- pd.id = DTRACE_IDNONE;
- pd.prv = pdp->prv;
- pd.mod = dmp->dm_name;
- pd.fun = pdp->fun;
-
- if (prb & FBT_ENTRY) {
- pd.prb = "entry";
- n += provide_probe(dtp, &pd);
- }
- if (prb & FBT_RETURN) {
- pd.prb = "return";
- n += provide_probe(dtp, &pd);
- }
-
- return n;
- }
-
- sym = dt_symbol_by_name(dtp, pdp->fun);
- while (sym != NULL) {
- const char *mod = dt_symbol_module(sym)->dm_name;
-
- if (dt_symbol_traceable(sym) &&
- dt_gmatch(mod, pdp->mod)) {
- pd.id = DTRACE_IDNONE;
- pd.prv = pdp->prv;
- pd.mod = mod;
- pd.fun = pdp->fun;
-
- if (prb & FBT_ENTRY) {
- pd.prb = "entry";
- n += provide_probe(dtp, &pd);
- }
- if (prb & FBT_RETURN) {
- pd.prb = "return";
- n += provide_probe(dtp, &pd);
- }
-
- }
- sym = dt_symbol_by_name_next(sym);
- }
-
- return n;
- }
-
- /*
- * No explicit function name. We need to go through all possible
- * symbol names and see if they match.
- */
- while ((sym = dt_htab_next(dtp->dt_kernsyms, &it)) != NULL) {
- dt_module_t *smp;
- const char *fun;
-
- /* Ensure the symbol can be traced. */
- if (!dt_symbol_traceable(sym))
- continue;
-
- /* Match the function name. */
- fun = dt_symbol_name(sym);
- if (!dt_gmatch(fun, pdp->fun))
- continue;
-
- /* Validate the module name. */
- smp = dt_symbol_module(sym);
- if (dmp) {
- if (smp != dmp)
- continue;
- } else if (!dt_gmatch(smp->dm_name, pdp->mod))
- continue;
-
- pd.id = DTRACE_IDNONE;
- pd.prv = pdp->prv;
- pd.mod = smp->dm_name;
- pd.fun = fun;
-
- if (prb & FBT_ENTRY) {
- pd.prb = "entry";
- n += provide_probe(dtp, &pd);
- }
- if (prb & FBT_RETURN) {
- pd.prb = "return";
- n += provide_probe(dtp, &pd);
- }
- }
-
- return n;
-}
-
-/*
- * Generate a BPF trampoline for a FBT probe.
- *
- * The trampoline function is called when a FBT probe triggers, and it must
- * satisfy the following prototype:
- *
- * int dt_rawfbt(dt_pt_regs *regs)
- *
- * The trampoline will populate a dt_dctx_t struct and then call the function
- * that implements the compiled D clause. It returns 0 to the caller.
- */
-static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
-{
- dt_cg_tramp_prologue(pcb);
-
- /*
- * After the dt_cg_tramp_prologue() call, we have:
- * // (%r7 = dctx->mst)
- * // (%r8 = dctx->ctx)
- */
- dt_cg_tramp_copy_regs(pcb);
- if (strcmp(pcb->pcb_probe->desc->prb, "return") == 0) {
- dt_irlist_t *dlp = &pcb->pcb_ir;
-
- dt_cg_tramp_copy_rval_from_regs(pcb);
-
- /*
- * fbt:::return arg0 should be the function offset for
- * return instruction. Since we use kretprobes, however,
- * which do not fire until the function has returned to
- * its caller, information about the returning instruction
- * in the callee has been lost.
- *
- * Set arg0=-1 to indicate that we do not know the value.
- */
- dt_cg_xsetx(dlp, NULL, DT_LBL_NONE, BPF_REG_0, -1);
- emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(0), BPF_REG_0));
- } else
- dt_cg_tramp_copy_args_from_regs(pcb, 1);
- dt_cg_tramp_epilogue(pcb);
-
- return 0;
-}
-
-static int attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd)
-{
- char *prb = NULL;
-
- if (!dt_tp_probe_has_info(prp)) {
- char *fn, *p;
- FILE *f;
- int fd, rc = -1;
-
- /*
- * The tracepoint event we will be creating needs to have a
- * valid name. We use a copy of the probe name, with . -> _
- * conversion.
- */
- prb = strdup(prp->desc->fun);
- for (p = prb; *p; p++) {
- if (*p == '.')
- *p = '_';
- }
-
- /*
- * Register the kprobe with the tracing subsystem. This will
- * create a tracepoint event.
- */
- fd = open(KPROBE_EVENTS, O_WRONLY | O_APPEND);
- if (fd == -1)
- goto fail;
-
- rc = dprintf(fd, "%c:" FBT_GROUP_FMT "/%s %s\n",
- prp->desc->prb[0] == 'e' ? 'p' : 'r',
- FBT_GROUP_DATA, prb, prp->desc->fun);
- close(fd);
- if (rc == -1)
- goto fail;
-
- /* create format file name */
- if (asprintf(&fn, "%s" FBT_GROUP_FMT "/%s/format", EVENTSFS,
- FBT_GROUP_DATA, prb) == -1)
- goto fail;
-
- /* open format file */
- f = fopen(fn, "r");
- free(fn);
- if (f == NULL)
- goto fail;
-
- /* read event id from format file */
- rc = dt_tp_probe_info(dtp, f, 0, prp, NULL, NULL);
- fclose(f);
-
- if (rc < 0)
- goto fail;
-
- free(prb);
- }
-
- /* attach BPF program to the probe */
- return dt_tp_probe_attach(dtp, prp, bpf_fd);
-
-fail:
- free(prb);
- return -ENOENT;
-}
-
-/*
- * Try to clean up system resources that may have been allocated for this
- * probe.
- *
- * If there is an event FD, we close it.
- *
- * We also try to remove any kprobe that may have been created for the probe.
- * This is harmless for probes that didn't get created. If the removal fails
- * for some reason we are out of luck - fortunately it is not harmful to the
- * system as a whole.
- */
-static void detach(dtrace_hdl_t *dtp, const dt_probe_t *prp)
-{
- int fd;
- char *prb, *p;
-
- if (!dt_tp_probe_has_info(prp))
- return;
-
- dt_tp_probe_detach(dtp, prp);
-
- fd = open(KPROBE_EVENTS, O_WRONLY | O_APPEND);
- if (fd == -1)
- return;
-
- /* The tracepoint event is the probe nam, with . -> _ conversion. */
- prb = strdup(prp->desc->fun);
- for (p = prb; *p; p++) {
- if (*p == '.')
- *p = '_';
- }
-
- dprintf(fd, "-:" FBT_GROUP_FMT "/%s\n", FBT_GROUP_DATA, prb);
- free(prb);
- close(fd);
-}
-
-dt_provimpl_t dt_rawfbt = {
- .name = prvname,
- .prog_type = BPF_PROG_TYPE_KPROBE,
- .populate = &populate,
- .provide = &provide,
- .load_prog = &dt_bpf_prog_load,
- .trampoline = &trampoline,
- .attach = &attach,
- .detach = &detach,
- .probe_destroy = &dt_tp_probe_destroy,
-};
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 3/8] rawfbt: perform lookup on true symbol names
2025-03-07 21:34 ` [PATCH 3/8] rawfbt: perform lookup on true symbol names Kris Van Hees
@ 2025-03-10 22:03 ` Eugene Loh
0 siblings, 0 replies; 14+ messages in thread
From: Eugene Loh @ 2025-03-10 22:03 UTC (permalink / raw)
To: Kris Van Hees, dtrace, dtrace-devel
Reviewed-by: Eugene Loh <eugene.loh@oracle.com>
On 3/7/25 16:34, Kris Van Hees wrote:
> When encountering a <sym>.<suffix> symbol, a symbol lookup was done for
> <sym> instead of <sym>.<suffix> under the assumption that names with .
> in them were not listed in kallsyms. But that is not true.
>
> Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
> ---
> libdtrace/dt_prov_rawfbt.c | 18 ------------------
> 1 file changed, 18 deletions(-)
>
> diff --git a/libdtrace/dt_prov_rawfbt.c b/libdtrace/dt_prov_rawfbt.c
> index 4c8e8130..62f2f4f0 100644
> --- a/libdtrace/dt_prov_rawfbt.c
> +++ b/libdtrace/dt_prov_rawfbt.c
> @@ -122,27 +122,9 @@ static int populate(dtrace_hdl_t *dtp)
> * try to determine the module name.
> */
> if (!p) {
> - char *q;
> -
> - /*
> - * For synthetic symbol names (those containing '.'),
> - * we need to use the base name (before the '.') for
> - * module name lookup, because the synthetic forms are
> - * not recorded in kallsyms information.
> - *
> - * We replace the first '.' with a 0 to terminate the
> - * string, and after the lookup, we put it back.
> - */
> - q = strchr(buf, '.');
> - if (q != NULL)
> - *q = '\0';
> -
> if (dtrace_lookup_by_name(dtp, DTRACE_OBJ_KMODS, buf,
> NULL, &sip) == 0)
> mod = sip.object;
> -
> - if (q != NULL)
> - *q = '.';
> } else
> mod = p;
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 4/8] ksyms: make symbol name filters less picky
2025-03-07 21:34 ` [PATCH 4/8] ksyms: make symbol name filters less picky Kris Van Hees
@ 2025-03-10 22:04 ` Eugene Loh
0 siblings, 0 replies; 14+ messages in thread
From: Eugene Loh @ 2025-03-10 22:04 UTC (permalink / raw)
To: Kris Van Hees, dtrace, dtrace-devel
Reviewed-by: Eugene Loh <eugene.loh@oracle.com>
On 3/7/25 16:34, Kris Van Hees wrote:
> Some symbols were being filtered out even though they represent symbols
> that can actually be probed.
>
> Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
> ---
> libdtrace/dt_module.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/libdtrace/dt_module.c b/libdtrace/dt_module.c
> index dc00aa88..2e915e2f 100644
> --- a/libdtrace/dt_module.c
> +++ b/libdtrace/dt_module.c
> @@ -1215,7 +1215,7 @@ dt_modsym_addsym(dtrace_hdl_t *dtp, dt_module_t *dmp, dt_kallsym_t *sym,
> (strstarts(sym->name, "__syscall_meta__")) ||
> (strstarts(sym->name, "__p_syscall_meta__")) ||
> (strstarts(sym->name, "__event_")) ||
> - (strstarts(sym->name, "event_")) ||
> + (strstarts(sym->name, "event_") && sym->type == 'd') ||
> (strstarts(sym->name, "ftrace_event_")) ||
> (strstarts(sym->name, "types__")) ||
> (strstarts(sym->name, "args__")) ||
> @@ -1223,7 +1223,6 @@ dt_modsym_addsym(dtrace_hdl_t *dtp, dt_module_t *dmp, dt_kallsym_t *sym,
> (strstarts(sym->name, "__tpstrtab_")) ||
> (strstarts(sym->name, "__tpstrtab__")) ||
> (strstarts(sym->name, "__initcall_")) ||
> - (strstarts(sym->name, "__setup_")) ||
> (strstarts(sym->name, "__pci_fixup_")))
> skip = 1;
> #undef strstarts
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 6/8] fbt: performance improvements
2025-03-07 21:34 ` [PATCH 6/8] fbt: performance improvements Kris Van Hees
@ 2025-03-12 5:17 ` Eugene Loh
2025-03-12 5:33 ` Kris Van Hees
0 siblings, 1 reply; 14+ messages in thread
From: Eugene Loh @ 2025-03-12 5:17 UTC (permalink / raw)
To: Kris Van Hees, dtrace, dtrace-devel
Sorry for the slow progress. Anyhow, with this patch, I get failures on
test/unittest/lockstat/tst.lockstat-summary.d on x86 UEK7 systems.
Stuff like:
dtrace: could not enable tracing: BPF program load for
'fbt:vmlinux:native_queued_spin_lock_slowp: Invalid argument
Well, in dt_prov_lockstat.c, I see:
{ "spin-spin", DTRACE_PROBESPEC_FUNC, "fbt::queued_spin_lock_*" },
{ "spin-spin", DTRACE_PROBESPEC_FUNC,
"fbt::native_queued_spin_lock_*" },
And on those problematic systems, I see:
$ sudo build/run-dtrace -lP fbt |& grep native_queued
98429 fbt vmlinux
native_queued_spin_lock_slowpath return
98428 fbt vmlinux
native_queued_spin_lock_slowpath entry
9433 fbt vmlinux
native_queued_spin_lock_slowpath.part.0 return
9432 fbt vmlinux
native_queued_spin_lock_slowpath.part.0 entry
In contrast, on UEK8, "dtrace -lP fbt" does not include the .part.0 probes.
Back on the UEK7 systems, if I modify dt_prov_lockstat.c like this:
- { "spin-spin", DTRACE_PROBESPEC_FUNC,
"fbt::native_queued_spin_lock_*" },
+ { "spin-spin", DTRACE_PROBESPEC_FUNC,
"fbt::native_queued_spin_lock_slowpath" },
the test passes.
Is that the right change to make? If so, shall I submit a patch, or
should it go with your patch series?
On 3/7/25 16:34, Kris Van Hees wrote:
> Up until now, FBT probes were registered for every symbol that was
> listed as traceable. Most tracing session do not use most or even
> any of these, and the process of registering them all was quite
> slow.
>
> Going forward, FBT probes are registered on demand.
>
> If any FBT probes are to be registered, the first will incur the
> cost of reading the entire list of traceable symbols. Any further
> FBT probe registration will be able to be satisfied based on that
> initial processing. The performance improvement is therefore quite
> significant for tracing sessions that do not trigger any FBT probe
> registration, and if FBT probes are used, the improvement is still
> quite noticable because only the probes that are actually needed
> get registered.
>
> Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
> ---
> libdtrace/dt_module.c | 78 +++++++++++++++
> libdtrace/dt_module.h | 2 +
> libdtrace/dt_prov_fbt.c | 217 +++++++++++++++++++++++++++-------------
> 3 files changed, 228 insertions(+), 69 deletions(-)
>
> diff --git a/libdtrace/dt_module.c b/libdtrace/dt_module.c
> index 2e915e2f..e7553a07 100644
> --- a/libdtrace/dt_module.c
> +++ b/libdtrace/dt_module.c
> @@ -22,6 +22,7 @@
> #include <port.h>
>
> #include <zlib.h>
> +#include <tracefs.h>
>
> #include <dt_kernel_module.h>
> #include <dt_module.h>
> @@ -1044,6 +1045,83 @@ dt_kern_module_find_ctf(dtrace_hdl_t *dtp, dt_module_t *dmp)
> }
> }
>
> +#define PROBE_LIST TRACEFS "available_filter_functions"
> +
> +/*
> + * Determine which kernel functions are traceable and mark them.
> + */
> +void
> +dt_modsym_mark_traceable(dtrace_hdl_t *dtp)
> +{
> + FILE *f;
> + char *buf = NULL;
> + size_t len = 0;
> +
> + if (dt_symtab_traceable(dtp->dt_exec->dm_kernsyms))
> + return;
> +
> + f = fopen(PROBE_LIST, "r");
> + if (f == NULL)
> + return;
> +
> + while (getline(&buf, &len, f) >= 0) {
> + char *p;
> + dt_symbol_t *sym = NULL;
> +
> + /*
> + * Here buf is either "funcname\n" or "funcname [modname]\n".
> + * The last line may not have a linefeed.
> + */
> + p = strchr(buf, '\n');
> + if (p) {
> + *p = '\0';
> + if (p > buf && *(--p) == ']')
> + *p = '\0';
> + }
> +
> + /*
> + * Now buf is either "funcname" or "funcname [modname". If
> + * there is no module name provided, we will use the default.
> + */
> + p = strchr(buf, ' ');
> + if (p) {
> + *p++ = '\0';
> + if (*p == '[')
> + p++;
> + }
> +
> +#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0)
> + /* Weed out __ftrace_invalid_address___* entries. */
> + if (strstarts(buf, "__ftrace_invalid_address__") ||
> + strstarts(buf, "__probestub_") ||
> + strstarts(buf, "__traceiter_"))
> + continue;
> +#undef strstarts
> +
> + /*
> + * If we have a module name, look for the symbol in that
> + * module.
> + * If not, perform a general symbol lookup to find its first
> + * instance.
> + */
> + if (p) {
> + dt_module_t *dmp = dt_module_lookup_by_name(dtp, p);
> +
> + if (dmp)
> + sym = dt_module_symbol_by_name(dtp, dmp, buf);
> + } else
> + sym = dt_symbol_by_name(dtp, buf);
> +
> + if (sym)
> + dt_symbol_set_traceable(sym);
> + }
> +
> + free(buf);
> + fclose(f);
> +
> + dt_symtab_set_traceable(dtp->dt_exec->dm_kernsyms);
> +}
> +
> /*
> * Symbol data can be collected in three ways:
> * - kallmodsyms
> diff --git a/libdtrace/dt_module.h b/libdtrace/dt_module.h
> index 56df17a6..dd3ad17c 100644
> --- a/libdtrace/dt_module.h
> +++ b/libdtrace/dt_module.h
> @@ -25,6 +25,8 @@ extern dt_ident_t *dt_module_extern(dtrace_hdl_t *, dt_module_t *,
>
> extern const char *dt_module_modelname(dt_module_t *);
>
> +extern void dt_modsym_mark_traceable(dtrace_hdl_t *);
> +
> #ifdef __cplusplus
> }
> #endif
> diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c
> index eef93879..d837e14d 100644
> --- a/libdtrace/dt_prov_fbt.c
> +++ b/libdtrace/dt_prov_fbt.c
> @@ -41,10 +41,8 @@
> #include "dt_pt_regs.h"
>
> static const char prvname[] = "fbt";
> -static const char modname[] = "vmlinux";
>
> #define KPROBE_EVENTS TRACEFS "kprobe_events"
> -#define PROBE_LIST TRACEFS "available_filter_functions"
>
> #define FBT_GROUP_FMT GROUP_FMT "_%s"
> #define FBT_GROUP_DATA GROUP_DATA, prp->desc->prb
> @@ -61,19 +59,11 @@ dt_provimpl_t dt_fbt_fprobe;
> dt_provimpl_t dt_fbt_kprobe;
>
> /*
> - * Scan the PROBE_LIST file and add entry and return probes for every function
> - * that is listed.
> + * Create the fbt provider.
> */
> static int populate(dtrace_hdl_t *dtp)
> {
> dt_provider_t *prv;
> - FILE *f;
> - char *buf = NULL;
> - char *p;
> - const char *mod = modname;
> - size_t n;
> - dtrace_syminfo_t sip;
> - dtrace_probedesc_t pd;
>
> dt_fbt = BPF_HAS(dtp, BPF_FEAT_FENTRY) ? dt_fbt_fprobe : dt_fbt_kprobe;
>
> @@ -81,79 +71,166 @@ static int populate(dtrace_hdl_t *dtp)
> if (prv == NULL)
> return -1; /* errno already set */
>
> - f = fopen(PROBE_LIST, "r");
> - if (f == NULL)
> + return 0;
> +}
> +
> +/* Create a probe (if it does not exist yet). */
> +static int provide_probe(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp)
> +{
> + dt_provider_t *prv = dt_provider_lookup(dtp, pdp->prv);
> +
> + if (prv == NULL)
> + return 0;
> + if (dt_probe_lookup(dtp, pdp) != NULL)
> return 0;
> + if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb))
> + return 1;
>
> - while (getline(&buf, &n, f) >= 0) {
> - /*
> - * Here buf is either "funcname\n" or "funcname [modname]\n".
> - * The last line may not have a linefeed.
> - */
> - p = strchr(buf, '\n');
> - if (p) {
> - *p = '\0';
> - if (p > buf && *(--p) == ']')
> - *p = '\0';
> + return 0;
> +}
> +
> +/*
> + * Try to provide probes for the given probe description. The caller ensures
> + * that the provider name in probe desxcription (if any) is a match for this
> + * provider. When this is called, we already know that this provider matches
> + * the provider component of the probe specification.
> + */
> +#define FBT_ENTRY 1
> +#define FBT_RETURN 2
> +
> +static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp)
> +{
> + int n = 0;
> + int prb = 0;
> + dt_module_t *dmp = NULL;
> + dt_symbol_t *sym = NULL;
> + dt_htab_next_t *it = NULL;
> + dtrace_probedesc_t pd;
> +
> + dt_modsym_mark_traceable(dtp);
> +
> + /*
> + * Nothing to do if a probe name is specified and cannot match 'entry'
> + * or 'return'.
> + */
> + if (dt_gmatch("entry", pdp->prb))
> + prb |= FBT_ENTRY;
> + if (dt_gmatch("return", pdp->prb))
> + prb |= FBT_RETURN;
> + if (prb == 0)
> + return 0;
> +
> + /* Synthetic function names are not supported for FBT. */
> + if (strchr(pdp->fun, '.'))
> + return 0;
> +
> + /*
> + * If we have an explicit module name, check it. If not found, we can
> + * ignore this request.
> + */
> + if (pdp->mod[0] != '\0' && strchr(pdp->mod, '*') == NULL) {
> + dmp = dt_module_lookup_by_name(dtp, pdp->mod);
> + if (dmp == NULL)
> + return 0;
> + }
> +
> + /*
> + * If we have an explicit function name, we start with a basic symbol
> + * name lookup.
> + */
> + if (pdp->fun[0] != '\0' && strchr(pdp->fun, '*') == NULL) {
> + /* If we have a module, use it. */
> + if (dmp != NULL) {
> + sym = dt_module_symbol_by_name(dtp, dmp, pdp->fun);
> + if (sym == NULL)
> + return 0;
> + if (!dt_symbol_traceable(sym))
> + return 0;
> +
> + pd.id = DTRACE_IDNONE;
> + pd.prv = pdp->prv;
> + pd.mod = dmp->dm_name;
> + pd.fun = pdp->fun;
> +
> + if (prb & FBT_ENTRY) {
> + pd.prb = "entry";
> + n += provide_probe(dtp, &pd);
> + }
> + if (prb & FBT_RETURN) {
> + pd.prb = "return";
> + n += provide_probe(dtp, &pd);
> + }
> +
> + return n;
> }
>
> - /*
> - * Now buf is either "funcname" or "funcname [modname". If
> - * there is no module name provided, we will use the default.
> - */
> - p = strchr(buf, ' ');
> - if (p) {
> - *p++ = '\0';
> - if (*p == '[')
> - p++;
> + sym = dt_symbol_by_name(dtp, pdp->fun);
> + while (sym != NULL) {
> + const char *mod = dt_symbol_module(sym)->dm_name;
> +
> + if (dt_symbol_traceable(sym) &&
> + dt_gmatch(mod, pdp->mod)) {
> + pd.id = DTRACE_IDNONE;
> + pd.prv = pdp->prv;
> + pd.mod = mod;
> + pd.fun = pdp->fun;
> +
> + if (prb & FBT_ENTRY) {
> + pd.prb = "entry";
> + n += provide_probe(dtp, &pd);
> + }
> + if (prb & FBT_RETURN) {
> + pd.prb = "return";
> + n += provide_probe(dtp, &pd);
> + }
> +
> + }
> + sym = dt_symbol_by_name_next(sym);
> }
>
> - /* Weed out synthetic symbol names (that are invalid). */
> - if (strchr(buf, '.') != NULL)
> + return n;
> + }
> +
> + /*
> + * No explicit function name. We need to go through all possible
> + * symbol names and see if they match.
> + */
> + while ((sym = dt_htab_next(dtp->dt_kernsyms, &it)) != NULL) {
> + dt_module_t *smp;
> + const char *fun;
> +
> + /* Ensure the symbol can be traced. */
> + if (!dt_symbol_traceable(sym))
> continue;
>
> -#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0)
> - /* Weed out __ftrace_invalid_address___* entries. */
> - if (strstarts(buf, "__ftrace_invalid_address__") ||
> - strstarts(buf, "__probestub_") ||
> - strstarts(buf, "__traceiter_"))
> + /* Match the function name. */
> + fun = dt_symbol_name(sym);
> + if (!dt_gmatch(fun, pdp->fun))
> continue;
> -#undef strstarts
>
> - /*
> - * If we did not see a module name, perform a symbol lookup to
> - * try to determine the module name.
> - */
> - if (!p) {
> - if (dtrace_lookup_by_name(dtp, DTRACE_OBJ_KMODS, buf,
> - NULL, &sip) == 0)
> - mod = sip.object;
> - } else
> - mod = p;
> + /* Validate the module name. */
> + smp = dt_symbol_module(sym);
> + if (dmp) {
> + if (smp != dmp)
> + continue;
> + } else if (!dt_gmatch(smp->dm_name, pdp->mod))
> + continue;
>
> - /*
> - * Due to the lack of module names in
> - * TRACEFS/available_filter_functions, there are some duplicate
> - * function names. We need to make sure that we do not create
> - * duplicate probes for these.
> - */
> pd.id = DTRACE_IDNONE;
> - pd.prv = prvname;
> - pd.mod = mod;
> - pd.fun = buf;
> - pd.prb = "entry";
> - if (dt_probe_lookup(dtp, &pd) != NULL)
> - continue;
> + pd.prv = pdp->prv;
> + pd.mod = smp->dm_name;
> + pd.fun = fun;
>
> - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "entry"))
> - n++;
> - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "return"))
> - n++;
> + if (prb & FBT_ENTRY) {
> + pd.prb = "entry";
> + n += provide_probe(dtp, &pd);
> + }
> + if (prb & FBT_RETURN) {
> + pd.prb = "return";
> + n += provide_probe(dtp, &pd);
> + }
> }
>
> - free(buf);
> - fclose(f);
> -
> return n;
> }
>
> @@ -447,6 +524,7 @@ dt_provimpl_t dt_fbt_fprobe = {
> .prog_type = BPF_PROG_TYPE_TRACING,
> .stack_skip = 4,
> .populate = &populate,
> + .provide = &provide,
> .load_prog = &fprobe_prog_load,
> .trampoline = &fprobe_trampoline,
> .attach = &dt_tp_probe_attach_raw,
> @@ -459,6 +537,7 @@ dt_provimpl_t dt_fbt_kprobe = {
> .name = prvname,
> .prog_type = BPF_PROG_TYPE_KPROBE,
> .populate = &populate,
> + .provide = &provide,
> .load_prog = &dt_bpf_prog_load,
> .trampoline = &kprobe_trampoline,
> .attach = &kprobe_attach,
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 6/8] fbt: performance improvements
2025-03-12 5:17 ` Eugene Loh
@ 2025-03-12 5:33 ` Kris Van Hees
2025-03-17 21:00 ` Eugene Loh
0 siblings, 1 reply; 14+ messages in thread
From: Kris Van Hees @ 2025-03-12 5:33 UTC (permalink / raw)
To: Eugene Loh; +Cc: Kris Van Hees, dtrace, dtrace-devel
On Wed, Mar 12, 2025 at 01:17:51AM -0400, Eugene Loh wrote:
> Sorry for the slow progress. Anyhow, with this patch, I get failures on
> test/unittest/lockstat/tst.lockstat-summary.d on x86 UEK7 systems. Stuff
> like:
>
> dtrace: could not enable tracing: BPF program load for
> 'fbt:vmlinux:native_queued_spin_lock_slowp: Invalid argument
>
> Well, in dt_prov_lockstat.c, I see:
> { "spin-spin", DTRACE_PROBESPEC_FUNC, "fbt::queued_spin_lock_*" },
> { "spin-spin", DTRACE_PROBESPEC_FUNC,
> "fbt::native_queued_spin_lock_*" },
>
> And on those problematic systems, I see:
> $ sudo build/run-dtrace -lP fbt |& grep native_queued
> 98429 fbt vmlinux native_queued_spin_lock_slowpath
> return
> 98428 fbt vmlinux native_queued_spin_lock_slowpath
> entry
> 9433 fbt vmlinux
> native_queued_spin_lock_slowpath.part.0 return
> 9432 fbt vmlinux
> native_queued_spin_lock_slowpath.part.0 entry
>
> In contrast, on UEK8, "dtrace -lP fbt" does not include the .part.0 probes.
>
> Back on the UEK7 systems, if I modify dt_prov_lockstat.c like this:
> - { "spin-spin", DTRACE_PROBESPEC_FUNC,
> "fbt::native_queued_spin_lock_*" },
> + { "spin-spin", DTRACE_PROBESPEC_FUNC,
> "fbt::native_queued_spin_lock_slowpath" },
> the test passes.
>
> Is that the right change to make? If so, shall I submit a patch, or should
> it go with your patch series?
No change should be needed to the lockstat provider. You uncovered a bug
in my patch - I'll fix it.
In short, while the provide() function filters out function names that have
a '.' in them for the pdp->fun specification string, it fails to do so for
the case where globbing requires us to loop over possible function names that
could match pdp->fun (or all function names if pdp->fun == ""). We need to
exclude names with "." in them there also because the FBT provider does not
allow probing of those synmbols.
I think I'll just defer that check to provide_probe() since I can do it in a
single place for all cases.
Incidentally, I also just noticed that dt_modsym_mark_traceable(dtp); is being
done too early. We only really need that to be done once we get to looking at
function symbols. I'll move it - that way we avoid marking function traceable
for probes that cannot be FBT probes because of probe name or module name.
I'll send out a v2 tomorrow with that fix.
> On 3/7/25 16:34, Kris Van Hees wrote:
> > Up until now, FBT probes were registered for every symbol that was
> > listed as traceable. Most tracing session do not use most or even
> > any of these, and the process of registering them all was quite
> > slow.
> >
> > Going forward, FBT probes are registered on demand.
> >
> > If any FBT probes are to be registered, the first will incur the
> > cost of reading the entire list of traceable symbols. Any further
> > FBT probe registration will be able to be satisfied based on that
> > initial processing. The performance improvement is therefore quite
> > significant for tracing sessions that do not trigger any FBT probe
> > registration, and if FBT probes are used, the improvement is still
> > quite noticable because only the probes that are actually needed
> > get registered.
> >
> > Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
> > ---
> > libdtrace/dt_module.c | 78 +++++++++++++++
> > libdtrace/dt_module.h | 2 +
> > libdtrace/dt_prov_fbt.c | 217 +++++++++++++++++++++++++++-------------
> > 3 files changed, 228 insertions(+), 69 deletions(-)
> >
> > diff --git a/libdtrace/dt_module.c b/libdtrace/dt_module.c
> > index 2e915e2f..e7553a07 100644
> > --- a/libdtrace/dt_module.c
> > +++ b/libdtrace/dt_module.c
> > @@ -22,6 +22,7 @@
> > #include <port.h>
> > #include <zlib.h>
> > +#include <tracefs.h>
> > #include <dt_kernel_module.h>
> > #include <dt_module.h>
> > @@ -1044,6 +1045,83 @@ dt_kern_module_find_ctf(dtrace_hdl_t *dtp, dt_module_t *dmp)
> > }
> > }
> > +#define PROBE_LIST TRACEFS "available_filter_functions"
> > +
> > +/*
> > + * Determine which kernel functions are traceable and mark them.
> > + */
> > +void
> > +dt_modsym_mark_traceable(dtrace_hdl_t *dtp)
> > +{
> > + FILE *f;
> > + char *buf = NULL;
> > + size_t len = 0;
> > +
> > + if (dt_symtab_traceable(dtp->dt_exec->dm_kernsyms))
> > + return;
> > +
> > + f = fopen(PROBE_LIST, "r");
> > + if (f == NULL)
> > + return;
> > +
> > + while (getline(&buf, &len, f) >= 0) {
> > + char *p;
> > + dt_symbol_t *sym = NULL;
> > +
> > + /*
> > + * Here buf is either "funcname\n" or "funcname [modname]\n".
> > + * The last line may not have a linefeed.
> > + */
> > + p = strchr(buf, '\n');
> > + if (p) {
> > + *p = '\0';
> > + if (p > buf && *(--p) == ']')
> > + *p = '\0';
> > + }
> > +
> > + /*
> > + * Now buf is either "funcname" or "funcname [modname". If
> > + * there is no module name provided, we will use the default.
> > + */
> > + p = strchr(buf, ' ');
> > + if (p) {
> > + *p++ = '\0';
> > + if (*p == '[')
> > + p++;
> > + }
> > +
> > +#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0)
> > + /* Weed out __ftrace_invalid_address___* entries. */
> > + if (strstarts(buf, "__ftrace_invalid_address__") ||
> > + strstarts(buf, "__probestub_") ||
> > + strstarts(buf, "__traceiter_"))
> > + continue;
> > +#undef strstarts
> > +
> > + /*
> > + * If we have a module name, look for the symbol in that
> > + * module.
> > + * If not, perform a general symbol lookup to find its first
> > + * instance.
> > + */
> > + if (p) {
> > + dt_module_t *dmp = dt_module_lookup_by_name(dtp, p);
> > +
> > + if (dmp)
> > + sym = dt_module_symbol_by_name(dtp, dmp, buf);
> > + } else
> > + sym = dt_symbol_by_name(dtp, buf);
> > +
> > + if (sym)
> > + dt_symbol_set_traceable(sym);
> > + }
> > +
> > + free(buf);
> > + fclose(f);
> > +
> > + dt_symtab_set_traceable(dtp->dt_exec->dm_kernsyms);
> > +}
> > +
> > /*
> > * Symbol data can be collected in three ways:
> > * - kallmodsyms
> > diff --git a/libdtrace/dt_module.h b/libdtrace/dt_module.h
> > index 56df17a6..dd3ad17c 100644
> > --- a/libdtrace/dt_module.h
> > +++ b/libdtrace/dt_module.h
> > @@ -25,6 +25,8 @@ extern dt_ident_t *dt_module_extern(dtrace_hdl_t *, dt_module_t *,
> > extern const char *dt_module_modelname(dt_module_t *);
> > +extern void dt_modsym_mark_traceable(dtrace_hdl_t *);
> > +
> > #ifdef __cplusplus
> > }
> > #endif
> > diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c
> > index eef93879..d837e14d 100644
> > --- a/libdtrace/dt_prov_fbt.c
> > +++ b/libdtrace/dt_prov_fbt.c
> > @@ -41,10 +41,8 @@
> > #include "dt_pt_regs.h"
> > static const char prvname[] = "fbt";
> > -static const char modname[] = "vmlinux";
> > #define KPROBE_EVENTS TRACEFS "kprobe_events"
> > -#define PROBE_LIST TRACEFS "available_filter_functions"
> > #define FBT_GROUP_FMT GROUP_FMT "_%s"
> > #define FBT_GROUP_DATA GROUP_DATA, prp->desc->prb
> > @@ -61,19 +59,11 @@ dt_provimpl_t dt_fbt_fprobe;
> > dt_provimpl_t dt_fbt_kprobe;
> > /*
> > - * Scan the PROBE_LIST file and add entry and return probes for every function
> > - * that is listed.
> > + * Create the fbt provider.
> > */
> > static int populate(dtrace_hdl_t *dtp)
> > {
> > dt_provider_t *prv;
> > - FILE *f;
> > - char *buf = NULL;
> > - char *p;
> > - const char *mod = modname;
> > - size_t n;
> > - dtrace_syminfo_t sip;
> > - dtrace_probedesc_t pd;
> > dt_fbt = BPF_HAS(dtp, BPF_FEAT_FENTRY) ? dt_fbt_fprobe : dt_fbt_kprobe;
> > @@ -81,79 +71,166 @@ static int populate(dtrace_hdl_t *dtp)
> > if (prv == NULL)
> > return -1; /* errno already set */
> > - f = fopen(PROBE_LIST, "r");
> > - if (f == NULL)
> > + return 0;
> > +}
> > +
> > +/* Create a probe (if it does not exist yet). */
> > +static int provide_probe(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp)
> > +{
> > + dt_provider_t *prv = dt_provider_lookup(dtp, pdp->prv);
> > +
> > + if (prv == NULL)
> > + return 0;
> > + if (dt_probe_lookup(dtp, pdp) != NULL)
> > return 0;
> > + if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb))
> > + return 1;
> > - while (getline(&buf, &n, f) >= 0) {
> > - /*
> > - * Here buf is either "funcname\n" or "funcname [modname]\n".
> > - * The last line may not have a linefeed.
> > - */
> > - p = strchr(buf, '\n');
> > - if (p) {
> > - *p = '\0';
> > - if (p > buf && *(--p) == ']')
> > - *p = '\0';
> > + return 0;
> > +}
> > +
> > +/*
> > + * Try to provide probes for the given probe description. The caller ensures
> > + * that the provider name in probe desxcription (if any) is a match for this
> > + * provider. When this is called, we already know that this provider matches
> > + * the provider component of the probe specification.
> > + */
> > +#define FBT_ENTRY 1
> > +#define FBT_RETURN 2
> > +
> > +static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp)
> > +{
> > + int n = 0;
> > + int prb = 0;
> > + dt_module_t *dmp = NULL;
> > + dt_symbol_t *sym = NULL;
> > + dt_htab_next_t *it = NULL;
> > + dtrace_probedesc_t pd;
> > +
> > + dt_modsym_mark_traceable(dtp);
> > +
> > + /*
> > + * Nothing to do if a probe name is specified and cannot match 'entry'
> > + * or 'return'.
> > + */
> > + if (dt_gmatch("entry", pdp->prb))
> > + prb |= FBT_ENTRY;
> > + if (dt_gmatch("return", pdp->prb))
> > + prb |= FBT_RETURN;
> > + if (prb == 0)
> > + return 0;
> > +
> > + /* Synthetic function names are not supported for FBT. */
> > + if (strchr(pdp->fun, '.'))
> > + return 0;
> > +
> > + /*
> > + * If we have an explicit module name, check it. If not found, we can
> > + * ignore this request.
> > + */
> > + if (pdp->mod[0] != '\0' && strchr(pdp->mod, '*') == NULL) {
> > + dmp = dt_module_lookup_by_name(dtp, pdp->mod);
> > + if (dmp == NULL)
> > + return 0;
> > + }
> > +
> > + /*
> > + * If we have an explicit function name, we start with a basic symbol
> > + * name lookup.
> > + */
> > + if (pdp->fun[0] != '\0' && strchr(pdp->fun, '*') == NULL) {
> > + /* If we have a module, use it. */
> > + if (dmp != NULL) {
> > + sym = dt_module_symbol_by_name(dtp, dmp, pdp->fun);
> > + if (sym == NULL)
> > + return 0;
> > + if (!dt_symbol_traceable(sym))
> > + return 0;
> > +
> > + pd.id = DTRACE_IDNONE;
> > + pd.prv = pdp->prv;
> > + pd.mod = dmp->dm_name;
> > + pd.fun = pdp->fun;
> > +
> > + if (prb & FBT_ENTRY) {
> > + pd.prb = "entry";
> > + n += provide_probe(dtp, &pd);
> > + }
> > + if (prb & FBT_RETURN) {
> > + pd.prb = "return";
> > + n += provide_probe(dtp, &pd);
> > + }
> > +
> > + return n;
> > }
> > - /*
> > - * Now buf is either "funcname" or "funcname [modname". If
> > - * there is no module name provided, we will use the default.
> > - */
> > - p = strchr(buf, ' ');
> > - if (p) {
> > - *p++ = '\0';
> > - if (*p == '[')
> > - p++;
> > + sym = dt_symbol_by_name(dtp, pdp->fun);
> > + while (sym != NULL) {
> > + const char *mod = dt_symbol_module(sym)->dm_name;
> > +
> > + if (dt_symbol_traceable(sym) &&
> > + dt_gmatch(mod, pdp->mod)) {
> > + pd.id = DTRACE_IDNONE;
> > + pd.prv = pdp->prv;
> > + pd.mod = mod;
> > + pd.fun = pdp->fun;
> > +
> > + if (prb & FBT_ENTRY) {
> > + pd.prb = "entry";
> > + n += provide_probe(dtp, &pd);
> > + }
> > + if (prb & FBT_RETURN) {
> > + pd.prb = "return";
> > + n += provide_probe(dtp, &pd);
> > + }
> > +
> > + }
> > + sym = dt_symbol_by_name_next(sym);
> > }
> > - /* Weed out synthetic symbol names (that are invalid). */
> > - if (strchr(buf, '.') != NULL)
> > + return n;
> > + }
> > +
> > + /*
> > + * No explicit function name. We need to go through all possible
> > + * symbol names and see if they match.
> > + */
> > + while ((sym = dt_htab_next(dtp->dt_kernsyms, &it)) != NULL) {
> > + dt_module_t *smp;
> > + const char *fun;
> > +
> > + /* Ensure the symbol can be traced. */
> > + if (!dt_symbol_traceable(sym))
> > continue;
> > -#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0)
> > - /* Weed out __ftrace_invalid_address___* entries. */
> > - if (strstarts(buf, "__ftrace_invalid_address__") ||
> > - strstarts(buf, "__probestub_") ||
> > - strstarts(buf, "__traceiter_"))
> > + /* Match the function name. */
> > + fun = dt_symbol_name(sym);
> > + if (!dt_gmatch(fun, pdp->fun))
> > continue;
> > -#undef strstarts
> > - /*
> > - * If we did not see a module name, perform a symbol lookup to
> > - * try to determine the module name.
> > - */
> > - if (!p) {
> > - if (dtrace_lookup_by_name(dtp, DTRACE_OBJ_KMODS, buf,
> > - NULL, &sip) == 0)
> > - mod = sip.object;
> > - } else
> > - mod = p;
> > + /* Validate the module name. */
> > + smp = dt_symbol_module(sym);
> > + if (dmp) {
> > + if (smp != dmp)
> > + continue;
> > + } else if (!dt_gmatch(smp->dm_name, pdp->mod))
> > + continue;
> > - /*
> > - * Due to the lack of module names in
> > - * TRACEFS/available_filter_functions, there are some duplicate
> > - * function names. We need to make sure that we do not create
> > - * duplicate probes for these.
> > - */
> > pd.id = DTRACE_IDNONE;
> > - pd.prv = prvname;
> > - pd.mod = mod;
> > - pd.fun = buf;
> > - pd.prb = "entry";
> > - if (dt_probe_lookup(dtp, &pd) != NULL)
> > - continue;
> > + pd.prv = pdp->prv;
> > + pd.mod = smp->dm_name;
> > + pd.fun = fun;
> > - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "entry"))
> > - n++;
> > - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "return"))
> > - n++;
> > + if (prb & FBT_ENTRY) {
> > + pd.prb = "entry";
> > + n += provide_probe(dtp, &pd);
> > + }
> > + if (prb & FBT_RETURN) {
> > + pd.prb = "return";
> > + n += provide_probe(dtp, &pd);
> > + }
> > }
> > - free(buf);
> > - fclose(f);
> > -
> > return n;
> > }
> > @@ -447,6 +524,7 @@ dt_provimpl_t dt_fbt_fprobe = {
> > .prog_type = BPF_PROG_TYPE_TRACING,
> > .stack_skip = 4,
> > .populate = &populate,
> > + .provide = &provide,
> > .load_prog = &fprobe_prog_load,
> > .trampoline = &fprobe_trampoline,
> > .attach = &dt_tp_probe_attach_raw,
> > @@ -459,6 +537,7 @@ dt_provimpl_t dt_fbt_kprobe = {
> > .name = prvname,
> > .prog_type = BPF_PROG_TYPE_KPROBE,
> > .populate = &populate,
> > + .provide = &provide,
> > .load_prog = &dt_bpf_prog_load,
> > .trampoline = &kprobe_trampoline,
> > .attach = &kprobe_attach,
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 6/8] fbt: performance improvements
2025-03-12 5:33 ` Kris Van Hees
@ 2025-03-17 21:00 ` Eugene Loh
2025-03-17 21:15 ` Kris Van Hees
0 siblings, 1 reply; 14+ messages in thread
From: Eugene Loh @ 2025-03-17 21:00 UTC (permalink / raw)
To: Kris Van Hees; +Cc: dtrace, dtrace-devel
On 3/12/25 01:33, Kris Van Hees wrote:
> No change should be needed to the lockstat provider. You uncovered a bug
> in my patch - I'll fix it.
Just curious why lockstat even used "_*" rather than "_slowpath".
> Incidentally, I also just noticed that dt_modsym_mark_traceable(dtp); is being
> done too early. We only really need that to be done once we get to looking at
> function symbols. I'll move it - that way we avoid marking function traceable
> for probes that cannot be FBT probes because of probe name or module name.
>
> I'll send out a v2 tomorrow with that fix.
Okay, I think I see v2. Just checking: does it indeed have
dt_modsym_mark_traceable() moved?
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 6/8] fbt: performance improvements
2025-03-17 21:00 ` Eugene Loh
@ 2025-03-17 21:15 ` Kris Van Hees
0 siblings, 0 replies; 14+ messages in thread
From: Kris Van Hees @ 2025-03-17 21:15 UTC (permalink / raw)
To: Eugene Loh; +Cc: Kris Van Hees, dtrace, dtrace-devel
On Mon, Mar 17, 2025 at 05:00:39PM -0400, Eugene Loh wrote:
> On 3/12/25 01:33, Kris Van Hees wrote:
>
> > No change should be needed to the lockstat provider. You uncovered a bug
> > in my patch - I'll fix it.
>
> Just curious why lockstat even used "_*" rather than "_slowpath".
Not sure - maybe there was another variant at some point in time that is no
longer there right now.
> > Incidentally, I also just noticed that dt_modsym_mark_traceable(dtp); is being
> > done too early. We only really need that to be done once we get to looking at
> > function symbols. I'll move it - that way we avoid marking function traceable
> > for probes that cannot be FBT probes because of probe name or module name.
> >
> > I'll send out a v2 tomorrow with that fix.
>
> Okay, I think I see v2. Just checking: does it indeed have
> dt_modsym_mark_traceable() moved?
No, that is being done in the consolidation patch.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [DTrace-devel] [PATCH 5/8] symtab: add support for 'traceable' flag
2025-03-07 21:34 ` [PATCH 5/8] symtab: add support for 'traceable' flag Kris Van Hees
@ 2025-03-18 5:26 ` Eugene Loh
0 siblings, 0 replies; 14+ messages in thread
From: Eugene Loh @ 2025-03-18 5:26 UTC (permalink / raw)
To: dtrace, dtrace-devel
Reviewed-by: Eugene Loh <eugene.loh@oracle.com>
Note...
Both files need updated copyright years?
On 3/7/25 16:34, Kris Van Hees via DTrace-devel wrote:
> Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
> ---
> libdtrace/dt_symtab.c | 53 +++++++++++++++++++++++++++++++++++++++----
> libdtrace/dt_symtab.h | 6 +++++
> 2 files changed, 55 insertions(+), 4 deletions(-)
>
> diff --git a/libdtrace/dt_symtab.c b/libdtrace/dt_symtab.c
> index db63cc88..4e46f280 100644
> --- a/libdtrace/dt_symtab.c
> +++ b/libdtrace/dt_symtab.c
> @@ -23,9 +23,12 @@
> #include <dt_string.h>
> #include <unistd.h>
>
> -#define DT_ST_SORTED 0x01 /* Sorted, ready for searching. */
> -#define DT_ST_PACKED 0x02 /* Symbol table packed
> +#define DT_ST_SORTED 0x01 /* Sorted, ready for searching. */
> +#define DT_ST_PACKED 0x02 /* Symbol table packed
> * (necessarily sorted too) */
> +#define DT_ST_TRACEABLE 0x04 /* Symbols have traceable flag */
> +
> +#define DT_STB_TRACE 8 /* traceable symbol */
>
> struct dt_symbol {
> dt_list_t dts_list; /* list forward/back pointers */
> @@ -275,6 +278,12 @@ dt_symbol_by_name(dtrace_hdl_t *dtp, const char *name)
> return dt_htab_lookup(dtp->dt_kernsyms, &tmpl);
> }
>
> +dt_symbol_t *
> +dt_symbol_by_name_next(const dt_symbol_t *symbol)
> +{
> + return symbol ? (dt_symbol_t *)symbol->dts_he.next : NULL;
> +}
> +
Obviously, not a big deal, but this function seems to have nothing
really to do with this patch. Arguably, its introduction should be
deferred to the next patch. But, I'm fine with leaving it here.
> /* Find a symbol in a given module. */
> dt_symbol_t *
> dt_module_symbol_by_name(dtrace_hdl_t *dtp, dt_module_t *dmp, const char *name)
> @@ -548,7 +557,7 @@ dt_symbol_name(const dt_symbol_t *symbol)
> void
> dt_symbol_to_elfsym64(dtrace_hdl_t *dtp, dt_symbol_t *symbol, Elf64_Sym *elf_symp)
> {
> - elf_symp->st_info = symbol->dts_info;
> + elf_symp->st_info = symbol->dts_info & ~GELF_ST_INFO(DT_STB_TRACE, 0);
> elf_symp->st_value = symbol->dts_addr;
> elf_symp->st_size = symbol->dts_size;
> elf_symp->st_shndx = 1; /* 'not SHN_UNDEF' is all we guarantee */
> @@ -557,7 +566,7 @@ dt_symbol_to_elfsym64(dtrace_hdl_t *dtp, dt_symbol_t *symbol, Elf64_Sym *elf_sym
> void
> dt_symbol_to_elfsym32(dtrace_hdl_t *dtp, dt_symbol_t *symbol, Elf32_Sym *elf_symp)
> {
> - elf_symp->st_info = symbol->dts_info;
> + elf_symp->st_info = symbol->dts_info & ~GELF_ST_INFO(DT_STB_TRACE, 0);
> elf_symp->st_value = symbol->dts_addr;
> elf_symp->st_size = symbol->dts_size;
> elf_symp->st_shndx = 1; /* 'not SHN_UNDEF' is all we guarantee */
> @@ -581,3 +590,39 @@ dt_symbol_module(dt_symbol_t *symbol)
> {
> return symbol->dts_dmp;
> }
> +
> +/*
> + * Mark the symtab annotated with traceable flags on symbols.
> + */
> +void
> +dt_symtab_set_traceable(dt_symtab_t *symtab)
> +{
> + symtab->dtst_flags |= DT_ST_TRACEABLE;
> +}
> +
> +/*
> + * Return whether symbols have traceable flags.
> + */
> +int
> +dt_symtab_traceable(const dt_symtab_t *symtab)
> +{
> + return symtab->dtst_flags & DT_ST_TRACEABLE;
> +}
> +
> +/*
> + * Mark a symbol as traceable.
> + */
> +void
> +dt_symbol_set_traceable(dt_symbol_t *symbol)
> +{
> + symbol->dts_info |= GELF_ST_INFO(DT_STB_TRACE, 0);
> +}
> +
> +/*
> + * Return true if the symbol can be traced.
> + */
> +int
> +dt_symbol_traceable(const dt_symbol_t *symbol)
> +{
> + return GELF_ST_BIND(symbol->dts_info) & DT_STB_TRACE;
> +}
Very minor, but these four functions are very similar and yet their
comments seem to have spurious differences. I would think the comments
would be boringly similar. Also, the first comment ("Mark the symtab
annotated with traceable flags on symbols.") makes sense to me only if I
read the code and think about the comment a lot. How about these
comments instead?
/* Mark a symtab as having traceable flags on symbols. */
/* Return whether a symtab has traceable flags on symbols. */
/* Mark a symbol as traceable. */
/* Return whether a symbol is traceable. */
Or even have "set" and "get" functions and reduce the number of comments
altogether. I mean, one can explain what "traceable" means in each
context, but then people can easily enough understand what the "set" and
"get" variants do.
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2025-03-18 5:26 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-07 21:34 [PATCH 2/8] sched: clean up unnecessary includes and functions Kris Van Hees
2025-03-07 21:34 ` [PATCH 3/8] rawfbt: perform lookup on true symbol names Kris Van Hees
2025-03-10 22:03 ` Eugene Loh
2025-03-07 21:34 ` [PATCH 4/8] ksyms: make symbol name filters less picky Kris Van Hees
2025-03-10 22:04 ` Eugene Loh
2025-03-07 21:34 ` [PATCH 5/8] symtab: add support for 'traceable' flag Kris Van Hees
2025-03-18 5:26 ` [DTrace-devel] " Eugene Loh
2025-03-07 21:34 ` [PATCH 6/8] fbt: performance improvements Kris Van Hees
2025-03-12 5:17 ` Eugene Loh
2025-03-12 5:33 ` Kris Van Hees
2025-03-17 21:00 ` Eugene Loh
2025-03-17 21:15 ` Kris Van Hees
2025-03-07 21:34 ` [PATCH 7/8] rawfbt: " Kris Van Hees
2025-03-07 21:34 ` [PATCH 8/8] fbt, rawfbt: consolidate code to avoid duplication Kris Van Hees
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox