* "proof of concept" for systemwide USDT
@ 2024-06-04 18:10 eugene.loh
2024-06-04 18:11 ` [PATCH 01/14] Move comment closer to the code it describes eugene.loh
` (13 more replies)
0 siblings, 14 replies; 31+ messages in thread
From: eugene.loh @ 2024-06-04 18:10 UTC (permalink / raw)
To: dtrace, dtrace-devel
This series of patches illustrates a proof of concept for systemwide
USDT. There is much cleanup needed, but first I would like buy in
for the various implementation decisions.
With this patch series, one can specify wildcard probe descriptions
for pid and USDT probes. For now, only processes that exist at the
time dtrace is launched will be picked up. An underlying probe will
look up a firing pid in a new BPF map. If there is an entry, the
associated PRID is returned, along with a bitmap indicating which
clauses to execute.
Since the PRID is no longer known for a clause when it is loaded into
the kernel -- specifically, a clause's EPID no longer specifies the
PRID -- we write the PRID into the output buffer. This is the new
mechanism for the consumer to figure out which PRID to associate with
a tracing record it reads.
The major functionality missing is to be able to pick up new processes
(and drop old ones) as they start up (or terminate) during a dtrace
session.
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 01/14] Move comment closer to the code it describes
2024-06-04 18:10 "proof of concept" for systemwide USDT eugene.loh
@ 2024-06-04 18:11 ` eugene.loh
2024-06-04 18:21 ` [DTrace-devel] " Kris Van Hees
2024-06-04 18:11 ` [PATCH 02/14] Clean up prp/uprp variable names eugene.loh
` (12 subsequent siblings)
13 siblings, 1 reply; 31+ messages in thread
From: eugene.loh @ 2024-06-04 18:11 UTC (permalink / raw)
To: dtrace, dtrace-devel
From: Eugene Loh <eugene.loh@oracle.com>
Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
---
libdtrace/dt_prov_uprobe.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c
index b712c589..afc1f628 100644
--- a/libdtrace/dt_prov_uprobe.c
+++ b/libdtrace/dt_prov_uprobe.c
@@ -337,15 +337,16 @@ static void enable(dtrace_hdl_t *dtp, dt_probe_t *prp, int is_usdt)
/*
* We need to enable the underlying probes (if not enabled yet).
- *
- * If necessary, we need to enable is-enabled probes too (if they
- * exist).
*/
for (pup = prp->prv_data; pup != NULL; pup = dt_list_next(pup)) {
dt_probe_t *uprp = pup->probe;
dt_probe_enable(dtp, uprp);
}
+ /*
+ * If necessary, we need to enable is-enabled probes too (if they
+ * exist).
+ */
if (is_usdt) {
dtrace_probedesc_t pd;
dt_probe_t *iep;
--
2.18.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 02/14] Clean up prp/uprp variable names
2024-06-04 18:10 "proof of concept" for systemwide USDT eugene.loh
2024-06-04 18:11 ` [PATCH 01/14] Move comment closer to the code it describes eugene.loh
@ 2024-06-04 18:11 ` eugene.loh
2024-06-04 18:44 ` [DTrace-devel] " Kris Van Hees
2024-06-04 18:11 ` [PATCH 03/14] Let USDT module names contain dots eugene.loh
` (11 subsequent siblings)
13 siblings, 1 reply; 31+ messages in thread
From: eugene.loh @ 2024-06-04 18:11 UTC (permalink / raw)
To: dtrace, dtrace-devel
From: Eugene Loh <eugene.loh@oracle.com>
Should prp be oprp? Or should we use opr and upr?
Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
---
libdtrace/dt_prov_uprobe.c | 55 +++++++++++++++++++++++++++-----------
1 file changed, 40 insertions(+), 15 deletions(-)
diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c
index afc1f628..cace406d 100644
--- a/libdtrace/dt_prov_uprobe.c
+++ b/libdtrace/dt_prov_uprobe.c
@@ -6,6 +6,31 @@
*
* The uprobe-based provider for DTrace (implementing pid and USDT providers).
*/
+/*
+ * This file uses both overlying probes (specified by the user) as well as
+ * underlying probes (the uprobes recognized by the kernel). To minimize
+ * confusion, this file should use consistent variable names:
+ *
+ * dt_probe_t *prp; // overlying probe
+ * dt_probe_t *uprp; // underlying probe
+ *
+ * Either one might be returned by dt_probe_lookup() or
+ * dt_probe_insert() or added to dt_enablings[] or dt_probes[].
+ * Further, uprp might be returned by create_underlying().
+ *
+ * dt_uprobe_t *upp; // uprobe associated with an underlying probe
+ *
+ * list_probe_t *pop; // overlying probe list
+ * list_probe_t *pup; // underlying probe list
+ *
+ * The provider-specific prv_data has these meanings:
+ *
+ * prp->prv_data // dt_list_t of associated underlying probes
+ *
+ * uprp->prv_data // upp (the associated uprobe)
+ *
+ * Finally, note that upp->probes is a dt_list_t of overlying probes.
+ */
#include <sys/types.h>
#include <assert.h>
#include <errno.h>
@@ -118,7 +143,7 @@ static dt_probe_t *create_underlying(dtrace_hdl_t *dtp,
char mod[DTRACE_MODNAMELEN];
char prb[DTRACE_NAMELEN];
dtrace_probedesc_t pd;
- dt_probe_t *prp;
+ dt_probe_t *uprp;
dt_uprobe_t *upp;
int is_enabled = 0;
@@ -160,8 +185,8 @@ static dt_probe_t *create_underlying(dtrace_hdl_t *dtp,
pd.fun = psp->pps_fun;
pd.prb = prb;
- prp = dt_probe_lookup(dtp, &pd);
- if (prp == NULL) {
+ uprp = dt_probe_lookup(dtp, &pd);
+ if (uprp == NULL) {
dt_provider_t *pvp;
/* Get the provider for underlying probes. */
@@ -182,12 +207,12 @@ static dt_probe_t *create_underlying(dtrace_hdl_t *dtp,
if (upp->tp == NULL)
goto fail;
- prp = dt_probe_insert(dtp, pvp, pd.prv, pd.mod, pd.fun, pd.prb,
+ uprp = dt_probe_insert(dtp, pvp, pd.prv, pd.mod, pd.fun, pd.prb,
upp);
- if (prp == NULL)
+ if (uprp == NULL)
goto fail;
} else
- upp = prp->prv_data;
+ upp = uprp->prv_data;
switch (psp->pps_type) {
case DTPPT_RETURN:
@@ -202,7 +227,7 @@ static dt_probe_t *create_underlying(dtrace_hdl_t *dtp,
*/
}
- return prp;
+ return uprp;
fail:
probe_destroy(dtp, upp);
@@ -394,8 +419,8 @@ static void enable_usdt(dtrace_hdl_t *dtp, dt_probe_t *prp)
static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
{
dt_irlist_t *dlp = &pcb->pcb_ir;
- const dt_probe_t *prp = pcb->pcb_probe;
- const dt_uprobe_t *upp = prp->prv_data;
+ const dt_probe_t *uprp = pcb->pcb_probe;
+ const dt_uprobe_t *upp = uprp->prv_data;
const list_probe_t *pop;
uint_t lbl_exit = pcb->pcb_exitlbl;
@@ -508,8 +533,8 @@ copyout_val(dt_pcb_t *pcb, uint_t lbl, uint32_t val, int arg)
static int trampoline_is_enabled(dt_pcb_t *pcb, uint_t exitlbl)
{
dt_irlist_t *dlp = &pcb->pcb_ir;
- const dt_probe_t *prp = pcb->pcb_probe;
- const dt_uprobe_t *upp = prp->prv_data;
+ const dt_probe_t *uprp = pcb->pcb_probe;
+ const dt_uprobe_t *upp = uprp->prv_data;
const list_probe_t *pop;
uint_t lbl_assign = dt_irlist_label(dlp);
uint_t lbl_exit = pcb->pcb_exitlbl;
@@ -636,9 +661,9 @@ out:
return name;
}
-static int attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd)
+static int attach(dtrace_hdl_t *dtp, const dt_probe_t *uprp, int bpf_fd)
{
- dt_uprobe_t *upp = prp->prv_data;
+ dt_uprobe_t *upp = uprp->prv_data;
tp_probe_t *tpp = upp->tp;
FILE *f;
char *fn;
@@ -733,9 +758,9 @@ out:
* probes that didn't get created. If the removal fails for some reason we are
* out of luck - fortunately it is not harmful to the system as a whole.
*/
-static void detach(dtrace_hdl_t *dtp, const dt_probe_t *prp)
+static void detach(dtrace_hdl_t *dtp, const dt_probe_t *uprp)
{
- dt_uprobe_t *upp = prp->prv_data;
+ dt_uprobe_t *upp = uprp->prv_data;
tp_probe_t *tpp = upp->tp;
if (!dt_tp_has_info(tpp))
--
2.18.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 03/14] Let USDT module names contain dots
2024-06-04 18:10 "proof of concept" for systemwide USDT eugene.loh
2024-06-04 18:11 ` [PATCH 01/14] Move comment closer to the code it describes eugene.loh
2024-06-04 18:11 ` [PATCH 02/14] Clean up prp/uprp variable names eugene.loh
@ 2024-06-04 18:11 ` eugene.loh
2024-06-04 20:42 ` [DTrace-devel] " Kris Van Hees
2024-06-04 18:11 ` [PATCH 04/14] Track uprobe provider descriptions eugene.loh
` (10 subsequent siblings)
13 siblings, 1 reply; 31+ messages in thread
From: eugene.loh @ 2024-06-04 18:11 UTC (permalink / raw)
To: dtrace, dtrace-devel
From: Eugene Loh <eugene.loh@oracle.com>
Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
---
dtprobed/dof_stash.c | 5 +++++
libdtrace/dt_pid.c | 2 ++
2 files changed, 7 insertions(+)
diff --git a/dtprobed/dof_stash.c b/dtprobed/dof_stash.c
index 62418b66..44c67462 100644
--- a/dtprobed/dof_stash.c
+++ b/dtprobed/dof_stash.c
@@ -589,6 +589,10 @@ dof_stash_write_parsed(pid_t pid, dev_t dev, ino_t ino, dt_list_t *accum)
mod, fun, prb)) == NULL)
goto err_provider;
+#if 0
+This does not make any sense. We are checking parsedfn against "." and "..",
+but parsedfn comes from make_probespec_name(), whose output is of the form
+"%s:%s:%s:%s" and therefore can never be "." or "..".
/*
* Ban "." and ".." as name components. Obviously names
* containing dots are commonplace (shared libraries,
@@ -604,6 +608,7 @@ dof_stash_write_parsed(pid_t pid, dev_t dev, ino_t ino, dt_list_t *accum)
if (strcmp(parsedfn, ".") == 0 ||
strcmp(parsedfn, "..") == 0)
goto err_provider;
+#endif
op = "probe module";
diff --git a/libdtrace/dt_pid.c b/libdtrace/dt_pid.c
index 7c7d7e30..93a7ce76 100644
--- a/libdtrace/dt_pid.c
+++ b/libdtrace/dt_pid.c
@@ -833,6 +833,7 @@ dt_pid_create_usdt_probes(dtrace_hdl_t *dtp, dt_proc_t *dpr, dtrace_probedesc_t
assert(pvp->impl != NULL && pvp->impl->provide_probe != NULL);
+#if 0
if (strchr(pdp->prv, '.') != NULL ||
strchr(pdp->mod, '.') != NULL ||
strchr(pdp->fun, '.') != NULL ||
@@ -840,6 +841,7 @@ dt_pid_create_usdt_probes(dtrace_hdl_t *dtp, dt_proc_t *dpr, dtrace_probedesc_t
dt_dprintf("Probe component contains dots: cannot be a USDT probe.\n");
return 0;
}
+#endif
if (asprintf(&probepath, "%s/probes/%i/%s/%s/%s/%s", dtp->dt_dofstash_path,
dpr->dpr_pid, pdp->prv[0] == '\0' ? "*" : pdp->prv,
--
2.18.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 04/14] Track uprobe provider descriptions
2024-06-04 18:10 "proof of concept" for systemwide USDT eugene.loh
` (2 preceding siblings ...)
2024-06-04 18:11 ` [PATCH 03/14] Let USDT module names contain dots eugene.loh
@ 2024-06-04 18:11 ` eugene.loh
2024-06-04 21:10 ` [DTrace-devel] " Kris Van Hees
2024-06-04 18:11 ` [PATCH 05/14] Add a hook for a provider-specific "update" function eugene.loh
` (9 subsequent siblings)
13 siblings, 1 reply; 31+ messages in thread
From: eugene.loh @ 2024-06-04 18:11 UTC (permalink / raw)
To: dtrace, dtrace-devel
From: Eugene Loh <eugene.loh@oracle.com>
A uprobe will have to decide which of its clauses to run for a
given pid. For now, keep the provider description for each clause.
There are many shortcomings to the way this is done in this patch,
but it is quick and dirty and helps us bootstrap real support.
Clean this up later, probably turning it into a growable array.
Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
---
libdtrace/dt_cc.c | 1 +
libdtrace/dt_impl.h | 1 +
2 files changed, 2 insertions(+)
diff --git a/libdtrace/dt_cc.c b/libdtrace/dt_cc.c
index d1ee3843..6bff7e0f 100644
--- a/libdtrace/dt_cc.c
+++ b/libdtrace/dt_cc.c
@@ -215,6 +215,7 @@ dt_compile_one_clause(dtrace_hdl_t *dtp, dt_node_t *cnp, dt_node_t *pnp)
*/
dt_cg(yypcb, cnp);
sdp->dtsd_clause = dt_clause_create(dtp, dt_as(yypcb));
+ dtp->dt_uprovdescs[dtp->dt_clause_nextid - 1] = strdup(pnp->dn_desc->prv);
assert(yypcb->pcb_stmt == sdp);
if (dtrace_stmt_add(yypcb->pcb_hdl, yypcb->pcb_prog, sdp) != 0)
diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
index 01313ff3..1bf79d80 100644
--- a/libdtrace/dt_impl.h
+++ b/libdtrace/dt_impl.h
@@ -310,6 +310,7 @@ struct dtrace_hdl {
dt_pcb_t *dt_pcb; /* pointer to current parsing control block */
ulong_t dt_gen; /* compiler generation number */
uint_t dt_clause_nextid; /* next ID to use for programs */
+ char *dt_uprovdescs[256]; /* uprobe provider descriptor per clause... FIXME turn this into a growable array */
dt_list_t dt_programs; /* linked list of dtrace_prog_t's */
dt_list_t dt_xlators; /* linked list of dt_xlator_t's */
dt_list_t dt_enablings; /* list of (to be) enabled probes */
--
2.18.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 05/14] Add a hook for a provider-specific "update" function
2024-06-04 18:10 "proof of concept" for systemwide USDT eugene.loh
` (3 preceding siblings ...)
2024-06-04 18:11 ` [PATCH 04/14] Track uprobe provider descriptions eugene.loh
@ 2024-06-04 18:11 ` eugene.loh
2024-06-04 21:38 ` [DTrace-devel] " Kris Van Hees
2024-06-04 18:11 ` [PATCH 06/14] Add clauses to per-uprobe list eugene.loh
` (8 subsequent siblings)
13 siblings, 1 reply; 31+ messages in thread
From: eugene.loh @ 2024-06-04 18:11 UTC (permalink / raw)
To: dtrace, dtrace-devel
From: Eugene Loh <eugene.loh@oracle.com>
For up-coming USDT-probe support, we need to update a BPF map
-- at least when the dtrace session starts but possibly also later
to support systemwide USDT tracing for processes that may start up
later.
One way to do this is with a USDT-specific update function.
For now, let's add a hook for providers to have provider-specific
update functions. User space can either call
for (i = 0; i < ARRAY_SIZE(dt_providers); i++) {
if (dt_providers[i]->update)
dt_providers[i]->update(...);
}
any time it likes. Or it can call dt_usdt.update(...).
This is for WIP. A different approach can be adopted later instead.
Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
---
libdtrace/dt_provider.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/libdtrace/dt_provider.h b/libdtrace/dt_provider.h
index 17b1844c..21ff15ad 100644
--- a/libdtrace/dt_provider.h
+++ b/libdtrace/dt_provider.h
@@ -68,6 +68,8 @@ typedef struct dt_provimpl {
void *datap);
void (*destroy)(dtrace_hdl_t *dtp, /* free provider data */
void *datap);
+ void (*update)(dtrace_hdl_t *dtp, /* update provider-specific info */
+ void *datap);
} dt_provimpl_t;
/* list dt_dtrace first */
--
2.18.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 06/14] Add clauses to per-uprobe list
2024-06-04 18:10 "proof of concept" for systemwide USDT eugene.loh
` (4 preceding siblings ...)
2024-06-04 18:11 ` [PATCH 05/14] Add a hook for a provider-specific "update" function eugene.loh
@ 2024-06-04 18:11 ` eugene.loh
2024-06-04 18:11 ` [PATCH 07/14] Create the BPF uprobes map eugene.loh
` (7 subsequent siblings)
13 siblings, 0 replies; 31+ messages in thread
From: eugene.loh @ 2024-06-04 18:11 UTC (permalink / raw)
To: dtrace, dtrace-devel
From: Eugene Loh <eugene.loh@oracle.com>
Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
---
libdtrace/dt_prov_uprobe.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c
index cace406d..34906aa0 100644
--- a/libdtrace/dt_prov_uprobe.c
+++ b/libdtrace/dt_prov_uprobe.c
@@ -62,6 +62,7 @@ typedef struct dt_uprobe {
int flags;
tp_probe_t *tp;
dt_list_t probes; /* pid/USDT probes triggered by it */
+ dt_list_t clauses; /* clauses possibly called by this uprobe */
} dt_uprobe_t;
typedef struct list_probe {
@@ -69,6 +70,11 @@ typedef struct list_probe {
dt_probe_t *probe;
} list_probe_t;
+typedef struct list_clause {
+ dt_list_t list;
+ uint_t clause;
+} list_clause_t;
+
static const dtrace_pattr_t pattr = {
{ DTRACE_STABILITY_EVOLVING, DTRACE_STABILITY_EVOLVING, DTRACE_CLASS_ISA },
{ DTRACE_STABILITY_PRIVATE, DTRACE_STABILITY_PRIVATE, DTRACE_CLASS_UNKNOWN },
@@ -243,6 +249,7 @@ static int provide_probe(dtrace_hdl_t *dtp, const pid_probespec_t *psp,
dt_uprobe_t *upp;
dt_probe_t *prp, *uprp;
list_probe_t *pop, *pup;
+ list_clause_t *cl;
snprintf(prv, sizeof(prv), "%s%d", psp->pps_prv, psp->pps_pid);
@@ -263,6 +270,7 @@ static int provide_probe(dtrace_hdl_t *dtp, const pid_probespec_t *psp,
/* Mark the provider as a PID-based provider. */
pvp->pv_flags |= DT_PROVIDER_PID;
+ /* Look up or create the underlying probe. */
uprp = create_underlying(dtp, psp);
if (uprp == NULL)
return -1;
@@ -270,6 +278,17 @@ static int provide_probe(dtrace_hdl_t *dtp, const pid_probespec_t *psp,
upp = uprp->prv_data;
upp->flags |= flags;
+ /* Add to this uprobe an index for the next clause to be assigned. */
+ cl = dt_list_prev(&upp->clauses);
+ if (cl == NULL || cl->clause != dtp->dt_clause_nextid) {
+ cl = dt_zalloc(dtp, sizeof(list_clause_t));
+ if (cl == NULL)
+ return -1;
+ cl->clause = dtp->dt_clause_nextid;
+ dt_list_append(&upp->clauses, cl);
+ }
+
+ /* Look up the overlying probe. */
prp = dt_probe_lookup(dtp, &pd);
if (prp != NULL) {
/*
--
2.18.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 07/14] Create the BPF uprobes map
2024-06-04 18:10 "proof of concept" for systemwide USDT eugene.loh
` (5 preceding siblings ...)
2024-06-04 18:11 ` [PATCH 06/14] Add clauses to per-uprobe list eugene.loh
@ 2024-06-04 18:11 ` eugene.loh
2024-06-05 4:33 ` [DTrace-devel] " Kris Van Hees
2024-06-04 18:11 ` [PATCH 08/14] Use uprobes map to call clauses conditionally eugene.loh
` (6 subsequent siblings)
13 siblings, 1 reply; 31+ messages in thread
From: eugene.loh @ 2024-06-04 18:11 UTC (permalink / raw)
To: dtrace, dtrace-devel
From: Eugene Loh <eugene.loh@oracle.com>
WIP, sizing is hardwired for now.
WIP, should add new pids and purge old ones (after the initial call).
Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
---
libdtrace/dt_bpf.c | 48 +++++++++++++++++++++++++++++++
libdtrace/dt_dlibs.c | 1 +
libdtrace/dt_impl.h | 1 +
libdtrace/dt_prov_uprobe.c | 59 ++++++++++++++++++++++++++++++++++++++
4 files changed, 109 insertions(+)
diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c
index 71c6a446..3aeae274 100644
--- a/libdtrace/dt_bpf.c
+++ b/libdtrace/dt_bpf.c
@@ -946,6 +946,53 @@ gmap_create_probes(dtrace_hdl_t *dtp)
return 0;
}
+/* also defined in dt_prov_uprobe.c */
+typedef struct uprobe_map_key {
+ pid_t pid;
+ dtrace_id_t uprid;
+} uprobe_map_key_t;
+typedef struct uprobe_map_val {
+ dtrace_id_t prid;
+ long long mask;
+} uprobe_map_val_t;
+
+/*
+ * Create the 'uprobes' BPF map.
+ *
+ * Uprobe information map. This is a global hash map for use
+ * with USDT and pid probes). It is indexed by (pid, probe ID),
+ * using the probe ID of the underlying probe. The value is a
+ * probe ID for the overlying probe and a bit mask indicating
+ * which clauses to execute for this pid.
+ *
+ * WIP. Just make up some sizes for now.
+ *
+ * How big is a probe ID? Sometimes, it's a dtrace_id_t.
+ * And uts/common/sys/dtrace_types.h gives us "typedef uint32_t dtrace_id_t".
+ * Meanwhile, libdtrace/dt_impl.h has "uint32_t dt_probe_id".
+ * So either uint32_t or dtrace_id_t is fine.
+ *
+ * How big should our bit mask be? Start with 8*sizeof(long long) bits.
+ *
+ * How many entries should we allow? Start with 1000.
+ *
+ * FIXME: is there any way two different overlying probes (differing by more
+ * than just pid) could map to the same underlying probe?
+ */
+static int
+gmap_create_uprobes(dtrace_hdl_t *dtp)
+{
+ dtp->dt_uprobesmap_fd = create_gmap(dtp, "uprobes", BPF_MAP_TYPE_HASH,
+ sizeof(uprobe_map_key_t), sizeof(uprobe_map_val_t), 1000);
+ if (dtp->dt_uprobesmap_fd == -1)
+ return -1;
+
+ /* Populate the newly created map. */
+ dt_uprobe.update(dtp, NULL);
+
+ return 0;
+}
+
/*
* Create the 'gvars' BPF map.
*
@@ -1051,6 +1098,7 @@ dt_bpf_gmap_create(dtrace_hdl_t *dtp)
CREATE_MAP(scratchmem)
CREATE_MAP(strtab)
CREATE_MAP(probes)
+ CREATE_MAP(uprobes)
CREATE_MAP(gvars)
CREATE_MAP(lvars)
CREATE_MAP(dvars)
diff --git a/libdtrace/dt_dlibs.c b/libdtrace/dt_dlibs.c
index bc883e11..7bbeb02c 100644
--- a/libdtrace/dt_dlibs.c
+++ b/libdtrace/dt_dlibs.c
@@ -66,6 +66,7 @@ static const dt_ident_t dt_bpf_symbols[] = {
DT_BPF_SYMBOL(lvars, DT_IDENT_PTR),
DT_BPF_SYMBOL(mem, DT_IDENT_PTR),
DT_BPF_SYMBOL(probes, DT_IDENT_PTR),
+ DT_BPF_SYMBOL(uprobes, DT_IDENT_PTR),
DT_BPF_SYMBOL(scratchmem, DT_IDENT_PTR),
DT_BPF_SYMBOL(specs, DT_IDENT_PTR),
DT_BPF_SYMBOL(state, DT_IDENT_PTR),
diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
index 1bf79d80..935b96f4 100644
--- a/libdtrace/dt_impl.h
+++ b/libdtrace/dt_impl.h
@@ -438,6 +438,7 @@ struct dtrace_hdl {
int dt_aggmap_fd; /* file descriptor for the 'aggs' BPF map */
int dt_genmap_fd; /* file descriptor for the 'agggen' BPF map */
int dt_cpumap_fd; /* file descriptor for the 'cpuinfo' BPF map */
+ int dt_uprobesmap_fd; /* file descriptor for the 'uprobes' BPF map */
dtrace_handle_err_f *dt_errhdlr; /* error handler, if any */
void *dt_errarg; /* error handler argument */
dtrace_handle_drop_f *dt_drophdlr; /* drop handler, if any */
diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c
index 34906aa0..591f2fab 100644
--- a/libdtrace/dt_prov_uprobe.c
+++ b/libdtrace/dt_prov_uprobe.c
@@ -45,6 +45,7 @@
#include "dt_probe.h"
#include "dt_pid.h"
#include "dt_string.h"
+#include "port.h"
/* Provider name for the underlying probes. */
static const char prvname[] = "uprobe";
@@ -70,6 +71,15 @@ typedef struct list_probe {
dt_probe_t *probe;
} list_probe_t;
+/* uprobe_map_[key|val]_t are also defined in dt_bpf.c */
+typedef struct uprobe_map_key {
+ pid_t pid;
+ dtrace_id_t uprid;
+} uprobe_map_key_t;
+typedef struct uprobe_map_val {
+ dtrace_id_t prid;
+ long long mask;
+} uprobe_map_val_t;
typedef struct list_clause {
dt_list_t list;
uint_t clause;
@@ -135,6 +145,54 @@ static void probe_destroy(dtrace_hdl_t *dtp, void *datap)
free_probe_list(dtp, datap);
}
+/*
+ * Update the uprobe provider.
+ */
+static void update_uprobe(dtrace_hdl_t *dtp, void *datap)
+{
+ dt_probe_t *prp;
+
+ for (prp = dt_list_next(&dtp->dt_enablings); prp != NULL;
+ prp = dt_list_next(prp)) {
+ pid_t pid;
+ const list_probe_t *pup;
+ dt_probe_t *uprp;
+ long long mask = 0, bit = 1;
+ list_clause_t *cl;
+ uprobe_map_key_t key;
+ uprobe_map_val_t val;
+
+ /* Make sure it is an overlying pid or USDT probe. */
+ if (prp->prov->impl != &dt_pid && prp->prov->impl != &dt_usdt)
+ continue;
+
+ /* FIXME passing in NULL pcb and dpr wreaks havoc on error reporting? */
+ pid = dt_pid_get_pid(prp->desc, dtp, NULL, NULL);
+
+ for (pup = prp->prv_data; pup != NULL; pup = dt_list_next(pup)) {
+ dt_uprobe_t *upp;
+
+ uprp = pup->probe;
+ upp = uprp->prv_data;
+ for (cl = dt_list_next(&upp->clauses); cl != NULL; cl = dt_list_next(cl)) {
+ if (gmatch(prp->desc->prv, dtp->dt_uprovdescs[cl->clause]))
+ mask |= bit;
+
+ bit <<= 1;
+ }
+
+ key.pid = pid;
+ key.uprid = uprp->desc->id;
+
+ val.prid = prp->desc->id;
+ val.mask = mask;
+
+ // FIXME check return value
+ dt_bpf_map_update(dtp->dt_uprobesmap_fd, &key, &val);
+ }
+ }
+}
+
/*
* Look up or create an underlying (real) probe, corresponding directly to a
@@ -803,6 +861,7 @@ dt_provimpl_t dt_uprobe = {
.probe_info = &probe_info,
.detach = &detach,
.probe_destroy = &probe_destroy_underlying,
+ .update = &update_uprobe,
};
/*
--
2.18.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 08/14] Use uprobes map to call clauses conditionally
2024-06-04 18:10 "proof of concept" for systemwide USDT eugene.loh
` (6 preceding siblings ...)
2024-06-04 18:11 ` [PATCH 07/14] Create the BPF uprobes map eugene.loh
@ 2024-06-04 18:11 ` eugene.loh
2024-06-04 18:11 ` [PATCH 09/14] Systemwide USDT WIP eugene.loh
` (5 subsequent siblings)
13 siblings, 0 replies; 31+ messages in thread
From: eugene.loh @ 2024-06-04 18:11 UTC (permalink / raw)
To: dtrace, dtrace-devel
From: Eugene Loh <eugene.loh@oracle.com>
Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
---
libdtrace/dt_prov_uprobe.c | 88 +++++++++++++++++++++++++++++++++++++-
1 file changed, 87 insertions(+), 1 deletion(-)
diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c
index 591f2fab..cb79e0a3 100644
--- a/libdtrace/dt_prov_uprobe.c
+++ b/libdtrace/dt_prov_uprobe.c
@@ -498,7 +498,7 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
dt_irlist_t *dlp = &pcb->pcb_ir;
const dt_probe_t *uprp = pcb->pcb_probe;
const dt_uprobe_t *upp = uprp->prv_data;
- const list_probe_t *pop;
+// const list_probe_t *pop;
uint_t lbl_exit = pcb->pcb_exitlbl;
dt_cg_tramp_prologue(pcb);
@@ -522,6 +522,7 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
emit(dlp, BPF_CALL_HELPER(BPF_FUNC_get_current_pid_tgid));
emit(dlp, BPF_ALU64_IMM(BPF_RSH, BPF_REG_0, 32));
+#if 0
/*
* Generate a composite conditional clause:
*
@@ -565,6 +566,91 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
emitl(dlp, lbl_next,
BPF_NOP());
}
+#else
+ dt_ident_t *uprobes = dt_dlib_get_map(pcb->pcb_hdl, "uprobes");
+ list_clause_t *cl;
+
+ assert(uprobes != NULL);
+
+ /*
+ * Look up in the BPF uprobes map. Space for the look-up key will be used
+ * on the BPF stack at %r9-sizeof(uprobe_map_key_t). The key comprises the
+ * pid (in %r0) and the underlying-probe prid.
+ */
+ emit(dlp, BPF_STORE(BPF_W, BPF_REG_9, (int)(-sizeof(uprobe_map_key_t)), BPF_REG_0));
+ emit(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_9, (int)(-sizeof(dtrace_id_t)) /* or -sizeof(uprobe_map_key_t) + sizeof(pid_t) */, uprp->desc->id));
+ dt_cg_xsetx(dlp, uprobes, DT_LBL_NONE, BPF_REG_1, uprobes->di_id);
+ emit(dlp, BPF_MOV_REG(BPF_REG_2, BPF_REG_9));
+ emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, (int)(-sizeof(uprobe_map_key_t))));
+ emit(dlp, BPF_CALL_HELPER(BPF_FUNC_map_lookup_elem));
+ emit(dlp, BPF_BRANCH_IMM(BPF_JEQ, BPF_REG_0, 0, lbl_exit));
+
+ /* Read the PRID from the table lookup and store to mst->prid. */
+ emit(dlp, BPF_LOAD(BPF_W, BPF_REG_1, BPF_REG_0, 0));
+ emit(dlp, BPF_STORE(BPF_W, BPF_REG_7, DMST_PRID, BPF_REG_1));
+
+ /* Read the bit mask from the table lookup in %r6. */ // FIXME someday, extend this past 64 bits
+ emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_6, BPF_REG_0, offsetof(uprobe_map_val_t, mask)));
+
+ /*
+ * Now loop over clauses.
+ *
+ * How do we know which clauses are called by this underlying probe?
+ * Each dt_probe_t has a "clauses" list, but that's built by
+ *
+ * dt_program.c
+ * dtrace_program_exec()
+ * dtrace_stmt_iter(..., dt_prog_stmt, ...);
+ * dt_prog_stmt()
+ * dt_probe_iter(..., dt_stmt_probe, ...);
+ * dt_stmt_probe()
+ * dt_probe_add_clause(...);
+ *
+ * which is to say that the "clauses" list will be built for overlying
+ * probes, while we are interested here in underlying probes. Should
+ * dt_probe_add_clause() also take some provider-specific action?
+ *
+ * For now, let's just look up the clause using what we know about
+ * the clause numbering.
+ */
+ /*
+ * Hold the bit mask in %r6.
+ */
+ for (cl = dt_list_next(&upp->clauses); cl != NULL; cl = dt_list_next(cl)) {
+ char nm[32]; // FIXME hardwired size
+ dt_ident_t *idp;
+ uint_t lbl_next = dt_irlist_label(dlp);
+
+ sprintf(nm, "dt_clause_%d", cl->clause);
+ idp = dt_idhash_lookup(pcb->pcb_hdl->dt_bpfsyms, nm); // FIXME what if idp is NULL?
+
+ /* Test the lowest bit. */
+ emit(dlp, BPF_MOV_REG(BPF_REG_1, BPF_REG_6));
+ emit(dlp, BPF_ALU64_IMM(BPF_AND, BPF_REG_1, 1));
+ emit(dlp, BPF_BRANCH_IMM(BPF_JEQ, BPF_REG_1, 0, lbl_next));
+
+ /*
+ * if (*dctx.act != act) // ldw %r0, [%r9 + DCTX_ACT]
+ * goto exit; // ldw %r0, [%r0 + 0]
+ * // jne %r0, act, lbl_exit
+ */
+ emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_9, DCTX_ACT));
+ emit(dlp, BPF_LOAD(BPF_W, BPF_REG_0, BPF_REG_0, 0));
+ emit(dlp, BPF_BRANCH_IMM(BPF_JNE, BPF_REG_0, DT_ACTIVITY_ACTIVE, lbl_exit));
+
+ /* dctx.mst->scratch_top = 8 */
+ emit(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_7, DMST_SCRATCH_TOP, 8));
+
+ /* Call clause. */
+ emit(dlp, BPF_MOV_REG(BPF_REG_1, BPF_REG_9));
+ emite(dlp, BPF_CALL_FUNC(idp->di_id), idp);
+
+ /* Finished this clause. */
+ emitl(dlp, lbl_next,
+ BPF_NOP());
+ emit(dlp, BPF_ALU64_IMM(BPF_RSH, BPF_REG_6, 1));
+ }
+#endif
dt_cg_tramp_return(pcb);
--
2.18.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 09/14] Systemwide USDT WIP
2024-06-04 18:10 "proof of concept" for systemwide USDT eugene.loh
` (7 preceding siblings ...)
2024-06-04 18:11 ` [PATCH 08/14] Use uprobes map to call clauses conditionally eugene.loh
@ 2024-06-04 18:11 ` eugene.loh
2024-06-04 18:11 ` [PATCH 10/14] Fix the consumer's picture of the EPID eugene.loh
` (4 subsequent siblings)
13 siblings, 0 replies; 31+ messages in thread
From: eugene.loh @ 2024-06-04 18:11 UTC (permalink / raw)
To: dtrace, dtrace-devel
From: Eugene Loh <eugene.loh@oracle.com>
need to:
- review tests
- add process monitoring
- deal with FIXMEs
Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
---
libdtrace/dt_cc.c | 3 +-
libdtrace/dt_pid.c | 136 +++++++++++++++---
libdtrace/dt_prov_uprobe.c | 1 +
.../dtrace-util/tst.ListProbesNameUSDT.sh | 1 -
.../dtrace-util/tst.ListProbesProviderUSDT.sh | 1 -
5 files changed, 122 insertions(+), 20 deletions(-)
diff --git a/libdtrace/dt_cc.c b/libdtrace/dt_cc.c
index 6bff7e0f..eddf8252 100644
--- a/libdtrace/dt_cc.c
+++ b/libdtrace/dt_cc.c
@@ -279,7 +279,8 @@ dt_setcontext(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp)
* and tag -- we just have to longjmp() out of here.
*/
if (pdp->prv && pdp->prv[0] &&
- isdigit(pdp->prv[strlen(pdp->prv) - 1]) &&
+ (isdigit(pdp->prv[strlen(pdp->prv) - 1]) ||
+ pdp->prv[strlen(pdp->prv) - 1] == '*') &&
((pvp = dt_provider_lookup(dtp, pdp->prv)) == NULL ||
pvp->pv_flags & DT_PROVIDER_PID) &&
dt_pid_create_probes((dtrace_probedesc_t *)pdp, dtp, yypcb) != 0) {
diff --git a/libdtrace/dt_pid.c b/libdtrace/dt_pid.c
index 93a7ce76..e07a3a4b 100644
--- a/libdtrace/dt_pid.c
+++ b/libdtrace/dt_pid.c
@@ -780,8 +780,8 @@ validate_dof_record(const char *path, const dof_parsed_t *parsed,
/*
* Create underlying probes relating to the probespec passed on input.
*
- * If dpr is set, just set up probes relating to mappings found in that one
- * process. (dpr must in this case be locked.)
+ * dpr must be set and locked. Just set up probes relating to mappings found
+ * in this one process.
*
* Return 0 on success or -1 on error. (Failure to create specific underlying
* probes is not an error.)
@@ -795,9 +795,6 @@ dt_pid_create_usdt_probes(dtrace_hdl_t *dtp, dt_proc_t *dpr, dtrace_probedesc_t
char *probepath = NULL;
glob_t probeglob = {0};
- /*
- * Systemwide probing: not yet implemented.
- */
assert(dpr != NULL && dpr->dpr_proc);
assert(MUTEX_HELD(&dpr->dpr_lock));
@@ -1104,22 +1101,96 @@ dt_pid_get_pid(const dtrace_probedesc_t *pdp, dtrace_hdl_t *dtp, dt_pcb_t *pcb,
return pid;
}
+static char *
+convert_pidglob(char *s) // FIXME expect const char *?
+{
+ /*
+ * Convert a glob for "pid[1-9][0-9]*" into one for just the numerical
+ * portion. It will always be a subset of the input string. A NULL
+ * pointer is returned if there is no such string -- e.g., if the input
+ * string is "foo1234".
+ *
+ * There is no need to check the entire string for legality. E.g., if
+ * the input string is "pid*q*", we can simply return "*q*". Then, this
+ * string will not match any "[1-9][0-9]*"; we do not need to intervene.
+ */
+ char *p;
+ int nchars = 0; /* number of chars of "pid" we have seen so far */
+
+ for (p = s; ; p++) {
+ switch (*p) {
+// FIXME: I think this passes stuff like "pd1234", "id1234" etc. So make sure 'p' 'i' and 'd' are hit or else there is a '*' to jump over them.
+ case 'p':
+ if (nchars > 0)
+ return NULL;
+ nchars = 1;
+ break;
+ case 'i':
+ if (nchars > 1)
+ return NULL;
+ nchars = 2;
+ break;
+ case 'd':
+ if (nchars > 2)
+ return NULL;
+ nchars = 3;
+ break;
+ case '*':
+ break;
+ case '\0':
+ if (p == s)
+ return NULL;
+ if (*(p - 1) != '*')
+ return p;
+ return p - 1;
+ default:
+ if (*p < '0' || *p > '9')
+ return NULL;
+ if (p == s)
+ return NULL;
+ if (*(p - 1) != '*')
+ return p;
+ return p - 1;
+ }
+ }
+}
+
int
dt_pid_create_probes(dtrace_probedesc_t *pdp, dtrace_hdl_t *dtp, dt_pcb_t *pcb)
{
- char provname[DTRACE_PROVNAMELEN];
dt_proc_t *dpr;
pid_t pid;
- int err = 0;
+ int err = 0, i, nmatches = 0;
+ glob_t globbuf;
+ char globpat[256]; // FIXME hardwired size
assert(pcb != NULL);
- if ((pid = dt_pid_get_pid(pdp, dtp, pcb, NULL)) == -1)
+ /* Exclude pid0 from being specifically requested. */
+ if (strcmp(pdp->prv, "pid0") == 0) {
+ dt_pid_error(dtp, pcb, NULL, D_PROC_BADPID,
+ "pid0 does not contain a valid pid");
return -1;
+ }
+
+ /*
+ * Try pid probes.
+ */
+ sprintf(globpat, "/proc/%s/map_files", convert_pidglob(pdp->prv));
+ nmatches = glob(globpat, 0, NULL, &globbuf) ? 0 : globbuf.gl_pathc;
+ for (i = 0; i < nmatches; i++) {
+ if (strncmp(globbuf.gl_pathv[i], "/proc/self", 10) == 0)
+ continue;
- snprintf(provname, sizeof(provname), "pid%d", (int)pid);
+ pid = strtol(globbuf.gl_pathv[i] + strlen("/proc/"), NULL, 10); // FIXME need error checking or anything?
+
+#if 1
+ // FIXME: omit this check?
+ char provname[DTRACE_PROVNAMELEN];
+ snprintf(provname, sizeof(provname), "pid%d", (int)pid);
+ assert(gmatch(provname, pdp->prv) != 0);
+#endif
- if (gmatch(provname, pdp->prv) != 0) {
if (dt_proc_grab_lock(dtp, pid, DTRACE_PROC_WAITING) < 0) {
dt_pid_error(dtp, pcb, NULL, D_PROC_GRAB,
"failed to grab process %d", (int)pid);
@@ -1129,14 +1200,45 @@ dt_pid_create_probes(dtrace_probedesc_t *pdp, dtrace_hdl_t *dtp, dt_pcb_t *pcb)
dpr = dt_proc_lookup(dtp, pid);
assert(dpr != NULL);
- err = dt_pid_create_pid_probes(pdp, dtp, pcb, dpr);
+ // FIXME handle err better
+ err = dt_pid_create_pid_probes(pdp, dtp, pcb, dpr); // FIXME rename to dt_pid_create_pid_probe()?
dt_proc_release_unlock(dtp, pid);
}
+ globfree(&globbuf);
/*
- * If it's not strictly a pid provider, we might match a USDT provider.
+ * Try USDT probes.
*/
- if (strcmp(provname, pdp->prv) != 0) {
+ sprintf(globpat, "%s/probes/*/%s/%s/%s/%s", dtp->dt_dofstash_path,
+ pdp->prv,
+ pdp->mod[0] == '\0' ? "*" : pdp->mod,
+ pdp->fun[0] == '\0' ? "*" : pdp->fun,
+ pdp->prb[0] == '\0' ? "*" : pdp->prb);
+ nmatches = glob(globpat, 0, NULL, &globbuf) ? 0 : globbuf.gl_pathc;
+
+ for (i = 0; i < nmatches; i++) {
+ char *s = strdup(globbuf.gl_pathv[i] + strlen(dtp->dt_dofstash_path) + strlen("/probes/"));
+ // FIXME: do we really need to strdup/free this, or can modify in-place?
+ char *p = strchr(s, '/'), *q;
+ char nullchar = '\0';
+ dtrace_probedesc_t pdptmp;
+
+ *p = '\0';
+ p++;
+ q = strchr(p, '/');
+ *q = '\0'; q++;
+// FIXME should also check dof_version, though that doesn't do much yet
+ pid = atoll(s);
+
+ // Check, since dtprobed takes a while to clean up dead processes.
+ if (!Pexists(pid))
+ continue;
+
+ pdptmp.prv = p;
+ pdptmp.mod = &nullchar; pdptmp.mod = q; q = strchr(q, '/'); *q = '\0'; q++;
+ pdptmp.fun = &nullchar; pdptmp.fun = q; q = strchr(q, '/'); *q = '\0'; q++;
+ pdptmp.prb = &nullchar; pdptmp.prb = q;
+
if (dt_proc_grab_lock(dtp, pid, DTRACE_PROC_WAITING |
DTRACE_PROC_SHORTLIVED) < 0) {
dt_pid_error(dtp, pcb, NULL, D_PROC_GRAB,
@@ -1147,17 +1249,17 @@ dt_pid_create_probes(dtrace_probedesc_t *pdp, dtrace_hdl_t *dtp, dt_pcb_t *pcb)
dpr = dt_proc_lookup(dtp, pid);
assert(dpr != NULL);
- err = dt_pid_create_usdt_probes(dtp, dpr, pdp, pcb);
+ err = dt_pid_create_usdt_probes(dtp, dpr, &pdptmp, pcb); // FIXME: we care about not just the last err
/*
* Put the module name in its canonical form.
*/
- dt_pid_fix_mod(NULL, pdp, dtp, dpr->dpr_pid);
+ dt_pid_fix_mod(NULL, &pdptmp, dtp, dpr->dpr_pid);
dt_proc_release_unlock(dtp, pid);
+ free(s);
}
-
- /* (USDT systemwide probing goes here.) */
+ globfree(&globbuf);
return err ? -1 : 0;
}
diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c
index cb79e0a3..618219b8 100644
--- a/libdtrace/dt_prov_uprobe.c
+++ b/libdtrace/dt_prov_uprobe.c
@@ -766,6 +766,7 @@ static int trampoline_is_enabled(dt_pcb_t *pcb, uint_t exitlbl)
emit(dlp, BPF_BRANCH_IMM(BPF_JEQ, BPF_REG_0, pid, lbl_assign));
}
emit(dlp, BPF_JUMP(lbl_exit));
+
copyout_val(pcb, lbl_assign, 1, 0);
dt_cg_tramp_return(pcb);
diff --git a/test/unittest/dtrace-util/tst.ListProbesNameUSDT.sh b/test/unittest/dtrace-util/tst.ListProbesNameUSDT.sh
index c5dfc72b..9d8f06e9 100755
--- a/test/unittest/dtrace-util/tst.ListProbesNameUSDT.sh
+++ b/test/unittest/dtrace-util/tst.ListProbesNameUSDT.sh
@@ -5,7 +5,6 @@
# Licensed under the Universal Permissive License v 1.0 as shown at
# http://oss.oracle.com/licenses/upl.
#
-# @@xfail: dtv2
##
#
diff --git a/test/unittest/dtrace-util/tst.ListProbesProviderUSDT.sh b/test/unittest/dtrace-util/tst.ListProbesProviderUSDT.sh
index 644da2ac..6eae82ed 100755
--- a/test/unittest/dtrace-util/tst.ListProbesProviderUSDT.sh
+++ b/test/unittest/dtrace-util/tst.ListProbesProviderUSDT.sh
@@ -5,7 +5,6 @@
# Licensed under the Universal Permissive License v 1.0 as shown at
# http://oss.oracle.com/licenses/upl.
#
-# @@xfail: dtv2
##
#
--
2.18.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 10/14] Fix the consumer's picture of the EPID
2024-06-04 18:10 "proof of concept" for systemwide USDT eugene.loh
` (8 preceding siblings ...)
2024-06-04 18:11 ` [PATCH 09/14] Systemwide USDT WIP eugene.loh
@ 2024-06-04 18:11 ` eugene.loh
2024-06-04 18:11 ` [PATCH 11/14] Back out the previous patch eugene.loh
` (3 subsequent siblings)
13 siblings, 0 replies; 31+ messages in thread
From: eugene.loh @ 2024-06-04 18:11 UTC (permalink / raw)
To: dtrace, dtrace-devel
From: Eugene Loh <eugene.loh@oracle.com>
THIS PATCH IS BACKED OUT BY THE PATCH THAT IMMEDIATELY SUCCEEDS IT!
Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
---
libdtrace/dt_prov_uprobe.c | 80 +++++++++++++++++++++++++++++++++++++-
1 file changed, 79 insertions(+), 1 deletion(-)
diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c
index 618219b8..83acb7a4 100644
--- a/libdtrace/dt_prov_uprobe.c
+++ b/libdtrace/dt_prov_uprobe.c
@@ -498,7 +498,7 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
dt_irlist_t *dlp = &pcb->pcb_ir;
const dt_probe_t *uprp = pcb->pcb_probe;
const dt_uprobe_t *upp = uprp->prv_data;
-// const list_probe_t *pop;
+ const list_probe_t *pop;
uint_t lbl_exit = pcb->pcb_exitlbl;
dt_cg_tramp_prologue(pcb);
@@ -585,6 +585,84 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
emit(dlp, BPF_CALL_HELPER(BPF_FUNC_map_lookup_elem));
emit(dlp, BPF_BRANCH_IMM(BPF_JEQ, BPF_REG_0, 0, lbl_exit));
+#if 1
+/*
+ * Here is a big hack. I hope I understand and explain it correctly.
+ *
+ * DTrace's underlying model is that each EPID represents a combination
+ * of PRID and clause. EPIDs are enumerated so that, for a given EPID,
+ * the consumer can look up which PRID to use. A clause sets its EPID
+ * to some placeholder value.
+ *
+ * During dt_bpf_load_progs(), each probe's BPF program is loaded into
+ * the kernel. Each clause it calls is likewise loaded. Each time a
+ * clause is loaded for a probe, a new EPID is assigned. Clause code
+ * undergoes relocation, and the placeholder EPID value is replaced with
+ * the newly assigned EPID value.
+ *
+ * What's tricky is that the BPF code and the consumer get their PRID
+ * values from different places. While producer might send a PRID value
+ * to the consumer -- e.g., with "trace(id)" -- usually what happens is:
+ *
+ * - BPF code has PRID set in the trampoline, to a value that
+ * is set during relocation
+ *
+ * - the consumer sees the EPID for a record and looks up the
+ * associated PRID, that association made during relocation
+ *
+ * Specifically, during dt_link_construct(), when a new
+ * clause is added, the PRID is associated with the new
+ * EPID through a dt_epid_add() call.
+ *
+ * Further, if an underlying probe trampoline needs to call
+ * clauses for an overlying probe, then dt_link_construct()
+ * watches -- for idp->di_kind==DT_IDENT_SCALAR and
+ * idp->di_id default -- for any overlying probe specification,
+ * switching the prid to the new value.
+ *
+ * We are now changing our USDT support. The PRID is longer set in the
+ * trampoline to a value known at relocation. Rather, the trampoline
+ * now looks a PRID up at run time in a BPF map. The side effect of
+ * associating a PRID with an EPID during relocation is lost. An EPID
+ * could be associated with multiple PRIDs.
+ *
+ * A variety of solutions are possible.
+ *
+ * For example, the BPF map could be keyed on pid, clause, and uprobe.
+ * The output value could be a PRID and EPID for the clause. Clearly,
+ * this would take more BPF map space and more lookups than the current
+ * bitmap approach.
+ *
+ * Here, a different approach is chosen. The PRID that the consumer
+ * associates with an EPID does not need to be exact. A "representative"
+ * PRID suffices. Usually, the consumer only cares about the function
+ * and name components of the probe. So, set the mst->prid in the trampoline
+ * to any representative PRID. Then, dt_link_construct() will continue
+ * to have the same side effect, associating the PRID with the EPID.
+ *
+ * Another option is to write the PRID alongside the EPID in the output
+ * buffer.
+ *
+ * Work in progress.
+ */
+ /*
+ * Store a representative PRID to mst->prid. It will immediately
+ * be overwritten by the correct PRID, but we store it anyhow for
+ * the side effect it has of making a representative PRID known to
+ * the consumer.
+ */
+ pop = dt_list_next(&upp->probes);
+ if (pop != NULL) {
+ const dt_probe_t *pprp = pop->probe;
+ dt_ident_t *idp;
+
+ idp = dt_dlib_add_probe_var(pcb->pcb_hdl, pprp);
+ assert(idp != NULL);
+
+ emite(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_7, DMST_PRID, pprp->desc->id), idp);
+ }
+#endif
+
/* Read the PRID from the table lookup and store to mst->prid. */
emit(dlp, BPF_LOAD(BPF_W, BPF_REG_1, BPF_REG_0, 0));
emit(dlp, BPF_STORE(BPF_W, BPF_REG_7, DMST_PRID, BPF_REG_1));
--
2.18.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 11/14] Back out the previous patch
2024-06-04 18:10 "proof of concept" for systemwide USDT eugene.loh
` (9 preceding siblings ...)
2024-06-04 18:11 ` [PATCH 10/14] Fix the consumer's picture of the EPID eugene.loh
@ 2024-06-04 18:11 ` eugene.loh
2024-06-04 18:11 ` [PATCH 12/14] Fix comments that hardwire DBUF_ offsets eugene.loh
` (2 subsequent siblings)
13 siblings, 0 replies; 31+ messages in thread
From: eugene.loh @ 2024-06-04 18:11 UTC (permalink / raw)
To: dtrace, dtrace-devel
From: Eugene Loh <eugene.loh@oracle.com>
Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
---
libdtrace/dt_prov_uprobe.c | 80 +-------------------------------------
1 file changed, 1 insertion(+), 79 deletions(-)
diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c
index 83acb7a4..618219b8 100644
--- a/libdtrace/dt_prov_uprobe.c
+++ b/libdtrace/dt_prov_uprobe.c
@@ -498,7 +498,7 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
dt_irlist_t *dlp = &pcb->pcb_ir;
const dt_probe_t *uprp = pcb->pcb_probe;
const dt_uprobe_t *upp = uprp->prv_data;
- const list_probe_t *pop;
+// const list_probe_t *pop;
uint_t lbl_exit = pcb->pcb_exitlbl;
dt_cg_tramp_prologue(pcb);
@@ -585,84 +585,6 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
emit(dlp, BPF_CALL_HELPER(BPF_FUNC_map_lookup_elem));
emit(dlp, BPF_BRANCH_IMM(BPF_JEQ, BPF_REG_0, 0, lbl_exit));
-#if 1
-/*
- * Here is a big hack. I hope I understand and explain it correctly.
- *
- * DTrace's underlying model is that each EPID represents a combination
- * of PRID and clause. EPIDs are enumerated so that, for a given EPID,
- * the consumer can look up which PRID to use. A clause sets its EPID
- * to some placeholder value.
- *
- * During dt_bpf_load_progs(), each probe's BPF program is loaded into
- * the kernel. Each clause it calls is likewise loaded. Each time a
- * clause is loaded for a probe, a new EPID is assigned. Clause code
- * undergoes relocation, and the placeholder EPID value is replaced with
- * the newly assigned EPID value.
- *
- * What's tricky is that the BPF code and the consumer get their PRID
- * values from different places. While producer might send a PRID value
- * to the consumer -- e.g., with "trace(id)" -- usually what happens is:
- *
- * - BPF code has PRID set in the trampoline, to a value that
- * is set during relocation
- *
- * - the consumer sees the EPID for a record and looks up the
- * associated PRID, that association made during relocation
- *
- * Specifically, during dt_link_construct(), when a new
- * clause is added, the PRID is associated with the new
- * EPID through a dt_epid_add() call.
- *
- * Further, if an underlying probe trampoline needs to call
- * clauses for an overlying probe, then dt_link_construct()
- * watches -- for idp->di_kind==DT_IDENT_SCALAR and
- * idp->di_id default -- for any overlying probe specification,
- * switching the prid to the new value.
- *
- * We are now changing our USDT support. The PRID is longer set in the
- * trampoline to a value known at relocation. Rather, the trampoline
- * now looks a PRID up at run time in a BPF map. The side effect of
- * associating a PRID with an EPID during relocation is lost. An EPID
- * could be associated with multiple PRIDs.
- *
- * A variety of solutions are possible.
- *
- * For example, the BPF map could be keyed on pid, clause, and uprobe.
- * The output value could be a PRID and EPID for the clause. Clearly,
- * this would take more BPF map space and more lookups than the current
- * bitmap approach.
- *
- * Here, a different approach is chosen. The PRID that the consumer
- * associates with an EPID does not need to be exact. A "representative"
- * PRID suffices. Usually, the consumer only cares about the function
- * and name components of the probe. So, set the mst->prid in the trampoline
- * to any representative PRID. Then, dt_link_construct() will continue
- * to have the same side effect, associating the PRID with the EPID.
- *
- * Another option is to write the PRID alongside the EPID in the output
- * buffer.
- *
- * Work in progress.
- */
- /*
- * Store a representative PRID to mst->prid. It will immediately
- * be overwritten by the correct PRID, but we store it anyhow for
- * the side effect it has of making a representative PRID known to
- * the consumer.
- */
- pop = dt_list_next(&upp->probes);
- if (pop != NULL) {
- const dt_probe_t *pprp = pop->probe;
- dt_ident_t *idp;
-
- idp = dt_dlib_add_probe_var(pcb->pcb_hdl, pprp);
- assert(idp != NULL);
-
- emite(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_7, DMST_PRID, pprp->desc->id), idp);
- }
-#endif
-
/* Read the PRID from the table lookup and store to mst->prid. */
emit(dlp, BPF_LOAD(BPF_W, BPF_REG_1, BPF_REG_0, 0));
emit(dlp, BPF_STORE(BPF_W, BPF_REG_7, DMST_PRID, BPF_REG_1));
--
2.18.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 12/14] Fix comments that hardwire DBUF_ offsets
2024-06-04 18:10 "proof of concept" for systemwide USDT eugene.loh
` (10 preceding siblings ...)
2024-06-04 18:11 ` [PATCH 11/14] Back out the previous patch eugene.loh
@ 2024-06-04 18:11 ` eugene.loh
2024-06-04 18:11 ` [PATCH 13/14] Clean up some comments eugene.loh
2024-06-04 18:11 ` [PATCH 14/14] Have the consumer get the PRID from the output buffer eugene.loh
13 siblings, 0 replies; 31+ messages in thread
From: eugene.loh @ 2024-06-04 18:11 UTC (permalink / raw)
To: dtrace, dtrace-devel
From: Eugene Loh <eugene.loh@oracle.com>
Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
---
libdtrace/dt_cg.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
index e166d2d8..0977406a 100644
--- a/libdtrace/dt_cg.c
+++ b/libdtrace/dt_cg.c
@@ -1057,8 +1057,8 @@ dt_cg_tramp_error(dt_pcb_t *pcb)
*
* 1. Store the base pointer to the output data buffer in %r9.
* 2. Initialize the machine state (dctx->mst).
- * 3. Store the epid at [%r9 + 0].
- * 4. Store 0 to indicate no active speculation at [%r9 + 4].
+ * 3. Store the epid at [%r9 + DBUF_EPID].
+ * 4. Store 0 to indicate no active speculation at [%r9 + DBUF_SPECID].
* 5. Evaluate the predicate expression and return if false.
*
* The dt_program() function will always return 0.
--
2.18.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 13/14] Clean up some comments
2024-06-04 18:10 "proof of concept" for systemwide USDT eugene.loh
` (11 preceding siblings ...)
2024-06-04 18:11 ` [PATCH 12/14] Fix comments that hardwire DBUF_ offsets eugene.loh
@ 2024-06-04 18:11 ` eugene.loh
2024-06-04 18:11 ` [PATCH 14/14] Have the consumer get the PRID from the output buffer eugene.loh
13 siblings, 0 replies; 31+ messages in thread
From: eugene.loh @ 2024-06-04 18:11 UTC (permalink / raw)
To: dtrace, dtrace-devel
From: Eugene Loh <eugene.loh@oracle.com>
Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
---
libdtrace/dt_cg.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
index 0977406a..4fd2d359 100644
--- a/libdtrace/dt_cg.c
+++ b/libdtrace/dt_cg.c
@@ -1150,9 +1150,9 @@ dt_cg_prologue(dt_pcb_t *pcb, dt_node_t *pred)
/*
* Generate the function epilogue:
- * 4. Submit the buffer to the perf event output buffer for the current
+ * 1. Submit the buffer to the perf event output buffer for the current
* cpu, if this is a data recording action..
- * 5. Return 0
+ * 2. Return 0
* }
*/
static void
--
2.18.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 14/14] Have the consumer get the PRID from the output buffer
2024-06-04 18:10 "proof of concept" for systemwide USDT eugene.loh
` (12 preceding siblings ...)
2024-06-04 18:11 ` [PATCH 13/14] Clean up some comments eugene.loh
@ 2024-06-04 18:11 ` eugene.loh
13 siblings, 0 replies; 31+ messages in thread
From: eugene.loh @ 2024-06-04 18:11 UTC (permalink / raw)
To: dtrace, dtrace-devel
From: Eugene Loh <eugene.loh@oracle.com>
WIP
Each output record has a EPID associated with it, allowing the consumer
to get both a data description and a PRID (so it can report PRID and
probe function and name). We want to support uprobes that trigger for
multiple PRIDs, however, basically breaking this scheme.
So have the producer write the PRID into the output buffer; it does so
easily by having the clause prologue copy mst->prid to the output buffer.
There was an unused pad in the output buffer anyhow.
Then, the consumer reads the PRID from the output buffer.
If this approach is adopted, further work is required for this patch:
- shift references to the output buffer
so that we are not reading from negative offsets
- get rid of the EPID->PRID mapping (relocation, etc.)
dt_link_construct management of PRID, calling dt_epid_add(), etc.
- clean up the hardwired PRID=3 for the ERROR probe
Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
---
bpf/probe_error.c | 4 ++++
libdtrace/dt_cg.c | 12 +++++++++---
libdtrace/dt_consume.c | 21 ++++++++++++++-------
libdtrace/dt_dctx.h | 3 +++
4 files changed, 30 insertions(+), 10 deletions(-)
diff --git a/bpf/probe_error.c b/bpf/probe_error.c
index c8ddcdfa..df546665 100644
--- a/bpf/probe_error.c
+++ b/bpf/probe_error.c
@@ -26,6 +26,7 @@ noinline void dt_probe_error(const dt_dctx_t *dctx, uint64_t pc, uint64_t fault,
uint64_t illval)
{
dt_mstate_t *mst = dctx->mst;
+int oldprid;
mst->argv[0] = 0;
mst->argv[1] = mst->epid;
@@ -34,7 +35,10 @@ noinline void dt_probe_error(const dt_dctx_t *dctx, uint64_t pc, uint64_t fault,
mst->argv[4] = fault;
mst->argv[5] = illval;
+oldprid = mst->prid;
+mst->prid = 3; // FIXME all we are doing is forcing the PRID to be 3, which is usually what ERROR is; the clause prologues will then use this
dt_error(dctx);
+mst->prid = oldprid;
mst->fault = fault;
}
diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
index 4fd2d359..8ec917be 100644
--- a/libdtrace/dt_cg.c
+++ b/libdtrace/dt_cg.c
@@ -1057,9 +1057,10 @@ dt_cg_tramp_error(dt_pcb_t *pcb)
*
* 1. Store the base pointer to the output data buffer in %r9.
* 2. Initialize the machine state (dctx->mst).
- * 3. Store the epid at [%r9 + DBUF_EPID].
- * 4. Store 0 to indicate no active speculation at [%r9 + DBUF_SPECID].
- * 5. Evaluate the predicate expression and return if false.
+ * 3. Copy dctx->mst->prid to [%r9 + DBUF_PRID].
+ * 4. Store the epid at [%r9 + DBUF_EPID].
+ * 5. Store 0 to indicate no active speculation at [%r9 + DBUF_SPECID].
+ * 6. Evaluate the predicate expression and return if false.
*
* The dt_program() function will always return 0.
*/
@@ -1105,6 +1106,9 @@ dt_cg_prologue(dt_pcb_t *pcb, dt_node_t *pred)
* dctx->mst->specsize = 0;// stdw [%r0 + DMST_SPECSIZE], 0
* dctx->mst->epid = EPID; // stw [%r0 + DMST_EPID], EPID
* dctx->mst->clid = CLID; // stw [%r0 + DMST_CLID], CLID
+ * *((uint32_t *)&buf[DBUF_PRID]) = dctx->mst->prid;
+ * // ld %r1, [%r0 + DMST_PRID]
+ * // stw [%r9 + DBUF_PRID], %r1
* *((uint32_t *)&buf[DBUF_EPID]) = EPID;
* // stw [%r9 + DBUF_EPID], EPID
*/
@@ -1114,6 +1118,8 @@ dt_cg_prologue(dt_pcb_t *pcb, dt_node_t *pred)
emit(dlp, BPF_STORE_IMM(BPF_DW, BPF_REG_0, DMST_SPECSIZE, 0));
emite(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_0, DMST_EPID, -1), epid);
emite(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_0, DMST_CLID, -1), clid);
+ emit(dlp, BPF_LOAD(BPF_W, BPF_REG_1, BPF_REG_0, DMST_PRID));
+ emit(dlp, BPF_STORE(BPF_W, BPF_REG_9, DBUF_PRID, BPF_REG_1));
emite(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_9, DBUF_EPID, -1), epid);
/*
diff --git a/libdtrace/dt_consume.c b/libdtrace/dt_consume.c
index dec2314b..21ba7ce7 100644
--- a/libdtrace/dt_consume.c
+++ b/libdtrace/dt_consume.c
@@ -17,6 +17,7 @@
#include <dt_module.h>
#include <dt_pcap.h>
#include <dt_peb.h>
+#include <dt_probe.h>
#include <dt_state.h>
#include <dt_string.h>
#include <libproc.h>
@@ -463,7 +464,7 @@ dt_flowindent(dtrace_hdl_t *dtp, dtrace_probedata_t *data, dtrace_epid_t last,
*/
if (flow == DTRACEFLOW_ENTRY) {
if (last != DTRACE_EPIDNONE && id != last &&
- pd->id == dtp->dt_pdesc[last]->id)
+ pd->id == dtp->dt_pdesc[last]->id) // FIXME will this work if we do not get a pdesc from an epid? or is it even a bad idea to access dtp->dt_pdesc[] directly anyhow?
flow = DTRACEFLOW_NONE;
}
@@ -2087,13 +2088,13 @@ dt_spec_buf_add_data(dtrace_hdl_t *dtp, dt_spec_buf_t *dtsb,
goto oom;
dsbd->dsbd_cpu = cpu;
- dsbd->dsbd_size = size;
- dsbd->dsbd_data = dt_alloc(dtp, size);
+ dsbd->dsbd_size = size; // FIXME should also change this, or wait?
+ dsbd->dsbd_data = dt_alloc(dtp, size + 8); // FIXME OMG hack, need to account that the prid is in the previously ignored preceding 4-byte pad, also we want to preserve 8-byte alignment
if (!dsbd->dsbd_data)
goto oom;
dtsb->dtsb_size += size;
- memcpy(dsbd->dsbd_data, data, size);
+ memcpy(dsbd->dsbd_data + 4, data - 4, size + 4); // FIXME OMG, we want to copy starting at preceding 4-byte pad; pad destination 4 bytes to preserve 8-byte alignment
dt_list_append(&dtsb->dtsb_dsbd_list, dsbd);
return dsbd;
@@ -2184,7 +2185,7 @@ dt_commit_one_spec(dtrace_hdl_t *dtp, FILE *fp, dt_spec_buf_t *dtsb,
memcpy(&specpdat, pdat, sizeof(dtrace_probedata_t));
specpdat.dtpda_cpu = dsbd->dsbd_cpu;
- ret = dt_consume_one_probe(dtp, fp, dsbd->dsbd_data,
+ ret = dt_consume_one_probe(dtp, fp, dsbd->dsbd_data + 8, // FIXME OMG; points to where data[0] used to be
dsbd->dsbd_size, &specpdat,
efunc, rfunc, flow, quiet,
peekflags, last, 1, arg);
@@ -2227,7 +2228,12 @@ dt_consume_one_probe(dtrace_hdl_t *dtp, FILE *fp, char *data, uint32_t size,
&pdat->dtpda_pdesc);
if (rval != 0)
return dt_set_errno(dtp, EDT_BADEPID);
-
+{
+// overwrite the (prid and) probe description we get from the EPID, using instead the prid we read from the output buffer
+uint32_t prid;
+prid = ((uint32_t *)data)[-1]; // should check prid > 0???
+pdat->dtpda_pdesc = (dtrace_probedesc_t *)dtp->dt_probes[prid]->desc;
+}
epd = pdat->dtpda_ddesc;
if (epd->dtdd_uarg != DT_ECB_DEFAULT) {
rval = dt_handle(dtp, pdat);
@@ -2661,7 +2667,7 @@ dt_consume_one(dtrace_hdl_t *dtp, FILE *fp, char *buf,
* struct {
* struct perf_event_header header;
* uint32_t size;
- * uint32_t pad;
+ * uint32_t prid;
* uint32_t epid;
* uint32_t specid;
* uint64_t data[n];
@@ -2678,6 +2684,7 @@ dt_consume_one(dtrace_hdl_t *dtp, FILE *fp, char *buf,
if (ptr != buf + hdr->size)
return dt_set_errno(dtp, EDT_DSIZE);
+// FIXME get rid of this later
data += sizeof(uint32_t); /* skip padding */
size -= sizeof(uint32_t);
diff --git a/libdtrace/dt_dctx.h b/libdtrace/dt_dctx.h
index 1422ad24..ebb83a98 100644
--- a/libdtrace/dt_dctx.h
+++ b/libdtrace/dt_dctx.h
@@ -82,6 +82,8 @@ typedef struct dt_dctx {
* The dctx->buf pointer references a block of memory that contains:
*
* +----------------+
+ * -4 -> | PRID |
+ * +----------------+
* 0 -> | EPID |
* +----------------+
* 4 -> | Speculation ID |
@@ -90,6 +92,7 @@ typedef struct dt_dctx {
* | ... |
* +----------------+
*/
+#define DBUF_PRID -4
#define DBUF_EPID 0
#define DBUF_SPECID 4
--
2.18.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [DTrace-devel] [PATCH 01/14] Move comment closer to the code it describes
2024-06-04 18:11 ` [PATCH 01/14] Move comment closer to the code it describes eugene.loh
@ 2024-06-04 18:21 ` Kris Van Hees
0 siblings, 0 replies; 31+ messages in thread
From: Kris Van Hees @ 2024-06-04 18:21 UTC (permalink / raw)
To: eugene.loh; +Cc: dtrace, dtrace-devel
On Tue, Jun 04, 2024 at 02:11:00PM -0400, eugene.loh--- via DTrace-devel wrote:
> From: Eugene Loh <eugene.loh@oracle.com>
>
> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
> ---
> libdtrace/dt_prov_uprobe.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c
> index b712c589..afc1f628 100644
> --- a/libdtrace/dt_prov_uprobe.c
> +++ b/libdtrace/dt_prov_uprobe.c
> @@ -337,15 +337,16 @@ static void enable(dtrace_hdl_t *dtp, dt_probe_t *prp, int is_usdt)
>
> /*
> * We need to enable the underlying probes (if not enabled yet).
> - *
> - * If necessary, we need to enable is-enabled probes too (if they
> - * exist).
> */
> for (pup = prp->prv_data; pup != NULL; pup = dt_list_next(pup)) {
> dt_probe_t *uprp = pup->probe;
> dt_probe_enable(dtp, uprp);
> }
>
> + /*
> + * If necessary, we need to enable is-enabled probes too (if they
> + * exist).
> + */
> if (is_usdt) {
> dtrace_probedesc_t pd;
> dt_probe_t *iep;
> --
> 2.18.4
>
>
> _______________________________________________
> DTrace-devel mailing list
> DTrace-devel@oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/dtrace-devel
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [DTrace-devel] [PATCH 02/14] Clean up prp/uprp variable names
2024-06-04 18:11 ` [PATCH 02/14] Clean up prp/uprp variable names eugene.loh
@ 2024-06-04 18:44 ` Kris Van Hees
2024-06-05 18:18 ` Eugene Loh
0 siblings, 1 reply; 31+ messages in thread
From: Kris Van Hees @ 2024-06-04 18:44 UTC (permalink / raw)
To: eugene.loh; +Cc: dtrace, dtrace-devel
On Tue, Jun 04, 2024 at 02:11:01PM -0400, eugene.loh--- via DTrace-devel wrote:
> From: Eugene Loh <eugene.loh@oracle.com>
>
> Should prp be oprp? Or should we use opr and upr?
This does not belong in the commit message. But yes, we should stick with
prp for the overlying probe, which is consistent with the rest of the code
base.
Also, I think this patch should address a few other issues in this file, e.g.
making sure comments are consistent with the code (e.g. if it mentions we are
creating overlying and underlying probe lists, the code should also create
tgem in that order - or the comment ought to be updated to match the code
order). etc...
> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
> ---
> libdtrace/dt_prov_uprobe.c | 55 +++++++++++++++++++++++++++-----------
> 1 file changed, 40 insertions(+), 15 deletions(-)
>
> diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c
> index afc1f628..cace406d 100644
> --- a/libdtrace/dt_prov_uprobe.c
> +++ b/libdtrace/dt_prov_uprobe.c
> @@ -6,6 +6,31 @@
> *
> * The uprobe-based provider for DTrace (implementing pid and USDT providers).
> */
> +/*
> + * This file uses both overlying probes (specified by the user) as well as
> + * underlying probes (the uprobes recognized by the kernel). To minimize
> + * confusion, this file should use consistent variable names:
> + *
> + * dt_probe_t *prp; // overlying probe
> + * dt_probe_t *uprp; // underlying probe
> + *
> + * Either one might be returned by dt_probe_lookup() or
> + * dt_probe_insert() or added to dt_enablings[] or dt_probes[].
> + * Further, uprp might be returned by create_underlying().
> + *
> + * dt_uprobe_t *upp; // uprobe associated with an underlying probe
> + *
> + * list_probe_t *pop; // overlying probe list
> + * list_probe_t *pup; // underlying probe list
> + *
> + * The provider-specific prv_data has these meanings:
> + *
> + * prp->prv_data // dt_list_t of associated underlying probes
> + *
> + * uprp->prv_data // upp (the associated uprobe)
> + *
> + * Finally, note that upp->probes is a dt_list_t of overlying probes.
> + */
This comment block is too verbose and I don't think it is really needed, if you
are going to rename variables anyway to be consistent based on your proposal
(as you do in this patch). So, the comment becomes unnecessary by the patch
itself.
Even if we were to retain a comment like this, I think it should be much more
brief. But again, I think the patch itself accomplishes all that is needed,
so no need to comment.
> #include <sys/types.h>
> #include <assert.h>
> #include <errno.h>
> @@ -118,7 +143,7 @@ static dt_probe_t *create_underlying(dtrace_hdl_t *dtp,
> char mod[DTRACE_MODNAMELEN];
> char prb[DTRACE_NAMELEN];
> dtrace_probedesc_t pd;
> - dt_probe_t *prp;
> + dt_probe_t *uprp;
I am OK with doing this renaming of variable name because you want some form
of consustency throughout this file, but I don't believe it is really needed.
This function only deals with one type of probes, identified both in the
comment and the function name as underlying probes. So, the prp variable that
is used in many places in DTrace source code to denote a probe pointer should
not cause any confusion.
But if you want to change it, no problem.
> dt_uprobe_t *upp;
> int is_enabled = 0;
>
> @@ -160,8 +185,8 @@ static dt_probe_t *create_underlying(dtrace_hdl_t *dtp,
> pd.fun = psp->pps_fun;
> pd.prb = prb;
>
> - prp = dt_probe_lookup(dtp, &pd);
> - if (prp == NULL) {
> + uprp = dt_probe_lookup(dtp, &pd);
> + if (uprp == NULL) {
> dt_provider_t *pvp;
>
> /* Get the provider for underlying probes. */
> @@ -182,12 +207,12 @@ static dt_probe_t *create_underlying(dtrace_hdl_t *dtp,
> if (upp->tp == NULL)
> goto fail;
>
> - prp = dt_probe_insert(dtp, pvp, pd.prv, pd.mod, pd.fun, pd.prb,
> + uprp = dt_probe_insert(dtp, pvp, pd.prv, pd.mod, pd.fun, pd.prb,
> upp);
> - if (prp == NULL)
> + if (uprp == NULL)
> goto fail;
> } else
> - upp = prp->prv_data;
> + upp = uprp->prv_data;
>
> switch (psp->pps_type) {
> case DTPPT_RETURN:
> @@ -202,7 +227,7 @@ static dt_probe_t *create_underlying(dtrace_hdl_t *dtp,
> */
> }
>
> - return prp;
> + return uprp;
>
> fail:
> probe_destroy(dtp, upp);
> @@ -394,8 +419,8 @@ static void enable_usdt(dtrace_hdl_t *dtp, dt_probe_t *prp)
> static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
> {
> dt_irlist_t *dlp = &pcb->pcb_ir;
> - const dt_probe_t *prp = pcb->pcb_probe;
> - const dt_uprobe_t *upp = prp->prv_data;
> + const dt_probe_t *uprp = pcb->pcb_probe;
> + const dt_uprobe_t *upp = uprp->prv_data;
> const list_probe_t *pop;
> uint_t lbl_exit = pcb->pcb_exitlbl;
>
> @@ -508,8 +533,8 @@ copyout_val(dt_pcb_t *pcb, uint_t lbl, uint32_t val, int arg)
> static int trampoline_is_enabled(dt_pcb_t *pcb, uint_t exitlbl)
> {
> dt_irlist_t *dlp = &pcb->pcb_ir;
> - const dt_probe_t *prp = pcb->pcb_probe;
> - const dt_uprobe_t *upp = prp->prv_data;
> + const dt_probe_t *uprp = pcb->pcb_probe;
> + const dt_uprobe_t *upp = uprp->prv_data;
> const list_probe_t *pop;
> uint_t lbl_assign = dt_irlist_label(dlp);
> uint_t lbl_exit = pcb->pcb_exitlbl;
> @@ -636,9 +661,9 @@ out:
> return name;
> }
>
> -static int attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd)
> +static int attach(dtrace_hdl_t *dtp, const dt_probe_t *uprp, int bpf_fd)
> {
> - dt_uprobe_t *upp = prp->prv_data;
> + dt_uprobe_t *upp = uprp->prv_data;
> tp_probe_t *tpp = upp->tp;
> FILE *f;
> char *fn;
> @@ -733,9 +758,9 @@ out:
> * probes that didn't get created. If the removal fails for some reason we are
> * out of luck - fortunately it is not harmful to the system as a whole.
> */
> -static void detach(dtrace_hdl_t *dtp, const dt_probe_t *prp)
> +static void detach(dtrace_hdl_t *dtp, const dt_probe_t *uprp)
> {
> - dt_uprobe_t *upp = prp->prv_data;
> + dt_uprobe_t *upp = uprp->prv_data;
> tp_probe_t *tpp = upp->tp;
>
> if (!dt_tp_has_info(tpp))
> --
> 2.18.4
>
>
> _______________________________________________
> DTrace-devel mailing list
> DTrace-devel@oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/dtrace-devel
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [DTrace-devel] [PATCH 03/14] Let USDT module names contain dots
2024-06-04 18:11 ` [PATCH 03/14] Let USDT module names contain dots eugene.loh
@ 2024-06-04 20:42 ` Kris Van Hees
2024-06-04 22:30 ` Eugene Loh
0 siblings, 1 reply; 31+ messages in thread
From: Kris Van Hees @ 2024-06-04 20:42 UTC (permalink / raw)
To: eugene.loh; +Cc: dtrace, dtrace-devel
I do not know what to do with this patch. Either it is correct and then the
code should not be #if 0 .. #endif, or it is not, and then the patch should
not exist.
Consider that the tst.dlclose1.sh test has output like this:
test/unittest/usdt/tst.dlclose1.sh: Running timeout --signal=TERM 41 test/unittest/usdt/tst.dlclose1.sh /home/kvanhees/dtrace-bpf-user/build/dtrace
PASS.
started pid 68990
ID PROVIDER MODULE FUNCTION NAME
97026 test_prov68990 livelib.so go go
So that shows that the module component certain can contain a period (.),
so the module component can certainly contain a period (.) whereas we know
that the function component cannot. Likewise, provider name cannot contain
period as far as I know, nor can the probe name. So, only the module name
seems to be an issue.
On Tue, Jun 04, 2024 at 02:11:02PM -0400, eugene.loh--- via DTrace-devel wrote:
> From: Eugene Loh <eugene.loh@oracle.com>
>
> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
> ---
> dtprobed/dof_stash.c | 5 +++++
> libdtrace/dt_pid.c | 2 ++
> 2 files changed, 7 insertions(+)
>
> diff --git a/dtprobed/dof_stash.c b/dtprobed/dof_stash.c
> index 62418b66..44c67462 100644
> --- a/dtprobed/dof_stash.c
> +++ b/dtprobed/dof_stash.c
> @@ -589,6 +589,10 @@ dof_stash_write_parsed(pid_t pid, dev_t dev, ino_t ino, dt_list_t *accum)
> mod, fun, prb)) == NULL)
> goto err_provider;
>
> +#if 0
> +This does not make any sense. We are checking parsedfn against "." and "..",
> +but parsedfn comes from make_probespec_name(), whose output is of the form
> +"%s:%s:%s:%s" and therefore can never be "." or "..".
> /*
> * Ban "." and ".." as name components. Obviously names
> * containing dots are commonplace (shared libraries,
> @@ -604,6 +608,7 @@ dof_stash_write_parsed(pid_t pid, dev_t dev, ino_t ino, dt_list_t *accum)
> if (strcmp(parsedfn, ".") == 0 ||
> strcmp(parsedfn, "..") == 0)
> goto err_provider;
> +#endif
>
> op = "probe module";
>
> diff --git a/libdtrace/dt_pid.c b/libdtrace/dt_pid.c
> index 7c7d7e30..93a7ce76 100644
> --- a/libdtrace/dt_pid.c
> +++ b/libdtrace/dt_pid.c
> @@ -833,6 +833,7 @@ dt_pid_create_usdt_probes(dtrace_hdl_t *dtp, dt_proc_t *dpr, dtrace_probedesc_t
>
> assert(pvp->impl != NULL && pvp->impl->provide_probe != NULL);
>
> +#if 0
> if (strchr(pdp->prv, '.') != NULL ||
> strchr(pdp->mod, '.') != NULL ||
> strchr(pdp->fun, '.') != NULL ||
> @@ -840,6 +841,7 @@ dt_pid_create_usdt_probes(dtrace_hdl_t *dtp, dt_proc_t *dpr, dtrace_probedesc_t
> dt_dprintf("Probe component contains dots: cannot be a USDT probe.\n");
> return 0;
> }
> +#endif
>
> if (asprintf(&probepath, "%s/probes/%i/%s/%s/%s/%s", dtp->dt_dofstash_path,
> dpr->dpr_pid, pdp->prv[0] == '\0' ? "*" : pdp->prv,
> --
> 2.18.4
>
>
> _______________________________________________
> DTrace-devel mailing list
> DTrace-devel@oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/dtrace-devel
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [DTrace-devel] [PATCH 04/14] Track uprobe provider descriptions
2024-06-04 18:11 ` [PATCH 04/14] Track uprobe provider descriptions eugene.loh
@ 2024-06-04 21:10 ` Kris Van Hees
2024-06-07 21:40 ` Eugene Loh
0 siblings, 1 reply; 31+ messages in thread
From: Kris Van Hees @ 2024-06-04 21:10 UTC (permalink / raw)
To: eugene.loh; +Cc: dtrace, dtrace-devel
On Tue, Jun 04, 2024 at 02:11:03PM -0400, eugene.loh--- via DTrace-devel wrote:
> From: Eugene Loh <eugene.loh@oracle.com>
>
> A uprobe will have to decide which of its clauses to run for a
> given pid. For now, keep the provider description for each clause.
> There are many shortcomings to the way this is done in this patch,
> but it is quick and dirty and helps us bootstrap real support.
> Clean this up later, probably turning it into a growable array.
This is the wrong way to look at it, I think. The underlying concept is that
DTrace supports specifying probes that do not match anything yet (-Z). Those
are called retained enablings and they can exist for more than just uprobes.
So this needs to be more generic.
We also shouldn't do this non-discriminately for all clauses all the time.
Obviously, when -Z is not used, there is no point in doing this because there
cannot be a match-after-start. Even when -Z is specified, you might be able
to determine whether the probe specification for a probe matches perfectly
or whether it needs to be retained for match-after-start.
So, let's put some thought into designing this in the more generic case, so
that we can avoid going down a rabbit hole that gets tough to recover from.
(And whatever we store the retained enablings in probably should be added to
dtrace_hdl_t after dt_enablings.)
> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
> ---
> libdtrace/dt_cc.c | 1 +
> libdtrace/dt_impl.h | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/libdtrace/dt_cc.c b/libdtrace/dt_cc.c
> index d1ee3843..6bff7e0f 100644
> --- a/libdtrace/dt_cc.c
> +++ b/libdtrace/dt_cc.c
> @@ -215,6 +215,7 @@ dt_compile_one_clause(dtrace_hdl_t *dtp, dt_node_t *cnp, dt_node_t *pnp)
> */
> dt_cg(yypcb, cnp);
> sdp->dtsd_clause = dt_clause_create(dtp, dt_as(yypcb));
> + dtp->dt_uprovdescs[dtp->dt_clause_nextid - 1] = strdup(pnp->dn_desc->prv);
>
> assert(yypcb->pcb_stmt == sdp);
> if (dtrace_stmt_add(yypcb->pcb_hdl, yypcb->pcb_prog, sdp) != 0)
> diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
> index 01313ff3..1bf79d80 100644
> --- a/libdtrace/dt_impl.h
> +++ b/libdtrace/dt_impl.h
> @@ -310,6 +310,7 @@ struct dtrace_hdl {
> dt_pcb_t *dt_pcb; /* pointer to current parsing control block */
> ulong_t dt_gen; /* compiler generation number */
> uint_t dt_clause_nextid; /* next ID to use for programs */
> + char *dt_uprovdescs[256]; /* uprobe provider descriptor per clause... FIXME turn this into a growable array */
> dt_list_t dt_programs; /* linked list of dtrace_prog_t's */
> dt_list_t dt_xlators; /* linked list of dt_xlator_t's */
> dt_list_t dt_enablings; /* list of (to be) enabled probes */
> --
> 2.18.4
>
>
> _______________________________________________
> DTrace-devel mailing list
> DTrace-devel@oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/dtrace-devel
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [DTrace-devel] [PATCH 05/14] Add a hook for a provider-specific "update" function
2024-06-04 18:11 ` [PATCH 05/14] Add a hook for a provider-specific "update" function eugene.loh
@ 2024-06-04 21:38 ` Kris Van Hees
2024-06-10 22:14 ` Eugene Loh
0 siblings, 1 reply; 31+ messages in thread
From: Kris Van Hees @ 2024-06-04 21:38 UTC (permalink / raw)
To: eugene.loh; +Cc: dtrace, dtrace-devel
Much like the previous patch not considering the generic nature of retained
enablings, this one should be part of that design also. The functionality
we need to handle match-after-start probe specifications really should be
independent from any specific provider. It may (well, almost certainly will)
require a hook in providers that need to do special handling. But that would
be a specific hook for a specific purpose - not just an 'update' hook.
On Tue, Jun 04, 2024 at 02:11:04PM -0400, eugene.loh--- via DTrace-devel wrote:
> From: Eugene Loh <eugene.loh@oracle.com>
>
> For up-coming USDT-probe support, we need to update a BPF map
> -- at least when the dtrace session starts but possibly also later
> to support systemwide USDT tracing for processes that may start up
> later.
>
> One way to do this is with a USDT-specific update function.
>
> For now, let's add a hook for providers to have provider-specific
> update functions. User space can either call
>
> for (i = 0; i < ARRAY_SIZE(dt_providers); i++) {
> if (dt_providers[i]->update)
> dt_providers[i]->update(...);
> }
>
> any time it likes. Or it can call dt_usdt.update(...).
>
> This is for WIP. A different approach can be adopted later instead.
>
> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
> ---
> libdtrace/dt_provider.h | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/libdtrace/dt_provider.h b/libdtrace/dt_provider.h
> index 17b1844c..21ff15ad 100644
> --- a/libdtrace/dt_provider.h
> +++ b/libdtrace/dt_provider.h
> @@ -68,6 +68,8 @@ typedef struct dt_provimpl {
> void *datap);
> void (*destroy)(dtrace_hdl_t *dtp, /* free provider data */
> void *datap);
> + void (*update)(dtrace_hdl_t *dtp, /* update provider-specific info */
> + void *datap);
> } dt_provimpl_t;
>
> /* list dt_dtrace first */
> --
> 2.18.4
>
>
> _______________________________________________
> DTrace-devel mailing list
> DTrace-devel@oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/dtrace-devel
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [DTrace-devel] [PATCH 03/14] Let USDT module names contain dots
2024-06-04 20:42 ` [DTrace-devel] " Kris Van Hees
@ 2024-06-04 22:30 ` Eugene Loh
2024-06-07 18:48 ` Nick Alcock
0 siblings, 1 reply; 31+ messages in thread
From: Eugene Loh @ 2024-06-04 22:30 UTC (permalink / raw)
To: dtrace-devel, dtrace
On 6/4/24 16:42, Kris Van Hees wrote:
> I do not know what to do with this patch. Either it is correct and then the
> code should not be #if 0 .. #endif, or it is not, and then the patch should
> not exist.
I was looking for feedback. If we're good with the patch, the "#if 0 /
#endif" stuff will be removed in the final version of the patch.
> Consider that the tst.dlclose1.sh test has output like this:
> test/unittest/usdt/tst.dlclose1.sh: Running timeout --signal=TERM 41 test/unittest/usdt/tst.dlclose1.sh /home/kvanhees/dtrace-bpf-user/build/dtrace
> PASS.
> started pid 68990
> ID PROVIDER MODULE FUNCTION NAME
> 97026 test_prov68990 livelib.so go go
>
> So that shows that the module component certain can contain a period (.),
> so the module component can certainly contain a period (.)
Right.
> whereas we know
> that the function component cannot. Likewise, provider name cannot contain
> period as far as I know, nor can the probe name. So, only the module name
> seems to be an issue.
There are basically two distinct cases here.
In the dof_stash.c case, I note that we are checking a string of the
form "%s:%s:%s:%s" to see if it is "." or "..". That simply does not
make sense. I think the indicated code should be removed.
In the dt_pid.c case, we check whether any of the probe components has a
".". As you point out, probemod may certainly have one. But why are we
enforcing this constraint for the other elements? E.g., if the function
component cannot have a ".", why are we enforcing that in dt_pid.c? I
assume the issue is broader than pid/usdt.
In short, the eventual point of the patch is to eliminate the two
indicated sections of code. In retrospect, the "#if 0 / #endif" stuff
was a dumb and confusing way of indicating that. Sorry.
> On Tue, Jun 04, 2024 at 02:11:02PM -0400, eugene.loh--- via DTrace-devel wrote:
>> From: Eugene Loh <eugene.loh@oracle.com>
>>
>> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
>> ---
>> dtprobed/dof_stash.c | 5 +++++
>> libdtrace/dt_pid.c | 2 ++
>> 2 files changed, 7 insertions(+)
>>
>> diff --git a/dtprobed/dof_stash.c b/dtprobed/dof_stash.c
>> index 62418b66..44c67462 100644
>> --- a/dtprobed/dof_stash.c
>> +++ b/dtprobed/dof_stash.c
>> @@ -589,6 +589,10 @@ dof_stash_write_parsed(pid_t pid, dev_t dev, ino_t ino, dt_list_t *accum)
>> mod, fun, prb)) == NULL)
>> goto err_provider;
>>
>> +#if 0
>> +This does not make any sense. We are checking parsedfn against "." and "..",
>> +but parsedfn comes from make_probespec_name(), whose output is of the form
>> +"%s:%s:%s:%s" and therefore can never be "." or "..".
>> /*
>> * Ban "." and ".." as name components. Obviously names
>> * containing dots are commonplace (shared libraries,
>> @@ -604,6 +608,7 @@ dof_stash_write_parsed(pid_t pid, dev_t dev, ino_t ino, dt_list_t *accum)
>> if (strcmp(parsedfn, ".") == 0 ||
>> strcmp(parsedfn, "..") == 0)
>> goto err_provider;
>> +#endif
>>
>> op = "probe module";
>>
>> diff --git a/libdtrace/dt_pid.c b/libdtrace/dt_pid.c
>> index 7c7d7e30..93a7ce76 100644
>> --- a/libdtrace/dt_pid.c
>> +++ b/libdtrace/dt_pid.c
>> @@ -833,6 +833,7 @@ dt_pid_create_usdt_probes(dtrace_hdl_t *dtp, dt_proc_t *dpr, dtrace_probedesc_t
>>
>> assert(pvp->impl != NULL && pvp->impl->provide_probe != NULL);
>>
>> +#if 0
>> if (strchr(pdp->prv, '.') != NULL ||
>> strchr(pdp->mod, '.') != NULL ||
>> strchr(pdp->fun, '.') != NULL ||
>> @@ -840,6 +841,7 @@ dt_pid_create_usdt_probes(dtrace_hdl_t *dtp, dt_proc_t *dpr, dtrace_probedesc_t
>> dt_dprintf("Probe component contains dots: cannot be a USDT probe.\n");
>> return 0;
>> }
>> +#endif
>>
>> if (asprintf(&probepath, "%s/probes/%i/%s/%s/%s/%s", dtp->dt_dofstash_path,
>> dpr->dpr_pid, pdp->prv[0] == '\0' ? "*" : pdp->prv,
>> --
>> 2.18.4
>>
>>
>> _______________________________________________
>> DTrace-devel mailing list
>> DTrace-devel@oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/dtrace-devel
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [DTrace-devel] [PATCH 07/14] Create the BPF uprobes map
2024-06-04 18:11 ` [PATCH 07/14] Create the BPF uprobes map eugene.loh
@ 2024-06-05 4:33 ` Kris Van Hees
2024-06-10 20:55 ` Eugene Loh
0 siblings, 1 reply; 31+ messages in thread
From: Kris Van Hees @ 2024-06-05 4:33 UTC (permalink / raw)
To: eugene.loh; +Cc: dtrace, dtrace-devel
On Tue, Jun 04, 2024 at 02:11:06PM -0400, eugene.loh--- via DTrace-devel wrote:
> From: Eugene Loh <eugene.loh@oracle.com>
>
> WIP, sizing is hardwired for now.
> WIP, should add new pids and purge old ones (after the initial call).
>
> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
> ---
> libdtrace/dt_bpf.c | 48 +++++++++++++++++++++++++++++++
> libdtrace/dt_dlibs.c | 1 +
> libdtrace/dt_impl.h | 1 +
> libdtrace/dt_prov_uprobe.c | 59 ++++++++++++++++++++++++++++++++++++++
> 4 files changed, 109 insertions(+)
>
> diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c
> index 71c6a446..3aeae274 100644
> --- a/libdtrace/dt_bpf.c
> +++ b/libdtrace/dt_bpf.c
> @@ -946,6 +946,53 @@ gmap_create_probes(dtrace_hdl_t *dtp)
> return 0;
> }
>
> +/* also defined in dt_prov_uprobe.c */
This is a red flag - why would structures be defined in two different places?
That certainly risks really bad situations where one gets changed and not the
other. Why is there duplication? That should never happen.
> +typedef struct uprobe_map_key {
> + pid_t pid;
> + dtrace_id_t uprid;
> +} uprobe_map_key_t;
> +typedef struct uprobe_map_val {
> + dtrace_id_t prid;
> + long long mask;
> +} uprobe_map_val_t;
> +
> +/*
> + * Create the 'uprobes' BPF map.
How about naming it 'prmap' or something like that? Or 'pidmap'? It has more
to do with pids than with uprobes. (And then rename the struct names also,
and everything else that is currently based on the uprobes map name.)
> + *
> + * Uprobe information map. This is a global hash map for use
> + * with USDT and pid probes). It is indexed by (pid, probe ID),
> + * using the probe ID of the underlying probe. The value is a
> + * probe ID for the overlying probe and a bit mask indicating
> + * which clauses to execute for this pid.
> + *
> + * WIP. Just make up some sizes for now.
> + *
> + * How big is a probe ID? Sometimes, it's a dtrace_id_t.
> + * And uts/common/sys/dtrace_types.h gives us "typedef uint32_t dtrace_id_t".
> + * Meanwhile, libdtrace/dt_impl.h has "uint32_t dt_probe_id".
> + * So either uint32_t or dtrace_id_t is fine.
> + *
> + * How big should our bit mask be? Start with 8*sizeof(long long) bits.
> + *
> + * How many entries should we allow? Start with 1000.
> + *
> + * FIXME: is there any way two different overlying probes (differing by more
> + * than just pid) could map to the same underlying probe?
I expect it ought to be possible to place a pid offset probe on the location of
a USDT probe, thereby causing two overlying probes on the same underlying one.
> + */
> +static int
> +gmap_create_uprobes(dtrace_hdl_t *dtp)
> +{
> + dtp->dt_uprobesmap_fd = create_gmap(dtp, "uprobes", BPF_MAP_TYPE_HASH,
> + sizeof(uprobe_map_key_t), sizeof(uprobe_map_val_t), 1000);
> + if (dtp->dt_uprobesmap_fd == -1)
> + return -1;
> +
> + /* Populate the newly created map. */
> + dt_uprobe.update(dtp, NULL);
This seems to be something that ought to be done in a function called from
dtrace_go() that goes through the enablings (and perhaps also the retained
ones), and calls a provider hook for any probes that match the probe spec
that relates to it. Something like that. And then the provider code can do
whatever needs doing for that particular matching probe.
> +
> + return 0;
> +}
> +
> /*
> * Create the 'gvars' BPF map.
> *
> @@ -1051,6 +1098,7 @@ dt_bpf_gmap_create(dtrace_hdl_t *dtp)
> CREATE_MAP(scratchmem)
> CREATE_MAP(strtab)
> CREATE_MAP(probes)
> + CREATE_MAP(uprobes)
> CREATE_MAP(gvars)
> CREATE_MAP(lvars)
> CREATE_MAP(dvars)
> diff --git a/libdtrace/dt_dlibs.c b/libdtrace/dt_dlibs.c
> index bc883e11..7bbeb02c 100644
> --- a/libdtrace/dt_dlibs.c
> +++ b/libdtrace/dt_dlibs.c
> @@ -66,6 +66,7 @@ static const dt_ident_t dt_bpf_symbols[] = {
> DT_BPF_SYMBOL(lvars, DT_IDENT_PTR),
> DT_BPF_SYMBOL(mem, DT_IDENT_PTR),
> DT_BPF_SYMBOL(probes, DT_IDENT_PTR),
> + DT_BPF_SYMBOL(uprobes, DT_IDENT_PTR),
> DT_BPF_SYMBOL(scratchmem, DT_IDENT_PTR),
> DT_BPF_SYMBOL(specs, DT_IDENT_PTR),
> DT_BPF_SYMBOL(state, DT_IDENT_PTR),
> diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
> index 1bf79d80..935b96f4 100644
> --- a/libdtrace/dt_impl.h
> +++ b/libdtrace/dt_impl.h
> @@ -438,6 +438,7 @@ struct dtrace_hdl {
> int dt_aggmap_fd; /* file descriptor for the 'aggs' BPF map */
> int dt_genmap_fd; /* file descriptor for the 'agggen' BPF map */
> int dt_cpumap_fd; /* file descriptor for the 'cpuinfo' BPF map */
> + int dt_uprobesmap_fd; /* file descriptor for the 'uprobes' BPF map */
> dtrace_handle_err_f *dt_errhdlr; /* error handler, if any */
> void *dt_errarg; /* error handler argument */
> dtrace_handle_drop_f *dt_drophdlr; /* drop handler, if any */
> diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c
> index 34906aa0..591f2fab 100644
> --- a/libdtrace/dt_prov_uprobe.c
> +++ b/libdtrace/dt_prov_uprobe.c
> @@ -45,6 +45,7 @@
> #include "dt_probe.h"
> #include "dt_pid.h"
> #include "dt_string.h"
> +#include "port.h"
>
> /* Provider name for the underlying probes. */
> static const char prvname[] = "uprobe";
> @@ -70,6 +71,15 @@ typedef struct list_probe {
> dt_probe_t *probe;
> } list_probe_t;
>
> +/* uprobe_map_[key|val]_t are also defined in dt_bpf.c */
> +typedef struct uprobe_map_key {
> + pid_t pid;
> + dtrace_id_t uprid;
> +} uprobe_map_key_t;
> +typedef struct uprobe_map_val {
> + dtrace_id_t prid;
> + long long mask;
> +} uprobe_map_val_t;
> typedef struct list_clause {
> dt_list_t list;
> uint_t clause;
> @@ -135,6 +145,54 @@ static void probe_destroy(dtrace_hdl_t *dtp, void *datap)
> free_probe_list(dtp, datap);
> }
>
> +/*
> + * Update the uprobe provider.
> + */
> +static void update_uprobe(dtrace_hdl_t *dtp, void *datap)
> +{
> + dt_probe_t *prp;
> +
> + for (prp = dt_list_next(&dtp->dt_enablings); prp != NULL;
> + prp = dt_list_next(prp)) {
> + pid_t pid;
> + const list_probe_t *pup;
> + dt_probe_t *uprp;
> + long long mask = 0, bit = 1;
> + list_clause_t *cl;
> + uprobe_map_key_t key;
> + uprobe_map_val_t val;
> +
> + /* Make sure it is an overlying pid or USDT probe. */
> + if (prp->prov->impl != &dt_pid && prp->prov->impl != &dt_usdt)
> + continue;
> +
> + /* FIXME passing in NULL pcb and dpr wreaks havoc on error reporting? */
> + pid = dt_pid_get_pid(prp->desc, dtp, NULL, NULL);
> +
> + for (pup = prp->prv_data; pup != NULL; pup = dt_list_next(pup)) {
> + dt_uprobe_t *upp;
> +
> + uprp = pup->probe;
> + upp = uprp->prv_data;
> + for (cl = dt_list_next(&upp->clauses); cl != NULL; cl = dt_list_next(cl)) {
> + if (gmatch(prp->desc->prv, dtp->dt_uprovdescs[cl->clause]))
> + mask |= bit;
> +
> + bit <<= 1;
> + }
> +
> + key.pid = pid;
> + key.uprid = uprp->desc->id;
> +
> + val.prid = prp->desc->id;
> + val.mask = mask;
> +
> + // FIXME check return value
> + dt_bpf_map_update(dtp->dt_uprobesmap_fd, &key, &val);
> + }
> + }
> +}
> +
>
> /*
> * Look up or create an underlying (real) probe, corresponding directly to a
> @@ -803,6 +861,7 @@ dt_provimpl_t dt_uprobe = {
> .probe_info = &probe_info,
> .detach = &detach,
> .probe_destroy = &probe_destroy_underlying,
> + .update = &update_uprobe,
> };
>
> /*
> --
> 2.18.4
>
>
> _______________________________________________
> DTrace-devel mailing list
> DTrace-devel@oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/dtrace-devel
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [DTrace-devel] [PATCH 02/14] Clean up prp/uprp variable names
2024-06-04 18:44 ` [DTrace-devel] " Kris Van Hees
@ 2024-06-05 18:18 ` Eugene Loh
0 siblings, 0 replies; 31+ messages in thread
From: Eugene Loh @ 2024-06-05 18:18 UTC (permalink / raw)
To: Kris Van Hees; +Cc: dtrace, dtrace-devel
On 6/4/24 14:44, Kris Van Hees wrote:
> On Tue, Jun 04, 2024 at 02:11:01PM -0400, eugene.loh--- via DTrace-devel wrote:
>> From: Eugene Loh <eugene.loh@oracle.com>
>>
>> Should prp be oprp? Or should we use opr and upr?
> This does not belong in the commit message. But yes, we should stick with
> prp for the overlying probe, which is consistent with the rest of the code
> base.
Yes, the comment was to solicit feedback like this. I'll go with "prp"
I guess, even though Nick (elsewhere) voted for "oprp". I liked "oprp"
because it's more explicit and the same number of letters as "uprp", but
sticking with "prp" is simpler and arguably clear enough given the
changes in this patch.
> Also, I think this patch should address a few other issues in this file, e.g.
> making sure comments are consistent with the code (e.g. if it mentions we are
> creating overlying and underlying probe lists, the code should also create
> tgem in that order - or the comment ought to be updated to match the code
> order). etc...
I didn't find much else in the file. I cleaned up some provide_probe()
stuff (I'm guessing it's what you're referring to), but if there is
something else egregious you're concerned about let me know.
>> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
>> ---
>> libdtrace/dt_prov_uprobe.c | 55 +++++++++++++++++++++++++++-----------
>> 1 file changed, 40 insertions(+), 15 deletions(-)
>>
>> diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c
>> index afc1f628..cace406d 100644
>> --- a/libdtrace/dt_prov_uprobe.c
>> +++ b/libdtrace/dt_prov_uprobe.c
>> @@ -6,6 +6,31 @@
>> *
>> * The uprobe-based provider for DTrace (implementing pid and USDT providers).
>> */
>> +/*
>> + * This file uses both overlying probes (specified by the user) as well as
>> + * underlying probes (the uprobes recognized by the kernel). To minimize
>> + * confusion, this file should use consistent variable names:
>> + *
>> + * dt_probe_t *prp; // overlying probe
>> + * dt_probe_t *uprp; // underlying probe
>> + *
>> + * Either one might be returned by dt_probe_lookup() or
>> + * dt_probe_insert() or added to dt_enablings[] or dt_probes[].
>> + * Further, uprp might be returned by create_underlying().
>> + *
>> + * dt_uprobe_t *upp; // uprobe associated with an underlying probe
>> + *
>> + * list_probe_t *pop; // overlying probe list
>> + * list_probe_t *pup; // underlying probe list
>> + *
>> + * The provider-specific prv_data has these meanings:
>> + *
>> + * prp->prv_data // dt_list_t of associated underlying probes
>> + *
>> + * uprp->prv_data // upp (the associated uprobe)
>> + *
>> + * Finally, note that upp->probes is a dt_list_t of overlying probes.
>> + */
> This comment block is too verbose and I don't think it is really needed, if you
> are going to rename variables anyway to be consistent based on your proposal
> (as you do in this patch). So, the comment becomes unnecessary by the patch
> itself.
>
> Even if we were to retain a comment like this, I think it should be much more
> brief. But again, I think the patch itself accomplishes all that is needed,
> so no need to comment.
>
>> #include <sys/types.h>
>> #include <assert.h>
>> #include <errno.h>
>> @@ -118,7 +143,7 @@ static dt_probe_t *create_underlying(dtrace_hdl_t *dtp,
>> char mod[DTRACE_MODNAMELEN];
>> char prb[DTRACE_NAMELEN];
>> dtrace_probedesc_t pd;
>> - dt_probe_t *prp;
>> + dt_probe_t *uprp;
> I am OK with doing this renaming of variable name because you want some form
> of consustency throughout this file, but I don't believe it is really needed.
> This function only deals with one type of probes, identified both in the
> comment and the function name as underlying probes. So, the prp variable that
> is used in many places in DTrace source code to denote a probe pointer should
> not cause any confusion.
>
> But if you want to change it, no problem.
>
>> dt_uprobe_t *upp;
>> int is_enabled = 0;
>>
>> @@ -160,8 +185,8 @@ static dt_probe_t *create_underlying(dtrace_hdl_t *dtp,
>> pd.fun = psp->pps_fun;
>> pd.prb = prb;
>>
>> - prp = dt_probe_lookup(dtp, &pd);
>> - if (prp == NULL) {
>> + uprp = dt_probe_lookup(dtp, &pd);
>> + if (uprp == NULL) {
>> dt_provider_t *pvp;
>>
>> /* Get the provider for underlying probes. */
>> @@ -182,12 +207,12 @@ static dt_probe_t *create_underlying(dtrace_hdl_t *dtp,
>> if (upp->tp == NULL)
>> goto fail;
>>
>> - prp = dt_probe_insert(dtp, pvp, pd.prv, pd.mod, pd.fun, pd.prb,
>> + uprp = dt_probe_insert(dtp, pvp, pd.prv, pd.mod, pd.fun, pd.prb,
>> upp);
>> - if (prp == NULL)
>> + if (uprp == NULL)
>> goto fail;
>> } else
>> - upp = prp->prv_data;
>> + upp = uprp->prv_data;
>>
>> switch (psp->pps_type) {
>> case DTPPT_RETURN:
>> @@ -202,7 +227,7 @@ static dt_probe_t *create_underlying(dtrace_hdl_t *dtp,
>> */
>> }
>>
>> - return prp;
>> + return uprp;
>>
>> fail:
>> probe_destroy(dtp, upp);
>> @@ -394,8 +419,8 @@ static void enable_usdt(dtrace_hdl_t *dtp, dt_probe_t *prp)
>> static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
>> {
>> dt_irlist_t *dlp = &pcb->pcb_ir;
>> - const dt_probe_t *prp = pcb->pcb_probe;
>> - const dt_uprobe_t *upp = prp->prv_data;
>> + const dt_probe_t *uprp = pcb->pcb_probe;
>> + const dt_uprobe_t *upp = uprp->prv_data;
>> const list_probe_t *pop;
>> uint_t lbl_exit = pcb->pcb_exitlbl;
>>
>> @@ -508,8 +533,8 @@ copyout_val(dt_pcb_t *pcb, uint_t lbl, uint32_t val, int arg)
>> static int trampoline_is_enabled(dt_pcb_t *pcb, uint_t exitlbl)
>> {
>> dt_irlist_t *dlp = &pcb->pcb_ir;
>> - const dt_probe_t *prp = pcb->pcb_probe;
>> - const dt_uprobe_t *upp = prp->prv_data;
>> + const dt_probe_t *uprp = pcb->pcb_probe;
>> + const dt_uprobe_t *upp = uprp->prv_data;
>> const list_probe_t *pop;
>> uint_t lbl_assign = dt_irlist_label(dlp);
>> uint_t lbl_exit = pcb->pcb_exitlbl;
>> @@ -636,9 +661,9 @@ out:
>> return name;
>> }
>>
>> -static int attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd)
>> +static int attach(dtrace_hdl_t *dtp, const dt_probe_t *uprp, int bpf_fd)
>> {
>> - dt_uprobe_t *upp = prp->prv_data;
>> + dt_uprobe_t *upp = uprp->prv_data;
>> tp_probe_t *tpp = upp->tp;
>> FILE *f;
>> char *fn;
>> @@ -733,9 +758,9 @@ out:
>> * probes that didn't get created. If the removal fails for some reason we are
>> * out of luck - fortunately it is not harmful to the system as a whole.
>> */
>> -static void detach(dtrace_hdl_t *dtp, const dt_probe_t *prp)
>> +static void detach(dtrace_hdl_t *dtp, const dt_probe_t *uprp)
>> {
>> - dt_uprobe_t *upp = prp->prv_data;
>> + dt_uprobe_t *upp = uprp->prv_data;
>> tp_probe_t *tpp = upp->tp;
>>
>> if (!dt_tp_has_info(tpp))
>> --
>> 2.18.4
>>
>>
>> _______________________________________________
>> DTrace-devel mailing list
>> DTrace-devel@oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/dtrace-devel
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [DTrace-devel] [PATCH 03/14] Let USDT module names contain dots
2024-06-04 22:30 ` Eugene Loh
@ 2024-06-07 18:48 ` Nick Alcock
2024-06-07 22:22 ` Eugene Loh
0 siblings, 1 reply; 31+ messages in thread
From: Nick Alcock @ 2024-06-07 18:48 UTC (permalink / raw)
To: Eugene Loh via DTrace-devel; +Cc: dtrace, Eugene Loh
On 4 Jun 2024, Eugene Loh via DTrace-devel stated:
> On 6/4/24 16:42, Kris Van Hees wrote:
>
>> I do not know what to do with this patch. Either it is correct and then the
>> code should not be #if 0 .. #endif, or it is not, and then the patch should
>> not exist.
>
> I was looking for feedback. If we're good with the patch, the "#if 0 / #endif" stuff will be removed in the final version of the
> patch.
>
>> Consider that the tst.dlclose1.sh test has output like this:
>> test/unittest/usdt/tst.dlclose1.sh: Running timeout --signal=TERM 41 test/unittest/usdt/tst.dlclose1.sh /home/kvanhees/dtrace-bpf-user/build/dtrace
>> PASS.
>> started pid 68990
>> ID PROVIDER MODULE FUNCTION NAME
>> 97026 test_prov68990 livelib.so go go
>>
>> So that shows that the module component certain can contain a period (.),
>> so the module component can certainly contain a period (.)
>
> Right.
>
>> whereas we know
>> that the function component cannot. Likewise, provider name cannot contain
>> period as far as I know, nor can the probe name. So, only the module name
>> seems to be an issue.
>
> There are basically two distinct cases here.
>
> In the dof_stash.c case, I note that we are checking a string of the form "%s:%s:%s:%s" to see if it is "." or "..". That simply
> does not make sense. I think the indicated code should be removed.
Yes -- that was a mistake. What we want to ensure is that none of the
probe components *consist solely of* "." or "..". This has to happen
before any attempt is made to write them to the DOF stash, and before
any attempt is made to *look them up* in there, to avoid unintended path
traversal.
This is, I'm afraid, a security issue in the presence of potentially
malicious DOF: we can't just do nothing.
--
NULL && (void)
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [DTrace-devel] [PATCH 04/14] Track uprobe provider descriptions
2024-06-04 21:10 ` [DTrace-devel] " Kris Van Hees
@ 2024-06-07 21:40 ` Eugene Loh
2024-06-07 22:16 ` Kris Van Hees
0 siblings, 1 reply; 31+ messages in thread
From: Eugene Loh @ 2024-06-07 21:40 UTC (permalink / raw)
To: dtrace, dtrace-devel
On 6/4/24 17:10, Kris Van Hees wrote:
> On Tue, Jun 04, 2024 at 02:11:03PM -0400, eugene.loh--- via DTrace-devel wrote:
>> From: Eugene Loh <eugene.loh@oracle.com>
>>
>> A uprobe will have to decide which of its clauses to run for a
>> given pid. For now, keep the provider description for each clause.
>> There are many shortcomings to the way this is done in this patch,
>> but it is quick and dirty and helps us bootstrap real support.
>> Clean this up later, probably turning it into a growable array.
> This is the wrong way to look at it, I think. The underlying concept is that
> DTrace supports specifying probes that do not match anything yet (-Z). Those
> are called retained enablings and they can exist for more than just uprobes.
> So this needs to be more generic.
Okay. I looked at legacy-DTrace retained enablings. They seem to be a
linked list -- (mainly) of arrays of dtrace_ecbdesc pointers. And
dtrace_ecbdesc has dtrace_probedesc, dtrace_preddesc, and
dtrace_actdesc. In DTrace on eBPF, dtrace_ecbdesc is much simpler --
it's really just a dtrace_probedesc. So it seems to me we could just
make our retained enablings be an array of probe desc pointers. That
would be a small change to this (small) patch.
> We also shouldn't do this non-discriminately for all clauses all the time.
> Obviously, when -Z is not used, there is no point in doing this because there
> cannot be a match-after-start.
FWIW, that's not how legacy DTrace seems to work: there *can* be
match-after-start without -Z. The -Z option only addresses whether
there is an initial match. If there is, then -Z is not needed, not even
for subsequent matches to work.
> Even when -Z is specified, you might be able
> to determine whether the probe specification for a probe matches perfectly
> or whether it needs to be retained for match-after-start.
Maybe. But what if you have pid1234:a.out:main:go, but pid 1234 has not
yet started up? (Yes, that's an extremely weird example.)
Mostly (going back to your "shouldn't non-discriminately" comment),
trying to decide whether to keep a probe desc seems unnecessary. Just
keep them all. They will likely be dwarfed by dtp->dt_probes[] anyhow.
> So, let's put some thought into designing this in the more generic case, so
> that we can avoid going down a rabbit hole that gets tough to recover from.
What do you think of:
struct dtrace_hdl {
[...]
dt_list_t dt_enablings; /* list of (to be) enabled probes */
+ dtrace_probedesc_t **dt_retained;
+ int dt_nretained;
+ int dt_maxretained;
struct dt_xlator **dt_xlatormap; /* dt_xlator_t's indexed by
dx_id */
[...]
}
> (And whatever we store the retained enablings in probably should be added to
> dtrace_hdl_t after dt_enablings.)
No problem.
>> diff --git a/libdtrace/dt_cc.c b/libdtrace/dt_cc.c
>> @@ -215,6 +215,7 @@ dt_compile_one_clause(dtrace_hdl_t *dtp, dt_node_t *cnp, dt_node_t *pnp)
>> */
>> dt_cg(yypcb, cnp);
>> sdp->dtsd_clause = dt_clause_create(dtp, dt_as(yypcb));
>> + dtp->dt_uprovdescs[dtp->dt_clause_nextid - 1] = strdup(pnp->dn_desc->prv);
>>
>> assert(yypcb->pcb_stmt == sdp);
>> if (dtrace_stmt_add(yypcb->pcb_hdl, yypcb->pcb_prog, sdp) != 0)
>> diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
>> @@ -310,6 +310,7 @@ struct dtrace_hdl {
>> dt_pcb_t *dt_pcb; /* pointer to current parsing control block */
>> ulong_t dt_gen; /* compiler generation number */
>> uint_t dt_clause_nextid; /* next ID to use for programs */
>> + char *dt_uprovdescs[256]; /* uprobe provider descriptor per clause... FIXME turn this into a growable array */
>> dt_list_t dt_programs; /* linked list of dtrace_prog_t's */
>> dt_list_t dt_xlators; /* linked list of dt_xlator_t's */
>> dt_list_t dt_enablings; /* list of (to be) enabled probes */
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [DTrace-devel] [PATCH 04/14] Track uprobe provider descriptions
2024-06-07 21:40 ` Eugene Loh
@ 2024-06-07 22:16 ` Kris Van Hees
2024-06-10 21:23 ` Eugene Loh
0 siblings, 1 reply; 31+ messages in thread
From: Kris Van Hees @ 2024-06-07 22:16 UTC (permalink / raw)
To: Eugene Loh; +Cc: dtrace, dtrace-devel
On Fri, Jun 07, 2024 at 05:40:51PM -0400, Eugene Loh wrote:
> On 6/4/24 17:10, Kris Van Hees wrote:
>
> > On Tue, Jun 04, 2024 at 02:11:03PM -0400, eugene.loh--- via DTrace-devel wrote:
> > > From: Eugene Loh <eugene.loh@oracle.com>
> > >
> > > A uprobe will have to decide which of its clauses to run for a
> > > given pid. For now, keep the provider description for each clause.
> > > There are many shortcomings to the way this is done in this patch,
> > > but it is quick and dirty and helps us bootstrap real support.
> > > Clean this up later, probably turning it into a growable array.
> > This is the wrong way to look at it, I think. The underlying concept is that
> > DTrace supports specifying probes that do not match anything yet (-Z). Those
> > are called retained enablings and they can exist for more than just uprobes.
> > So this needs to be more generic.
>
> Okay. I looked at legacy-DTrace retained enablings. They seem to be a
> linked list -- (mainly) of arrays of dtrace_ecbdesc pointers. And
> dtrace_ecbdesc has dtrace_probedesc, dtrace_preddesc, and dtrace_actdesc.
> In DTrace on eBPF, dtrace_ecbdesc is much simpler -- it's really just a
> dtrace_probedesc. So it seems to me we could just make our retained
> enablings be an array of probe desc pointers. That would be a small change
> to this (small) patch.
>
> > We also shouldn't do this non-discriminately for all clauses all the time.
> > Obviously, when -Z is not used, there is no point in doing this because there
> > cannot be a match-after-start.
>
> FWIW, that's not how legacy DTrace seems to work: there *can* be
> match-after-start without -Z. The -Z option only addresses whether there is
> an initial match. If there is, then -Z is not needed, not even for
> subsequent matches to work.
Yes, I didnt think of that. But -Z is still relevant in view of the stuff I
mention below...
> > Even when -Z is specified, you might be able
> > to determine whether the probe specification for a probe matches perfectly
> > or whether it needs to be retained for match-after-start.
>
> Maybe. But what if you have pid1234:a.out:main:go, but pid 1234 has not yet
> started up? (Yes, that's an extremely weird example.)
No, when a pid* probe is not able to be resolved, dtrace flags an error:
$ sudo ./build/run-dtrace -n 'pid12:a.out:main:go,BEGIN { exit(0); }'
dtrace: invalid probe specifier pid12:a.out:main:go,BEGIN { exit(0); }: failed to grab process 12
# sudo ./build/run-dtrace -Zn 'pid12:a.out:main:go,BEGIN { exit(0); }'
dtrace: invalid probe specifier pid12:a.out:main:go,BEGIN { exit(0); }: failed to grab process 12
> Mostly (going back to your "shouldn't non-discriminately" comment), trying
> to decide whether to keep a probe desc seems unnecessary. Just keep them
> all. They will likely be dwarfed by dtp->dt_probes[] anyhow.
The issue is not space... but any retained enabling would have to be
re-evaluated everytime we need to see if there are new probes to be enabled.
And this will be happening while we are already tracing, so avoiding needless
matching is probably worth it.
> > So, let's put some thought into designing this in the more generic case, so
> > that we can avoid going down a rabbit hole that gets tough to recover from.
>
> What do you think of:
>
> struct dtrace_hdl {
> [...]
> dt_list_t dt_enablings; /* list of (to be) enabled probes */
> + dtrace_probedesc_t **dt_retained;
> + int dt_nretained;
> + int dt_maxretained;
I don't think you need both varoables. Just keeping dt_retained_sz (which is
your dt_maxretained, but made consistent with dt_probes_sz) is enough since
you can know what the last entry is when looping over it by checking if the
element is NULL. There should never be gaps, so the first NULL indicates that
you reached the end.
> struct dt_xlator **dt_xlatormap; /* dt_xlator_t's indexed by dx_id
> */
> [...]
> }
>
>
> > (And whatever we store the retained enablings in probably should be added to
> > dtrace_hdl_t after dt_enablings.)
>
> No problem.
>
> > > diff --git a/libdtrace/dt_cc.c b/libdtrace/dt_cc.c
> > > @@ -215,6 +215,7 @@ dt_compile_one_clause(dtrace_hdl_t *dtp, dt_node_t *cnp, dt_node_t *pnp)
> > > */
> > > dt_cg(yypcb, cnp);
> > > sdp->dtsd_clause = dt_clause_create(dtp, dt_as(yypcb));
> > > + dtp->dt_uprovdescs[dtp->dt_clause_nextid - 1] = strdup(pnp->dn_desc->prv);
> > > assert(yypcb->pcb_stmt == sdp);
> > > if (dtrace_stmt_add(yypcb->pcb_hdl, yypcb->pcb_prog, sdp) != 0)
> > > diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
> > > @@ -310,6 +310,7 @@ struct dtrace_hdl {
> > > dt_pcb_t *dt_pcb; /* pointer to current parsing control block */
> > > ulong_t dt_gen; /* compiler generation number */
> > > uint_t dt_clause_nextid; /* next ID to use for programs */
> > > + char *dt_uprovdescs[256]; /* uprobe provider descriptor per clause... FIXME turn this into a growable array */
> > > dt_list_t dt_programs; /* linked list of dtrace_prog_t's */
> > > dt_list_t dt_xlators; /* linked list of dt_xlator_t's */
> > > dt_list_t dt_enablings; /* list of (to be) enabled probes */
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [DTrace-devel] [PATCH 03/14] Let USDT module names contain dots
2024-06-07 18:48 ` Nick Alcock
@ 2024-06-07 22:22 ` Eugene Loh
0 siblings, 0 replies; 31+ messages in thread
From: Eugene Loh @ 2024-06-07 22:22 UTC (permalink / raw)
To: dtrace, Eugene Loh via DTrace-devel
On 6/7/24 14:48, Nick Alcock wrote:
> On 4 Jun 2024, Eugene Loh via DTrace-devel stated:
>> There are basically two distinct cases here.
>>
>> In the dof_stash.c case, I note that we are checking a string of the form "%s:%s:%s:%s" to see if it is "." or "..". That simply
>> does not make sense. I think the indicated code should be removed.
> Yes -- that was a mistake. What we want to ensure is that none of the
> probe components *consist solely of* "." or "..". This has to happen
> before any attempt is made to write them to the DOF stash, and before
> any attempt is made to *look them up* in there, to avoid unintended path
> traversal.
>
> This is, I'm afraid, a security issue in the presence of potentially
> malicious DOF: we can't just do nothing.
The reason I say the string is of the form "%s:%s:%s:%s" is that we are checking a string returned by make_probespec_name().
I guess I'll turn the check into per-component checks and move them into make_probespec_name(). Incidentally, do your traversal concerns really apply to "."? I understand "..", but "." seems less criminal.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [DTrace-devel] [PATCH 07/14] Create the BPF uprobes map
2024-06-05 4:33 ` [DTrace-devel] " Kris Van Hees
@ 2024-06-10 20:55 ` Eugene Loh
0 siblings, 0 replies; 31+ messages in thread
From: Eugene Loh @ 2024-06-10 20:55 UTC (permalink / raw)
To: dtrace, dtrace-devel
On 6/5/24 00:33, Kris Van Hees wrote:
> On Tue, Jun 04, 2024 at 02:11:06PM -0400, eugene.loh--- via DTrace-devel wrote:
>> From: Eugene Loh <eugene.loh@oracle.com>
>>
>> WIP, sizing is hardwired for now.
>> WIP, should add new pids and purge old ones (after the initial call).
>>
>> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
>> ---
>> libdtrace/dt_bpf.c | 48 +++++++++++++++++++++++++++++++
>> libdtrace/dt_dlibs.c | 1 +
>> libdtrace/dt_impl.h | 1 +
>> libdtrace/dt_prov_uprobe.c | 59 ++++++++++++++++++++++++++++++++++++++
>> 4 files changed, 109 insertions(+)
>>
>> diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c
>> index 71c6a446..3aeae274 100644
>> --- a/libdtrace/dt_bpf.c
>> +++ b/libdtrace/dt_bpf.c
>> @@ -946,6 +946,53 @@ gmap_create_probes(dtrace_hdl_t *dtp)
>> return 0;
>> }
>>
>> +/* also defined in dt_prov_uprobe.c */
> This is a red flag - why would structures be defined in two different places?
> That certainly risks really bad situations where one gets changed and not the
> other. Why is there duplication? That should never happen.
Sure. To be fair, such egregious coding is not (unfortunately)
unprecedented. Also, there is a comment in each location warning of the
other location. But, yeah, the situation is not ideal. It was a
placeholder awaiting a better solution.
One option is maybe to define these structs only in
libdtrace/dt_prov_uprobe.c. Then, have dtrace_hdl_t include (near the
map fd) two other components: one for the key size and one for the
value size. When we call the uprobe provider's populate() function, we
can set these sizes. Later, when we create the BPF map, we can retrieve
the sizes. It would mean that dtrace_hdl_t has some provider-specific
components, but I guess that line has already been crossed by having
dt_prov_pid and dt_prov_usdt.
What do you think? Do two such new "BPF map key and value size"
components to dtrace_hdl_t seem reasonable?
>> +typedef struct uprobe_map_key {
>> + pid_t pid;
>> + dtrace_id_t uprid;
>> +} uprobe_map_key_t;
>> +typedef struct uprobe_map_val {
>> + dtrace_id_t prid;
>> + long long mask;
>> +} uprobe_map_val_t;
>> +
>> +/*
>> + * Create the 'uprobes' BPF map.
> How about naming it 'prmap' or something like that? Or 'pidmap'? It has more
> to do with pids than with uprobes. (And then rename the struct names also,
> and everything else that is currently based on the uprobes map name.)
I don't get this. The map seems specific to uprobes. The pid enters
only as a component of the key; the other component is the uprobe. To
me, the map is less about pids than about uprobes, even if it is
somewhat related to both.
>> + *
>> + * Uprobe information map. This is a global hash map for use
>> + * with USDT and pid probes). It is indexed by (pid, probe ID),
>> + * using the probe ID of the underlying probe. The value is a
>> + * probe ID for the overlying probe and a bit mask indicating
>> + * which clauses to execute for this pid.
>> + *
>> + * WIP. Just make up some sizes for now.
>> + *
>> + * How big is a probe ID? Sometimes, it's a dtrace_id_t.
>> + * And uts/common/sys/dtrace_types.h gives us "typedef uint32_t dtrace_id_t".
>> + * Meanwhile, libdtrace/dt_impl.h has "uint32_t dt_probe_id".
>> + * So either uint32_t or dtrace_id_t is fine.
>> + *
>> + * How big should our bit mask be? Start with 8*sizeof(long long) bits.
>> + *
>> + * How many entries should we allow? Start with 1000.
>> + *
>> + * FIXME: is there any way two different overlying probes (differing by more
>> + * than just pid) could map to the same underlying probe?
> I expect it ought to be possible to place a pid offset probe on the location of
> a USDT probe, thereby causing two overlying probes on the same underlying one.
Good point. Yeah, I confirmed this with legacy DTrace. I set a USDT
probe and pid probe to fire at the same place. Each one reports its
uregs[R_PC]. Both fire. Both report the same PC.
But this doesn't work with DTrace on eBPF. I can set a USDT probe. I
can alternatively set a pid probe at that offset. But if I try to set
both probes at once, only one of them will fire. Presumably this is
because the uprobe trampoline, having matched on one overlying probe,
then jumps to the end. I guess that is a "bug" that could be fixed.
Apparently, we never test this scenario.
This "use case" (being generous here) seems to throw a monkey wrench
into the works. The plan was to look a PRID up in a BPF map. Well,
that scheme naively returns only one PRID, but now we're saying we could
have at least two. Also, given the PRID, the uprobe trampoline walks
through the clauses only once. Must the trampoline be set up to walk
through the clauses multiple times?
Or, can we simply not support this case (multiple overlying probes for a
given pid for the same underlying probe) for the time being?
>> + */
>> +static int
>> +gmap_create_uprobes(dtrace_hdl_t *dtp)
>> +{
>> + dtp->dt_uprobesmap_fd = create_gmap(dtp, "uprobes", BPF_MAP_TYPE_HASH,
>> + sizeof(uprobe_map_key_t), sizeof(uprobe_map_val_t), 1000);
>> + if (dtp->dt_uprobesmap_fd == -1)
>> + return -1;
>> +
>> + /* Populate the newly created map. */
>> + dt_uprobe.update(dtp, NULL);
> This seems to be something that ought to be done in a function called from
> dtrace_go() that goes through the enablings (and perhaps also the retained
> ones), and calls a provider hook for any probes that match the probe spec
> that relates to it. Something like that. And then the provider code can do
> whatever needs doing for that particular matching probe.
Hmm. Okay. I guess it ought to go between the calls to
dt_bpf_gmap_create() and dt_bpf_load_progs().
>> +
>> + return 0;
>> +}
>> +
>> /*
>> * Create the 'gvars' BPF map.
>> *
>> @@ -1051,6 +1098,7 @@ dt_bpf_gmap_create(dtrace_hdl_t *dtp)
>> CREATE_MAP(scratchmem)
>> CREATE_MAP(strtab)
>> CREATE_MAP(probes)
>> + CREATE_MAP(uprobes)
>> CREATE_MAP(gvars)
>> CREATE_MAP(lvars)
>> CREATE_MAP(dvars)
>> diff --git a/libdtrace/dt_dlibs.c b/libdtrace/dt_dlibs.c
>> index bc883e11..7bbeb02c 100644
>> --- a/libdtrace/dt_dlibs.c
>> +++ b/libdtrace/dt_dlibs.c
>> @@ -66,6 +66,7 @@ static const dt_ident_t dt_bpf_symbols[] = {
>> DT_BPF_SYMBOL(lvars, DT_IDENT_PTR),
>> DT_BPF_SYMBOL(mem, DT_IDENT_PTR),
>> DT_BPF_SYMBOL(probes, DT_IDENT_PTR),
>> + DT_BPF_SYMBOL(uprobes, DT_IDENT_PTR),
>> DT_BPF_SYMBOL(scratchmem, DT_IDENT_PTR),
>> DT_BPF_SYMBOL(specs, DT_IDENT_PTR),
>> DT_BPF_SYMBOL(state, DT_IDENT_PTR),
>> diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
>> index 1bf79d80..935b96f4 100644
>> --- a/libdtrace/dt_impl.h
>> +++ b/libdtrace/dt_impl.h
>> @@ -438,6 +438,7 @@ struct dtrace_hdl {
>> int dt_aggmap_fd; /* file descriptor for the 'aggs' BPF map */
>> int dt_genmap_fd; /* file descriptor for the 'agggen' BPF map */
>> int dt_cpumap_fd; /* file descriptor for the 'cpuinfo' BPF map */
>> + int dt_uprobesmap_fd; /* file descriptor for the 'uprobes' BPF map */
>> dtrace_handle_err_f *dt_errhdlr; /* error handler, if any */
>> void *dt_errarg; /* error handler argument */
>> dtrace_handle_drop_f *dt_drophdlr; /* drop handler, if any */
>> diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c
>> index 34906aa0..591f2fab 100644
>> --- a/libdtrace/dt_prov_uprobe.c
>> +++ b/libdtrace/dt_prov_uprobe.c
>> @@ -45,6 +45,7 @@
>> #include "dt_probe.h"
>> #include "dt_pid.h"
>> #include "dt_string.h"
>> +#include "port.h"
>>
>> /* Provider name for the underlying probes. */
>> static const char prvname[] = "uprobe";
>> @@ -70,6 +71,15 @@ typedef struct list_probe {
>> dt_probe_t *probe;
>> } list_probe_t;
>>
>> +/* uprobe_map_[key|val]_t are also defined in dt_bpf.c */
>> +typedef struct uprobe_map_key {
>> + pid_t pid;
>> + dtrace_id_t uprid;
>> +} uprobe_map_key_t;
>> +typedef struct uprobe_map_val {
>> + dtrace_id_t prid;
>> + long long mask;
>> +} uprobe_map_val_t;
>> typedef struct list_clause {
>> dt_list_t list;
>> uint_t clause;
>> @@ -135,6 +145,54 @@ static void probe_destroy(dtrace_hdl_t *dtp, void *datap)
>> free_probe_list(dtp, datap);
>> }
>>
>> +/*
>> + * Update the uprobe provider.
>> + */
>> +static void update_uprobe(dtrace_hdl_t *dtp, void *datap)
>> +{
>> + dt_probe_t *prp;
>> +
>> + for (prp = dt_list_next(&dtp->dt_enablings); prp != NULL;
>> + prp = dt_list_next(prp)) {
>> + pid_t pid;
>> + const list_probe_t *pup;
>> + dt_probe_t *uprp;
>> + long long mask = 0, bit = 1;
>> + list_clause_t *cl;
>> + uprobe_map_key_t key;
>> + uprobe_map_val_t val;
>> +
>> + /* Make sure it is an overlying pid or USDT probe. */
>> + if (prp->prov->impl != &dt_pid && prp->prov->impl != &dt_usdt)
>> + continue;
>> +
>> + /* FIXME passing in NULL pcb and dpr wreaks havoc on error reporting? */
>> + pid = dt_pid_get_pid(prp->desc, dtp, NULL, NULL);
>> +
>> + for (pup = prp->prv_data; pup != NULL; pup = dt_list_next(pup)) {
>> + dt_uprobe_t *upp;
>> +
>> + uprp = pup->probe;
>> + upp = uprp->prv_data;
>> + for (cl = dt_list_next(&upp->clauses); cl != NULL; cl = dt_list_next(cl)) {
>> + if (gmatch(prp->desc->prv, dtp->dt_uprovdescs[cl->clause]))
>> + mask |= bit;
>> +
>> + bit <<= 1;
>> + }
>> +
>> + key.pid = pid;
>> + key.uprid = uprp->desc->id;
>> +
>> + val.prid = prp->desc->id;
>> + val.mask = mask;
>> +
>> + // FIXME check return value
>> + dt_bpf_map_update(dtp->dt_uprobesmap_fd, &key, &val);
>> + }
>> + }
>> +}
>> +
>>
>> /*
>> * Look up or create an underlying (real) probe, corresponding directly to a
>> @@ -803,6 +861,7 @@ dt_provimpl_t dt_uprobe = {
>> .probe_info = &probe_info,
>> .detach = &detach,
>> .probe_destroy = &probe_destroy_underlying,
>> + .update = &update_uprobe,
>> };
>>
>> /*
>> --
>> 2.18.4
>>
>>
>> _______________________________________________
>> DTrace-devel mailing list
>> DTrace-devel@oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/dtrace-devel
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [DTrace-devel] [PATCH 04/14] Track uprobe provider descriptions
2024-06-07 22:16 ` Kris Van Hees
@ 2024-06-10 21:23 ` Eugene Loh
2024-06-10 21:31 ` Kris Van Hees
0 siblings, 1 reply; 31+ messages in thread
From: Eugene Loh @ 2024-06-10 21:23 UTC (permalink / raw)
To: dtrace, dtrace-devel
On 6/7/24 18:16, Kris Van Hees wrote:
> On Fri, Jun 07, 2024 at 05:40:51PM -0400, Eugene Loh wrote:
>> On 6/4/24 17:10, Kris Van Hees wrote:
>>
>>> On Tue, Jun 04, 2024 at 02:11:03PM -0400, eugene.loh--- via DTrace-devel wrote:
>>>> From: Eugene Loh <eugene.loh@oracle.com>
>>>>
>>>> A uprobe will have to decide which of its clauses to run for a
>>>> given pid. For now, keep the provider description for each clause.
>>>> There are many shortcomings to the way this is done in this patch,
>>>> but it is quick and dirty and helps us bootstrap real support.
>>>> Clean this up later, probably turning it into a growable array.
>>> This is the wrong way to look at it, I think. The underlying concept is that
>>> DTrace supports specifying probes that do not match anything yet (-Z). Those
>>> are called retained enablings and they can exist for more than just uprobes.
>>> So this needs to be more generic.
>> Okay. I looked at legacy-DTrace retained enablings. They seem to be a
>> linked list -- (mainly) of arrays of dtrace_ecbdesc pointers. And
>> dtrace_ecbdesc has dtrace_probedesc, dtrace_preddesc, and dtrace_actdesc.
>> In DTrace on eBPF, dtrace_ecbdesc is much simpler -- it's really just a
>> dtrace_probedesc. So it seems to me we could just make our retained
>> enablings be an array of probe desc pointers. That would be a small change
>> to this (small) patch.
>>
>>> We also shouldn't do this non-discriminately for all clauses all the time.
>>> Obviously, when -Z is not used, there is no point in doing this because there
>>> cannot be a match-after-start.
>> FWIW, that's not how legacy DTrace seems to work: there *can* be
>> match-after-start without -Z. The -Z option only addresses whether there is
>> an initial match. If there is, then -Z is not needed, not even for
>> subsequent matches to work.
> Yes, I didnt think of that. But -Z is still relevant in view of the stuff I
> mention below...
>
>>> Even when -Z is specified, you might be able
>>> to determine whether the probe specification for a probe matches perfectly
>>> or whether it needs to be retained for match-after-start.
>> Maybe. But what if you have pid1234:a.out:main:go, but pid 1234 has not yet
>> started up? (Yes, that's an extremely weird example.)
> No, when a pid* probe is not able to be resolved, dtrace flags an error:
>
> $ sudo ./build/run-dtrace -n 'pid12:a.out:main:go,BEGIN { exit(0); }'
> dtrace: invalid probe specifier pid12:a.out:main:go,BEGIN { exit(0); }: failed to grab process 12
> # sudo ./build/run-dtrace -Zn 'pid12:a.out:main:go,BEGIN { exit(0); }'
> dtrace: invalid probe specifier pid12:a.out:main:go,BEGIN { exit(0); }: failed to grab process 12
Hmm, seems a little strange to me. I do not understand why the pid
provider should act like that. With -Z, why not silently ignore probes
that aren't legit or for pids that do not exist yet? Anyhow, that's
maybe not important to think about at this point.
>> Mostly (going back to your "shouldn't non-discriminately" comment), trying
>> to decide whether to keep a probe desc seems unnecessary. Just keep them
>> all. They will likely be dwarfed by dtp->dt_probes[] anyhow.
> The issue is not space... but any retained enabling would have to be
> re-evaluated everytime we need to see if there are new probes to be enabled.
> And this will be happening while we are already tracing, so avoiding needless
> matching is probably worth it.
Good point. I suspect it is not yet important to identify all the
retained enablings that do not need to be retained. We can start the
implementation with all of them. As we think of cases where retained
enablings do not have to be re-evaluated, we can teach the
implementation later.
>>> So, let's put some thought into designing this in the more generic case, so
>>> that we can avoid going down a rabbit hole that gets tough to recover from.
>> What do you think of:
>>
>> struct dtrace_hdl {
>> [...]
>> dt_list_t dt_enablings; /* list of (to be) enabled probes */
>> + dtrace_probedesc_t **dt_retained;
>> + int dt_nretained;
>> + int dt_maxretained;
> I don't think you need both varoables. Just keeping dt_retained_sz (which is
> your dt_maxretained, but made consistent with dt_probes_sz) is enough since
> you can know what the last entry is when looping over it by checking if the
> element is NULL. There should never be gaps, so the first NULL indicates that
> you reached the end.
Good point. I simply was mimicking legacy DTrace, but that's not necessary.
>> struct dt_xlator **dt_xlatormap; /* dt_xlator_t's indexed by dx_id
>> */
>> [...]
>> }
>>
>>
>>> (And whatever we store the retained enablings in probably should be added to
>>> dtrace_hdl_t after dt_enablings.)
>> No problem.
>>
>>>> diff --git a/libdtrace/dt_cc.c b/libdtrace/dt_cc.c
>>>> @@ -215,6 +215,7 @@ dt_compile_one_clause(dtrace_hdl_t *dtp, dt_node_t *cnp, dt_node_t *pnp)
>>>> */
>>>> dt_cg(yypcb, cnp);
>>>> sdp->dtsd_clause = dt_clause_create(dtp, dt_as(yypcb));
>>>> + dtp->dt_uprovdescs[dtp->dt_clause_nextid - 1] = strdup(pnp->dn_desc->prv);
>>>> assert(yypcb->pcb_stmt == sdp);
>>>> if (dtrace_stmt_add(yypcb->pcb_hdl, yypcb->pcb_prog, sdp) != 0)
>>>> diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
>>>> @@ -310,6 +310,7 @@ struct dtrace_hdl {
>>>> dt_pcb_t *dt_pcb; /* pointer to current parsing control block */
>>>> ulong_t dt_gen; /* compiler generation number */
>>>> uint_t dt_clause_nextid; /* next ID to use for programs */
>>>> + char *dt_uprovdescs[256]; /* uprobe provider descriptor per clause... FIXME turn this into a growable array */
>>>> dt_list_t dt_programs; /* linked list of dtrace_prog_t's */
>>>> dt_list_t dt_xlators; /* linked list of dt_xlator_t's */
>>>> dt_list_t dt_enablings; /* list of (to be) enabled probes */
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [DTrace-devel] [PATCH 04/14] Track uprobe provider descriptions
2024-06-10 21:23 ` Eugene Loh
@ 2024-06-10 21:31 ` Kris Van Hees
0 siblings, 0 replies; 31+ messages in thread
From: Kris Van Hees @ 2024-06-10 21:31 UTC (permalink / raw)
To: Eugene Loh; +Cc: dtrace, dtrace-devel
On Mon, Jun 10, 2024 at 05:23:48PM -0400, Eugene Loh wrote:
> On 6/7/24 18:16, Kris Van Hees wrote:
>
> > On Fri, Jun 07, 2024 at 05:40:51PM -0400, Eugene Loh wrote:
> > > On 6/4/24 17:10, Kris Van Hees wrote:
> > >
> > > > On Tue, Jun 04, 2024 at 02:11:03PM -0400, eugene.loh--- via DTrace-devel wrote:
> > > > > From: Eugene Loh <eugene.loh@oracle.com>
> > > > >
> > > > > A uprobe will have to decide which of its clauses to run for a
> > > > > given pid. For now, keep the provider description for each clause.
> > > > > There are many shortcomings to the way this is done in this patch,
> > > > > but it is quick and dirty and helps us bootstrap real support.
> > > > > Clean this up later, probably turning it into a growable array.
> > > > This is the wrong way to look at it, I think. The underlying concept is that
> > > > DTrace supports specifying probes that do not match anything yet (-Z). Those
> > > > are called retained enablings and they can exist for more than just uprobes.
> > > > So this needs to be more generic.
> > > Okay. I looked at legacy-DTrace retained enablings. They seem to be a
> > > linked list -- (mainly) of arrays of dtrace_ecbdesc pointers. And
> > > dtrace_ecbdesc has dtrace_probedesc, dtrace_preddesc, and dtrace_actdesc.
> > > In DTrace on eBPF, dtrace_ecbdesc is much simpler -- it's really just a
> > > dtrace_probedesc. So it seems to me we could just make our retained
> > > enablings be an array of probe desc pointers. That would be a small change
> > > to this (small) patch.
> > >
> > > > We also shouldn't do this non-discriminately for all clauses all the time.
> > > > Obviously, when -Z is not used, there is no point in doing this because there
> > > > cannot be a match-after-start.
> > > FWIW, that's not how legacy DTrace seems to work: there *can* be
> > > match-after-start without -Z. The -Z option only addresses whether there is
> > > an initial match. If there is, then -Z is not needed, not even for
> > > subsequent matches to work.
> > Yes, I didnt think of that. But -Z is still relevant in view of the stuff I
> > mention below...
> >
> > > > Even when -Z is specified, you might be able
> > > > to determine whether the probe specification for a probe matches perfectly
> > > > or whether it needs to be retained for match-after-start.
> > > Maybe. But what if you have pid1234:a.out:main:go, but pid 1234 has not yet
> > > started up? (Yes, that's an extremely weird example.)
> > No, when a pid* probe is not able to be resolved, dtrace flags an error:
> >
> > $ sudo ./build/run-dtrace -n 'pid12:a.out:main:go,BEGIN { exit(0); }'
> > dtrace: invalid probe specifier pid12:a.out:main:go,BEGIN { exit(0); }: failed to grab process 12
> > # sudo ./build/run-dtrace -Zn 'pid12:a.out:main:go,BEGIN { exit(0); }'
> > dtrace: invalid probe specifier pid12:a.out:main:go,BEGIN { exit(0); }: failed to grab process 12
>
> Hmm, seems a little strange to me. I do not understand why the pid provider
> should act like that. With -Z, why not silently ignore probes that aren't
> legit or for pids that do not exist yet? Anyhow, that's maybe not important
> to think about at this point.
We *may* be able to change that in the future, but the main reason is probably
simply that DTrace needs the pid to be able to resolve the rest of the probe
specification. I.e. you cannot determine whether a pid offset probe has a
valid offset without looking at the instructions, and for that you need to know
what object to look at, i.e. the pid needs to be able to be resolved into an
actual process/task.
> > > Mostly (going back to your "shouldn't non-discriminately" comment), trying
> > > to decide whether to keep a probe desc seems unnecessary. Just keep them
> > > all. They will likely be dwarfed by dtp->dt_probes[] anyhow.
> > The issue is not space... but any retained enabling would have to be
> > re-evaluated everytime we need to see if there are new probes to be enabled.
> > And this will be happening while we are already tracing, so avoiding needless
> > matching is probably worth it.
>
> Good point. I suspect it is not yet important to identify all the retained
> enablings that do not need to be retained. We can start the implementation
> with all of them. As we think of cases where retained enablings do not have
> to be re-evaluated, we can teach the implementation later.
Weell, retained enablings always have to be retained because processes can come
and go. If it is known that an enabling is meant to match future instances,
it is equally known that such matching may need to be done multiple times as
processes come and go.
> > > > So, let's put some thought into designing this in the more generic case, so
> > > > that we can avoid going down a rabbit hole that gets tough to recover from.
> > > What do you think of:
> > >
> > > struct dtrace_hdl {
> > > [...]
> > > dt_list_t dt_enablings; /* list of (to be) enabled probes */
> > > + dtrace_probedesc_t **dt_retained;
> > > + int dt_nretained;
> > > + int dt_maxretained;
> > I don't think you need both varoables. Just keeping dt_retained_sz (which is
> > your dt_maxretained, but made consistent with dt_probes_sz) is enough since
> > you can know what the last entry is when looping over it by checking if the
> > element is NULL. There should never be gaps, so the first NULL indicates that
> > you reached the end.
>
> Good point. I simply was mimicking legacy DTrace, but that's not necessary.
>
> > > struct dt_xlator **dt_xlatormap; /* dt_xlator_t's indexed by dx_id
> > > */
> > > [...]
> > > }
> > >
> > >
> > > > (And whatever we store the retained enablings in probably should be added to
> > > > dtrace_hdl_t after dt_enablings.)
> > > No problem.
> > >
> > > > > diff --git a/libdtrace/dt_cc.c b/libdtrace/dt_cc.c
> > > > > @@ -215,6 +215,7 @@ dt_compile_one_clause(dtrace_hdl_t *dtp, dt_node_t *cnp, dt_node_t *pnp)
> > > > > */
> > > > > dt_cg(yypcb, cnp);
> > > > > sdp->dtsd_clause = dt_clause_create(dtp, dt_as(yypcb));
> > > > > + dtp->dt_uprovdescs[dtp->dt_clause_nextid - 1] = strdup(pnp->dn_desc->prv);
> > > > > assert(yypcb->pcb_stmt == sdp);
> > > > > if (dtrace_stmt_add(yypcb->pcb_hdl, yypcb->pcb_prog, sdp) != 0)
> > > > > diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
> > > > > @@ -310,6 +310,7 @@ struct dtrace_hdl {
> > > > > dt_pcb_t *dt_pcb; /* pointer to current parsing control block */
> > > > > ulong_t dt_gen; /* compiler generation number */
> > > > > uint_t dt_clause_nextid; /* next ID to use for programs */
> > > > > + char *dt_uprovdescs[256]; /* uprobe provider descriptor per clause... FIXME turn this into a growable array */
> > > > > dt_list_t dt_programs; /* linked list of dtrace_prog_t's */
> > > > > dt_list_t dt_xlators; /* linked list of dt_xlator_t's */
> > > > > dt_list_t dt_enablings; /* list of (to be) enabled probes */
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [DTrace-devel] [PATCH 05/14] Add a hook for a provider-specific "update" function
2024-06-04 21:38 ` [DTrace-devel] " Kris Van Hees
@ 2024-06-10 22:14 ` Eugene Loh
0 siblings, 0 replies; 31+ messages in thread
From: Eugene Loh @ 2024-06-10 22:14 UTC (permalink / raw)
To: dtrace, dtrace-devel
On 6/4/24 17:38, Kris Van Hees wrote:
> Much like the previous patch not considering the generic nature of retained
> enablings, this one should be part of that design also. The functionality
> we need to handle match-after-start probe specifications really should be
> independent from any specific provider. It may (well, almost certainly will)
> require a hook in providers that need to do special handling. But that would
> be a specific hook for a specific purpose - not just an 'update' hook.
I guess I do not know what you're asking for here. I think we agree
that a provider-specific hook is needed. So, what scope or purpose
should that hook have? The immediate purpose at hand is to update the
set of pids we're tracking for uprobes. I'm open to thinking about this
in other ways, but then some specific proposals would be helpful.
> On Tue, Jun 04, 2024 at 02:11:04PM -0400, eugene.loh--- via DTrace-devel wrote:
>> From: Eugene Loh <eugene.loh@oracle.com>
>>
>> For up-coming USDT-probe support, we need to update a BPF map
>> -- at least when the dtrace session starts but possibly also later
>> to support systemwide USDT tracing for processes that may start up
>> later.
>>
>> One way to do this is with a USDT-specific update function.
>>
>> For now, let's add a hook for providers to have provider-specific
>> update functions. User space can either call
>>
>> for (i = 0; i < ARRAY_SIZE(dt_providers); i++) {
>> if (dt_providers[i]->update)
>> dt_providers[i]->update(...);
>> }
>>
>> any time it likes. Or it can call dt_usdt.update(...).
>>
>> This is for WIP. A different approach can be adopted later instead.
>>
>> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
>> ---
>> libdtrace/dt_provider.h | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/libdtrace/dt_provider.h b/libdtrace/dt_provider.h
>> index 17b1844c..21ff15ad 100644
>> --- a/libdtrace/dt_provider.h
>> +++ b/libdtrace/dt_provider.h
>> @@ -68,6 +68,8 @@ typedef struct dt_provimpl {
>> void *datap);
>> void (*destroy)(dtrace_hdl_t *dtp, /* free provider data */
>> void *datap);
>> + void (*update)(dtrace_hdl_t *dtp, /* update provider-specific info */
>> + void *datap);
>> } dt_provimpl_t;
>>
>> /* list dt_dtrace first */
>> --
>> 2.18.4
>>
>>
>> _______________________________________________
>> DTrace-devel mailing list
>> DTrace-devel@oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/dtrace-devel
^ permalink raw reply [flat|nested] 31+ messages in thread
end of thread, other threads:[~2024-06-10 22:14 UTC | newest]
Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-04 18:10 "proof of concept" for systemwide USDT eugene.loh
2024-06-04 18:11 ` [PATCH 01/14] Move comment closer to the code it describes eugene.loh
2024-06-04 18:21 ` [DTrace-devel] " Kris Van Hees
2024-06-04 18:11 ` [PATCH 02/14] Clean up prp/uprp variable names eugene.loh
2024-06-04 18:44 ` [DTrace-devel] " Kris Van Hees
2024-06-05 18:18 ` Eugene Loh
2024-06-04 18:11 ` [PATCH 03/14] Let USDT module names contain dots eugene.loh
2024-06-04 20:42 ` [DTrace-devel] " Kris Van Hees
2024-06-04 22:30 ` Eugene Loh
2024-06-07 18:48 ` Nick Alcock
2024-06-07 22:22 ` Eugene Loh
2024-06-04 18:11 ` [PATCH 04/14] Track uprobe provider descriptions eugene.loh
2024-06-04 21:10 ` [DTrace-devel] " Kris Van Hees
2024-06-07 21:40 ` Eugene Loh
2024-06-07 22:16 ` Kris Van Hees
2024-06-10 21:23 ` Eugene Loh
2024-06-10 21:31 ` Kris Van Hees
2024-06-04 18:11 ` [PATCH 05/14] Add a hook for a provider-specific "update" function eugene.loh
2024-06-04 21:38 ` [DTrace-devel] " Kris Van Hees
2024-06-10 22:14 ` Eugene Loh
2024-06-04 18:11 ` [PATCH 06/14] Add clauses to per-uprobe list eugene.loh
2024-06-04 18:11 ` [PATCH 07/14] Create the BPF uprobes map eugene.loh
2024-06-05 4:33 ` [DTrace-devel] " Kris Van Hees
2024-06-10 20:55 ` Eugene Loh
2024-06-04 18:11 ` [PATCH 08/14] Use uprobes map to call clauses conditionally eugene.loh
2024-06-04 18:11 ` [PATCH 09/14] Systemwide USDT WIP eugene.loh
2024-06-04 18:11 ` [PATCH 10/14] Fix the consumer's picture of the EPID eugene.loh
2024-06-04 18:11 ` [PATCH 11/14] Back out the previous patch eugene.loh
2024-06-04 18:11 ` [PATCH 12/14] Fix comments that hardwire DBUF_ offsets eugene.loh
2024-06-04 18:11 ` [PATCH 13/14] Clean up some comments eugene.loh
2024-06-04 18:11 ` [PATCH 14/14] Have the consumer get the PRID from the output buffer eugene.loh
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox