Linux DTrace development list
 help / color / mirror / Atom feed
* [PATCH 19/38] Retain probe descriptions
@ 2024-06-27  5:38 eugene.loh
  2024-07-18 21:25 ` Kris Van Hees
  0 siblings, 1 reply; 5+ messages in thread
From: eugene.loh @ 2024-06-27  5:38 UTC (permalink / raw)
  To: dtrace, dtrace-devel

From: Eugene Loh <eugene.loh@oracle.com>

To support matching probes after dtrace has started, we retain
probe descriptions.

Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
---
 libdtrace/dt_impl.h    |  2 ++
 libdtrace/dt_program.c | 37 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
index 2132eda2..445cd602 100644
--- a/libdtrace/dt_impl.h
+++ b/libdtrace/dt_impl.h
@@ -300,6 +300,8 @@ struct dtrace_hdl {
 	dt_list_t dt_programs;	/* linked list of dtrace_prog_t's */
 	dt_list_t dt_xlators;	/* linked list of dt_xlator_t's */
 	dt_list_t dt_enablings;	/* list of (to be) enabled probes */
+	dtrace_probedesc_t **dt_retained; /* array of retained pdescs */
+	int dt_maxretained;	/* number of retained pdescs allocated */
 	struct dt_xlator **dt_xlatormap; /* dt_xlator_t's indexed by dx_id */
 	id_t dt_xlatorid;	/* next dt_xlator_t id to assign */
 	dt_ident_t *dt_externs;	/* linked list of external symbol identifiers */
diff --git a/libdtrace/dt_program.c b/libdtrace/dt_program.c
index a4b052fc..7106c249 100644
--- a/libdtrace/dt_program.c
+++ b/libdtrace/dt_program.c
@@ -163,7 +163,42 @@ dt_prog_stmt(dtrace_hdl_t *dtp, dtrace_prog_t *pgp, dtrace_stmtdesc_t *sdp,
 {
 	pi_state_t		st;
 	dtrace_probedesc_t	*pdp = &sdp->dtsd_ecbdesc->dted_probe;
-	int			rc;
+	char			*sclause = sdp->dtsd_clause->di_name;
+	int			rc, nclause;
+
+	/* Get the clause number from the clause name "dt_clause_"$n. */
+	assert(strncmp(sclause, "dt_clause_", strlen("dt_clause_")) == 0);    // FIXME can probably drop this check (make sure the name starts "dt_clause_")
+	nclause = atoi(sclause + strlen("dt_clause_"));
+	assert(nclause < 65536);	// FIXME are we okay here with some arbitrary limit?
+
+	/* Grow the list of retained pdescs, if necessary. */
+	if (nclause >= dtp->dt_maxretained) {
+		int			newm;
+		dtrace_probedesc_t	**newr;
+
+		/* Figure how big to make the array. */
+		if (dtp->dt_maxretained > 0)
+			newm = 2 * dtp->dt_maxretained;
+		else
+			newm = 8;
+		while (newm <= nclause)
+			newm *= 2;
+
+		/* Allocate the bigger array. */
+		newr = dt_zalloc(dtp, newm * sizeof(dtrace_probedesc_t *));
+		if (newr == NULL)
+			return 0;
+
+		/* Copy data and reset the parameters. */
+		memcpy(newr, dtp->dt_retained, dtp->dt_maxretained * sizeof(dtrace_probedesc_t *));
+		dt_free(dtp, dtp->dt_retained);
+		dtp->dt_retained = newr;
+		dtp->dt_maxretained = newm;
+	}
+
+	/* Add probe description to retained pdescs. */
+	assert(dtp->dt_retained[nclause] == NULL);  // FIXME can probably drop this
+	dtp->dt_retained[nclause] = pdp;        // FIXME I think just pointing to pdp is okay.  No need to cache my own copy.
 
 	st.cnt = cnt;
 	st.idp = sdp->dtsd_clause;
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 19/38] Retain probe descriptions
  2024-06-27  5:38 [PATCH 19/38] Retain probe descriptions eugene.loh
@ 2024-07-18 21:25 ` Kris Van Hees
  2024-07-19  3:34   ` Eugene Loh
  0 siblings, 1 reply; 5+ messages in thread
From: Kris Van Hees @ 2024-07-18 21:25 UTC (permalink / raw)
  To: eugene.loh; +Cc: dtrace, dtrace-devel

I think this needs an actual description of what you are trying to do, because
retaining pdescs in and of itself does not seem meaningful.  They do not
contain a reference to the clause they were associated with when the code was
parsed and compiled) and it seems to me that you are doing a sort-of reverse
association, and are actually (indirectly) retaining clauses by keeping a
link of pdescs, one per clause.  And you identify the clause by its numeric
id as obtained from its identifier name (which is dynamically assigned during
compilation).  That seems fragile and rather non-intuitive.

If memory serves me right, the intent was to retain enablings.  Of course, in
the current implementation of DTrace, enablings are nothing more than probes
because the probes have BPF programs associated with them (unless they are
overlying probes, in which case the actual code that gets loaded is part of the
BPF program of the underlying probe (which will be in the enablings list).

When we are retaining enablings, it seems to me that it would make more sense
to keep a list of objects { pdesc, clause } where clause would be the
identifier name of the compiled clause (similar to a function name).  Using a
pointer to the actual clause instead is (I think) not going to work becauase
AFAIK the clause does not store a pointer to its own identiifier.

In the end, this will largely do the same thing as your code (here and when it
gets used later), but it removes the implied dependency on how identifier names
for clauses are generated.  That could change in the future, and doing so would
completely break this imlementation.

On Thu, Jun 27, 2024 at 01:38:10AM -0400, eugene.loh@oracle.com wrote:
> From: Eugene Loh <eugene.loh@oracle.com>
> 
> To support matching probes after dtrace has started, we retain
> probe descriptions.
> 
> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
> ---
>  libdtrace/dt_impl.h    |  2 ++
>  libdtrace/dt_program.c | 37 ++++++++++++++++++++++++++++++++++++-
>  2 files changed, 38 insertions(+), 1 deletion(-)
> 
> diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
> index 2132eda2..445cd602 100644
> --- a/libdtrace/dt_impl.h
> +++ b/libdtrace/dt_impl.h
> @@ -300,6 +300,8 @@ struct dtrace_hdl {
>  	dt_list_t dt_programs;	/* linked list of dtrace_prog_t's */
>  	dt_list_t dt_xlators;	/* linked list of dt_xlator_t's */
>  	dt_list_t dt_enablings;	/* list of (to be) enabled probes */
> +	dtrace_probedesc_t **dt_retained; /* array of retained pdescs */
> +	int dt_maxretained;	/* number of retained pdescs allocated */
>  	struct dt_xlator **dt_xlatormap; /* dt_xlator_t's indexed by dx_id */
>  	id_t dt_xlatorid;	/* next dt_xlator_t id to assign */
>  	dt_ident_t *dt_externs;	/* linked list of external symbol identifiers */
> diff --git a/libdtrace/dt_program.c b/libdtrace/dt_program.c
> index a4b052fc..7106c249 100644
> --- a/libdtrace/dt_program.c
> +++ b/libdtrace/dt_program.c
> @@ -163,7 +163,42 @@ dt_prog_stmt(dtrace_hdl_t *dtp, dtrace_prog_t *pgp, dtrace_stmtdesc_t *sdp,
>  {
>  	pi_state_t		st;
>  	dtrace_probedesc_t	*pdp = &sdp->dtsd_ecbdesc->dted_probe;
> -	int			rc;
> +	char			*sclause = sdp->dtsd_clause->di_name;
> +	int			rc, nclause;
> +
> +	/* Get the clause number from the clause name "dt_clause_"$n. */
> +	assert(strncmp(sclause, "dt_clause_", strlen("dt_clause_")) == 0);    // FIXME can probably drop this check (make sure the name starts "dt_clause_")
> +	nclause = atoi(sclause + strlen("dt_clause_"));
> +	assert(nclause < 65536);	// FIXME are we okay here with some arbitrary limit?
> +
> +	/* Grow the list of retained pdescs, if necessary. */
> +	if (nclause >= dtp->dt_maxretained) {
> +		int			newm;
> +		dtrace_probedesc_t	**newr;
> +
> +		/* Figure how big to make the array. */
> +		if (dtp->dt_maxretained > 0)
> +			newm = 2 * dtp->dt_maxretained;
> +		else
> +			newm = 8;
> +		while (newm <= nclause)
> +			newm *= 2;
> +
> +		/* Allocate the bigger array. */
> +		newr = dt_zalloc(dtp, newm * sizeof(dtrace_probedesc_t *));
> +		if (newr == NULL)
> +			return 0;
> +
> +		/* Copy data and reset the parameters. */
> +		memcpy(newr, dtp->dt_retained, dtp->dt_maxretained * sizeof(dtrace_probedesc_t *));
> +		dt_free(dtp, dtp->dt_retained);
> +		dtp->dt_retained = newr;
> +		dtp->dt_maxretained = newm;
> +	}
> +
> +	/* Add probe description to retained pdescs. */
> +	assert(dtp->dt_retained[nclause] == NULL);  // FIXME can probably drop this
> +	dtp->dt_retained[nclause] = pdp;        // FIXME I think just pointing to pdp is okay.  No need to cache my own copy.
>  
>  	st.cnt = cnt;
>  	st.idp = sdp->dtsd_clause;
> -- 
> 2.18.4
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 19/38] Retain probe descriptions
  2024-07-18 21:25 ` Kris Van Hees
@ 2024-07-19  3:34   ` Eugene Loh
  2024-07-19 19:52     ` Kris Van Hees
  0 siblings, 1 reply; 5+ messages in thread
From: Eugene Loh @ 2024-07-19  3:34 UTC (permalink / raw)
  To: dtrace, dtrace-devel

What do you think about replacing the array dt_retained[n] (as in,
"dt_clause_$n") with a hash table that uses a string (like
"dt_clause_$n") to look up a probe description?

On 7/18/24 17:25, Kris Van Hees wrote:

> I think this needs an actual description of what you are trying to do,

An underlying probe gets a list of clauses to call in its trampoline.  
Actually, the underlying probe looks up a mask in some BPF map and then 
walks the clauses, calling them conditionally based on the mask bits.

To construct the mask for a particular overlying probe, we walk the 
clauses.  Each clause is associated with a particular probe 
description.  We compare the overlying probe to the respective probe 
description to decide whether to set that particular bit.

So, a uprp has a list of clauses.  We want some way of getting from 
clauses to probe descriptions.  The linear array using that $n was one 
way.  I can move over to the hash table based on clause string name if 
you prefer.

(Incidentally, a uprp's clause list is set during dt_probe_add_clause(), 
at which point we do not know the motivating probe description.

> because
> retaining pdescs in and of itself does not seem meaningful.  They do not
> contain a reference to the clause they were associated with when the code was
> parsed and compiled) and it seems to me that you are doing a sort-of reverse
> association, and are actually (indirectly) retaining clauses by keeping a
> link of pdescs, one per clause.  And you identify the clause by its numeric
> id as obtained from its identifier name (which is dynamically assigned during
> compilation).  That seems fragile and rather non-intuitive.
>
> If memory serves me right, the intent was to retain enablings.  Of course, in
> the current implementation of DTrace, enablings are nothing more than probes
> because the probes have BPF programs associated with them (unless they are
> overlying probes, in which case the actual code that gets loaded is part of the
> BPF program of the underlying probe (which will be in the enablings list).
>
> When we are retaining enablings, it seems to me that it would make more sense
> to keep a list of objects { pdesc, clause } where clause would be the
> identifier name of the compiled clause (similar to a function name).  Using a
> pointer to the actual clause instead is (I think) not going to work becauase
> AFAIK the clause does not store a pointer to its own identiifier.
>
> In the end, this will largely do the same thing as your code (here and when it
> gets used later), but it removes the implied dependency on how identifier names
> for clauses are generated.  That could change in the future, and doing so would
> completely break this imlementation.
>
> On Thu, Jun 27, 2024 at 01:38:10AM -0400, eugene.loh@oracle.com wrote:
>> From: Eugene Loh <eugene.loh@oracle.com>
>>
>> To support matching probes after dtrace has started, we retain
>> probe descriptions.
>>
>> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
>> ---
>>   libdtrace/dt_impl.h    |  2 ++
>>   libdtrace/dt_program.c | 37 ++++++++++++++++++++++++++++++++++++-
>>   2 files changed, 38 insertions(+), 1 deletion(-)
>>
>> diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
>> index 2132eda2..445cd602 100644
>> --- a/libdtrace/dt_impl.h
>> +++ b/libdtrace/dt_impl.h
>> @@ -300,6 +300,8 @@ struct dtrace_hdl {
>>   	dt_list_t dt_programs;	/* linked list of dtrace_prog_t's */
>>   	dt_list_t dt_xlators;	/* linked list of dt_xlator_t's */
>>   	dt_list_t dt_enablings;	/* list of (to be) enabled probes */
>> +	dtrace_probedesc_t **dt_retained; /* array of retained pdescs */
>> +	int dt_maxretained;	/* number of retained pdescs allocated */
>>   	struct dt_xlator **dt_xlatormap; /* dt_xlator_t's indexed by dx_id */
>>   	id_t dt_xlatorid;	/* next dt_xlator_t id to assign */
>>   	dt_ident_t *dt_externs;	/* linked list of external symbol identifiers */
>> diff --git a/libdtrace/dt_program.c b/libdtrace/dt_program.c
>> index a4b052fc..7106c249 100644
>> --- a/libdtrace/dt_program.c
>> +++ b/libdtrace/dt_program.c
>> @@ -163,7 +163,42 @@ dt_prog_stmt(dtrace_hdl_t *dtp, dtrace_prog_t *pgp, dtrace_stmtdesc_t *sdp,
>>   {
>>   	pi_state_t		st;
>>   	dtrace_probedesc_t	*pdp = &sdp->dtsd_ecbdesc->dted_probe;
>> -	int			rc;
>> +	char			*sclause = sdp->dtsd_clause->di_name;
>> +	int			rc, nclause;
>> +
>> +	/* Get the clause number from the clause name "dt_clause_"$n. */
>> +	assert(strncmp(sclause, "dt_clause_", strlen("dt_clause_")) == 0);    // FIXME can probably drop this check (make sure the name starts "dt_clause_")
>> +	nclause = atoi(sclause + strlen("dt_clause_"));
>> +	assert(nclause < 65536);	// FIXME are we okay here with some arbitrary limit?
>> +
>> +	/* Grow the list of retained pdescs, if necessary. */
>> +	if (nclause >= dtp->dt_maxretained) {
>> +		int			newm;
>> +		dtrace_probedesc_t	**newr;
>> +
>> +		/* Figure how big to make the array. */
>> +		if (dtp->dt_maxretained > 0)
>> +			newm = 2 * dtp->dt_maxretained;
>> +		else
>> +			newm = 8;
>> +		while (newm <= nclause)
>> +			newm *= 2;
>> +
>> +		/* Allocate the bigger array. */
>> +		newr = dt_zalloc(dtp, newm * sizeof(dtrace_probedesc_t *));
>> +		if (newr == NULL)
>> +			return 0;
>> +
>> +		/* Copy data and reset the parameters. */
>> +		memcpy(newr, dtp->dt_retained, dtp->dt_maxretained * sizeof(dtrace_probedesc_t *));
>> +		dt_free(dtp, dtp->dt_retained);
>> +		dtp->dt_retained = newr;
>> +		dtp->dt_maxretained = newm;
>> +	}
>> +
>> +	/* Add probe description to retained pdescs. */
>> +	assert(dtp->dt_retained[nclause] == NULL);  // FIXME can probably drop this
>> +	dtp->dt_retained[nclause] = pdp;        // FIXME I think just pointing to pdp is okay.  No need to cache my own copy.
>>   
>>   	st.cnt = cnt;
>>   	st.idp = sdp->dtsd_clause;
>> -- 
>> 2.18.4
>>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 19/38] Retain probe descriptions
  2024-07-19  3:34   ` Eugene Loh
@ 2024-07-19 19:52     ` Kris Van Hees
  2024-07-22 18:37       ` Eugene Loh
  0 siblings, 1 reply; 5+ messages in thread
From: Kris Van Hees @ 2024-07-19 19:52 UTC (permalink / raw)
  To: Eugene Loh; +Cc: dtrace, dtrace-devel

On Thu, Jul 18, 2024 at 11:34:05PM -0400, Eugene Loh wrote:
> What do you think about replacing the array dt_retained[n] (as in,
> "dt_clause_$n") with a hash table that uses a string (like
> "dt_clause_$n") to look up a probe description?
> 
> On 7/18/24 17:25, Kris Van Hees wrote:
> 
> > I think this needs an actual description of what you are trying to do,
> 
> An underlying probe gets a list of clauses to call in its trampoline. 
> Actually, the underlying probe looks up a mask in some BPF map and then
> walks the clauses, calling them conditionally based on the mask bits.
> 
> To construct the mask for a particular overlying probe, we walk the
> clauses.  Each clause is associated with a particular probe description.  We
> compare the overlying probe to the respective probe description to decide
> whether to set that particular bit.
> 
> So, a uprp has a list of clauses.  We want some way of getting from clauses
> to probe descriptions.  The linear array using that $n was one way.  I can
> move over to the hash table based on clause string name if you prefer.
> 
> (Incidentally, a uprp's clause list is set during dt_probe_add_clause(), at
> which point we do not know the motivating probe description.

Well, yes and no.  At the point of compilation, when there is not a single
task yet with the USDT probes we're wanting to probe, there won't be any
underlying probe(s) yet.  So all there will be are compiled clauses, and the
statements they belong to know the probedesc through the ecbdesc.  So, we
cannot even count on the BPF program already existing with all clauses linked
into it so all that needs to be done is setting the mask.

I am not clear how this fits in the design you came up with.

Even for the case where we already have the underlying probe (and thus the BPF
prgroam), I think it might perhaps be a cleaner design to not just store
clauses in the clause list, but perhaps store statements (which gives access
to the pdescs).  Then, when we encounter a wildcard match that relates us to
an underlying probe that exists, you can just go through the statement list
for that underlying probe and use a running index in that list to know which
bits to set in the bitmap.

> > because
> > retaining pdescs in and of itself does not seem meaningful.  They do not
> > contain a reference to the clause they were associated with when the code was
> > parsed and compiled) and it seems to me that you are doing a sort-of reverse
> > association, and are actually (indirectly) retaining clauses by keeping a
> > link of pdescs, one per clause.  And you identify the clause by its numeric
> > id as obtained from its identifier name (which is dynamically assigned during
> > compilation).  That seems fragile and rather non-intuitive.
> > 
> > If memory serves me right, the intent was to retain enablings.  Of course, in
> > the current implementation of DTrace, enablings are nothing more than probes
> > because the probes have BPF programs associated with them (unless they are
> > overlying probes, in which case the actual code that gets loaded is part of the
> > BPF program of the underlying probe (which will be in the enablings list).
> > 
> > When we are retaining enablings, it seems to me that it would make more sense
> > to keep a list of objects { pdesc, clause } where clause would be the
> > identifier name of the compiled clause (similar to a function name).  Using a
> > pointer to the actual clause instead is (I think) not going to work becauase
> > AFAIK the clause does not store a pointer to its own identiifier.
> > 
> > In the end, this will largely do the same thing as your code (here and when it
> > gets used later), but it removes the implied dependency on how identifier names
> > for clauses are generated.  That could change in the future, and doing so would
> > completely break this imlementation.
> > 
> > On Thu, Jun 27, 2024 at 01:38:10AM -0400, eugene.loh@oracle.com wrote:
> > > From: Eugene Loh <eugene.loh@oracle.com>
> > > 
> > > To support matching probes after dtrace has started, we retain
> > > probe descriptions.
> > > 
> > > Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
> > > ---
> > >   libdtrace/dt_impl.h    |  2 ++
> > >   libdtrace/dt_program.c | 37 ++++++++++++++++++++++++++++++++++++-
> > >   2 files changed, 38 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
> > > index 2132eda2..445cd602 100644
> > > --- a/libdtrace/dt_impl.h
> > > +++ b/libdtrace/dt_impl.h
> > > @@ -300,6 +300,8 @@ struct dtrace_hdl {
> > >   	dt_list_t dt_programs;	/* linked list of dtrace_prog_t's */
> > >   	dt_list_t dt_xlators;	/* linked list of dt_xlator_t's */
> > >   	dt_list_t dt_enablings;	/* list of (to be) enabled probes */
> > > +	dtrace_probedesc_t **dt_retained; /* array of retained pdescs */
> > > +	int dt_maxretained;	/* number of retained pdescs allocated */
> > >   	struct dt_xlator **dt_xlatormap; /* dt_xlator_t's indexed by dx_id */
> > >   	id_t dt_xlatorid;	/* next dt_xlator_t id to assign */
> > >   	dt_ident_t *dt_externs;	/* linked list of external symbol identifiers */
> > > diff --git a/libdtrace/dt_program.c b/libdtrace/dt_program.c
> > > index a4b052fc..7106c249 100644
> > > --- a/libdtrace/dt_program.c
> > > +++ b/libdtrace/dt_program.c
> > > @@ -163,7 +163,42 @@ dt_prog_stmt(dtrace_hdl_t *dtp, dtrace_prog_t *pgp, dtrace_stmtdesc_t *sdp,
> > >   {
> > >   	pi_state_t		st;
> > >   	dtrace_probedesc_t	*pdp = &sdp->dtsd_ecbdesc->dted_probe;
> > > -	int			rc;
> > > +	char			*sclause = sdp->dtsd_clause->di_name;
> > > +	int			rc, nclause;
> > > +
> > > +	/* Get the clause number from the clause name "dt_clause_"$n. */
> > > +	assert(strncmp(sclause, "dt_clause_", strlen("dt_clause_")) == 0);    // FIXME can probably drop this check (make sure the name starts "dt_clause_")
> > > +	nclause = atoi(sclause + strlen("dt_clause_"));
> > > +	assert(nclause < 65536);	// FIXME are we okay here with some arbitrary limit?
> > > +
> > > +	/* Grow the list of retained pdescs, if necessary. */
> > > +	if (nclause >= dtp->dt_maxretained) {
> > > +		int			newm;
> > > +		dtrace_probedesc_t	**newr;
> > > +
> > > +		/* Figure how big to make the array. */
> > > +		if (dtp->dt_maxretained > 0)
> > > +			newm = 2 * dtp->dt_maxretained;
> > > +		else
> > > +			newm = 8;
> > > +		while (newm <= nclause)
> > > +			newm *= 2;
> > > +
> > > +		/* Allocate the bigger array. */
> > > +		newr = dt_zalloc(dtp, newm * sizeof(dtrace_probedesc_t *));
> > > +		if (newr == NULL)
> > > +			return 0;
> > > +
> > > +		/* Copy data and reset the parameters. */
> > > +		memcpy(newr, dtp->dt_retained, dtp->dt_maxretained * sizeof(dtrace_probedesc_t *));
> > > +		dt_free(dtp, dtp->dt_retained);
> > > +		dtp->dt_retained = newr;
> > > +		dtp->dt_maxretained = newm;
> > > +	}
> > > +
> > > +	/* Add probe description to retained pdescs. */
> > > +	assert(dtp->dt_retained[nclause] == NULL);  // FIXME can probably drop this
> > > +	dtp->dt_retained[nclause] = pdp;        // FIXME I think just pointing to pdp is okay.  No need to cache my own copy.
> > >   	st.cnt = cnt;
> > >   	st.idp = sdp->dtsd_clause;
> > > -- 
> > > 2.18.4
> > > 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 19/38] Retain probe descriptions
  2024-07-19 19:52     ` Kris Van Hees
@ 2024-07-22 18:37       ` Eugene Loh
  0 siblings, 0 replies; 5+ messages in thread
From: Eugene Loh @ 2024-07-22 18:37 UTC (permalink / raw)
  To: dtrace, dtrace-devel

On 7/19/24 15:52, Kris Van Hees wrote:

> On Thu, Jul 18, 2024 at 11:34:05PM -0400, Eugene Loh wrote:
>> What do you think about replacing the array dt_retained[n] (as in,
>> "dt_clause_$n") with a hash table that uses a string (like
>> "dt_clause_$n") to look up a probe description?
>>
>> On 7/18/24 17:25, Kris Van Hees wrote:
>>
>>> I think this needs an actual description of what you are trying to do,
>> An underlying probe gets a list of clauses to call in its trampoline.
>> Actually, the underlying probe looks up a mask in some BPF map and then
>> walks the clauses, calling them conditionally based on the mask bits.
>>
>> To construct the mask for a particular overlying probe, we walk the
>> clauses.  Each clause is associated with a particular probe description.  We
>> compare the overlying probe to the respective probe description to decide
>> whether to set that particular bit.
>>
>> So, a uprp has a list of clauses.  We want some way of getting from clauses
>> to probe descriptions.  The linear array using that $n was one way.  I can
>> move over to the hash table based on clause string name if you prefer.
>>
>> (Incidentally, a uprp's clause list is set during dt_probe_add_clause(), at
>> which point we do not know the motivating probe description.

Just to be clear, all I mean here is that dt_probe_add_clause() is being 
fed a specific prp;  dt_probe_add_clause() is not being given the probe 
description that led to this prp.

> Well, yes and no.  At the point of compilation, when there is not a single
> task yet with the USDT probes we're wanting to probe, there won't be any
> underlying probe(s) yet.

I don't want to get side-tracked here, but when we compile and call 
dt_setcontext(), we actually call dt_pid_create_probes() and do create 
underlying probes (well, for processes that already exist). FWIW.  But 
that might not matter for our discussion.

> So all there will be are compiled clauses, and the
> statements they belong to know the probedesc through the ecbdesc.  So, we
> cannot even count on the BPF program already existing with all clauses linked
> into it so all that needs to be done is setting the mask.
>
> I am not clear how this fits in the design you came up with.

The mask is constructed after everything has been compiled.  I'm a 
little unclear what you're getting at.  For an underlying probe, there 
should be a list of provider descriptions.  Once you have that, you can 
both generate the BPF trampoline and (for a particular pid) construct a 
mask.  Which of those two tasks you do first or second doesn't matter.

Again, I'm unclear what you're saying, but (in case it's helpful) 
constructing the mask doesn't need the BPF trampoline;  it simply needs 
the list of provider descriptions that BPF trampoline cg will use.

> Even for the case where we already have the underlying probe (and thus the BPF
> prgroam), I think it might perhaps be a cleaner design to not just store
> clauses in the clause list, but perhaps store statements (which gives access
> to the pdescs).  Then, when we encounter a wildcard match that relates us to
> an underlying probe that exists, you can just go through the statement list
> for that underlying probe and use a running index in that list to know which
> bits to set in the bitmap.

We only need the provider descriptions.  I understood you earlier to 
suggest keeping entire probe descriptions.  Are you (now) actually 
suggesting keeping statements?  If so, how about a hash table that takes 
a clause string name (like "dt_clause_37") and looks up a statement, 
from which we get a probe description, from which we get a provider 
description?  Would that be in line with what you're suggesting?  I 
don't see the value in storing statements inside of probe descriptions, 
but I confess I'm having trouble thinking straight about any of this.

>>> because
>>> retaining pdescs in and of itself does not seem meaningful.  They do not
>>> contain a reference to the clause they were associated with when the code was
>>> parsed and compiled) and it seems to me that you are doing a sort-of reverse
>>> association, and are actually (indirectly) retaining clauses by keeping a
>>> link of pdescs, one per clause.  And you identify the clause by its numeric
>>> id as obtained from its identifier name (which is dynamically assigned during
>>> compilation).  That seems fragile and rather non-intuitive.
>>>
>>> If memory serves me right, the intent was to retain enablings.  Of course, in
>>> the current implementation of DTrace, enablings are nothing more than probes
>>> because the probes have BPF programs associated with them (unless they are
>>> overlying probes, in which case the actual code that gets loaded is part of the
>>> BPF program of the underlying probe (which will be in the enablings list).
>>>
>>> When we are retaining enablings, it seems to me that it would make more sense
>>> to keep a list of objects { pdesc, clause } where clause would be the
>>> identifier name of the compiled clause (similar to a function name).  Using a
>>> pointer to the actual clause instead is (I think) not going to work becauase
>>> AFAIK the clause does not store a pointer to its own identiifier.
>>>
>>> In the end, this will largely do the same thing as your code (here and when it
>>> gets used later), but it removes the implied dependency on how identifier names
>>> for clauses are generated.  That could change in the future, and doing so would
>>> completely break this imlementation.
>>>
>>> On Thu, Jun 27, 2024 at 01:38:10AM -0400, eugene.loh@oracle.com wrote:
>>>> From: Eugene Loh <eugene.loh@oracle.com>
>>>>
>>>> To support matching probes after dtrace has started, we retain
>>>> probe descriptions.
>>>>
>>>> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
>>>> ---
>>>>    libdtrace/dt_impl.h    |  2 ++
>>>>    libdtrace/dt_program.c | 37 ++++++++++++++++++++++++++++++++++++-
>>>>    2 files changed, 38 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h
>>>> index 2132eda2..445cd602 100644
>>>> --- a/libdtrace/dt_impl.h
>>>> +++ b/libdtrace/dt_impl.h
>>>> @@ -300,6 +300,8 @@ struct dtrace_hdl {
>>>>    	dt_list_t dt_programs;	/* linked list of dtrace_prog_t's */
>>>>    	dt_list_t dt_xlators;	/* linked list of dt_xlator_t's */
>>>>    	dt_list_t dt_enablings;	/* list of (to be) enabled probes */
>>>> +	dtrace_probedesc_t **dt_retained; /* array of retained pdescs */
>>>> +	int dt_maxretained;	/* number of retained pdescs allocated */
>>>>    	struct dt_xlator **dt_xlatormap; /* dt_xlator_t's indexed by dx_id */
>>>>    	id_t dt_xlatorid;	/* next dt_xlator_t id to assign */
>>>>    	dt_ident_t *dt_externs;	/* linked list of external symbol identifiers */
>>>> diff --git a/libdtrace/dt_program.c b/libdtrace/dt_program.c
>>>> index a4b052fc..7106c249 100644
>>>> --- a/libdtrace/dt_program.c
>>>> +++ b/libdtrace/dt_program.c
>>>> @@ -163,7 +163,42 @@ dt_prog_stmt(dtrace_hdl_t *dtp, dtrace_prog_t *pgp, dtrace_stmtdesc_t *sdp,
>>>>    {
>>>>    	pi_state_t		st;
>>>>    	dtrace_probedesc_t	*pdp = &sdp->dtsd_ecbdesc->dted_probe;
>>>> -	int			rc;
>>>> +	char			*sclause = sdp->dtsd_clause->di_name;
>>>> +	int			rc, nclause;
>>>> +
>>>> +	/* Get the clause number from the clause name "dt_clause_"$n. */
>>>> +	assert(strncmp(sclause, "dt_clause_", strlen("dt_clause_")) == 0);    // FIXME can probably drop this check (make sure the name starts "dt_clause_")
>>>> +	nclause = atoi(sclause + strlen("dt_clause_"));
>>>> +	assert(nclause < 65536);	// FIXME are we okay here with some arbitrary limit?
>>>> +
>>>> +	/* Grow the list of retained pdescs, if necessary. */
>>>> +	if (nclause >= dtp->dt_maxretained) {
>>>> +		int			newm;
>>>> +		dtrace_probedesc_t	**newr;
>>>> +
>>>> +		/* Figure how big to make the array. */
>>>> +		if (dtp->dt_maxretained > 0)
>>>> +			newm = 2 * dtp->dt_maxretained;
>>>> +		else
>>>> +			newm = 8;
>>>> +		while (newm <= nclause)
>>>> +			newm *= 2;
>>>> +
>>>> +		/* Allocate the bigger array. */
>>>> +		newr = dt_zalloc(dtp, newm * sizeof(dtrace_probedesc_t *));
>>>> +		if (newr == NULL)
>>>> +			return 0;
>>>> +
>>>> +		/* Copy data and reset the parameters. */
>>>> +		memcpy(newr, dtp->dt_retained, dtp->dt_maxretained * sizeof(dtrace_probedesc_t *));
>>>> +		dt_free(dtp, dtp->dt_retained);
>>>> +		dtp->dt_retained = newr;
>>>> +		dtp->dt_maxretained = newm;
>>>> +	}
>>>> +
>>>> +	/* Add probe description to retained pdescs. */
>>>> +	assert(dtp->dt_retained[nclause] == NULL);  // FIXME can probably drop this
>>>> +	dtp->dt_retained[nclause] = pdp;        // FIXME I think just pointing to pdp is okay.  No need to cache my own copy.
>>>>    	st.cnt = cnt;
>>>>    	st.idp = sdp->dtsd_clause;
>>>> -- 
>>>> 2.18.4
>>>>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-07-22 18:38 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-27  5:38 [PATCH 19/38] Retain probe descriptions eugene.loh
2024-07-18 21:25 ` Kris Van Hees
2024-07-19  3:34   ` Eugene Loh
2024-07-19 19:52     ` Kris Van Hees
2024-07-22 18:37       ` Eugene Loh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox