Linux DTrace development list
 help / color / mirror / Atom feed
From: Kris Van Hees <kris.van.hees@oracle.com>
To: eugene.loh@oracle.com
Cc: dtrace@lists.linux.dev, dtrace-devel@oss.oracle.com
Subject: Re: [PATCH] Improve dt_pid_create_usdt_probes() error handling
Date: Mon, 6 Jan 2025 14:13:43 -0500	[thread overview]
Message-ID: <Z3wrZ7dmS/vaVxF3@oracle.com> (raw)
In-Reply-To: <20241202042741.12328-1-eugene.loh@oracle.com>

Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>

... although I think that the problem is (in part) on the design side of
USDT handling, where we do a bit too much work at this point of probe
discovery, when we do not even know whether we will be using those probes.

But that is subject matter for a more extensive reworking of the code.

This patch seems sufficient to deal with the existing problem of DTrace
being affected by unruly processes.

On Sun, Dec 01, 2024 at 11:27:41PM -0500, eugene.loh@oracle.com wrote:
> From: Eugene Loh <eugene.loh@oracle.com>
> 
> In one run of the test suite, roughly 90 consecutive tests failed.
> The exact circumstances are difficult to reconstruct, but basically
> they complained with messages like:
> 
>     dtrace: failed to compile script ...: failed to grab process 23132
> 
> The same pid was always indicated.  Then, the errors stopped.
> 
> Apparently, the problem was trying "dtrace -s foo.d" where the script
> has a BEGIN (or END) probe.  Then we get to
> 
>     cmd/dtrace.c compile_file
>     ->  libdtrace/dt_cc.c dt_program_compile
>     ->  libdtrace/dt_cc.c dt_compile
>     ->  libdtrace/dt_cc.c dt_compile_one_clause
>     ->  libdtrace/dt_cc.c dt_setcontext
>     ->  libdtrace/dt_pid.c dt_pid_create_usdt_probes
> 
> We look for processes that might have USDT probes.  Since the provider
> description is blank, we check every process in .../run/dtrace.  It
> turned out, that pid 23132 could not be locked.  This sent an error
> back up the call stack.
> 
> We do not know why process 23132 could not be locked, but having dtrace
> fail under these circumstances (using BEGIN or END probe) seems severe.
> 
> Change dt_pid_create_usdt_probes() error handling.  Even if there is a
> problem with some USDT processes, report success anyhow if there was a
> glob pid specification.  If, on the other hand, a pid was specifically
> requested, then a problem with that pid results in an error.
> 
> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
> ---
>  libdtrace/dt_pid.c | 35 +++++++++++++++++++++++++++++++++--
>  1 file changed, 33 insertions(+), 2 deletions(-)
> 
> diff --git a/libdtrace/dt_pid.c b/libdtrace/dt_pid.c
> index e0a26d5aa..fd94a0706 100644
> --- a/libdtrace/dt_pid.c
> +++ b/libdtrace/dt_pid.c
> @@ -1252,7 +1252,8 @@ dt_pid_create_usdt_probes(dtrace_probedesc_t *pdp, dtrace_hdl_t *dtp, dt_pcb_t *
>  				      DTRACE_PROC_SHORTLIVED) < 0) {
>  			dt_pid_error(dtp, pcb, NULL, D_PROC_GRAB,
>  			    "failed to grab process %d", (int)pid);
> -			return -1;
> +			err = 1;  // FIXME but do we want to set the error if we end up return 0?
> +			continue;
>  		}
>  		dpr = dt_proc_lookup(dtp, pid);
>  		assert(dpr != NULL);
> @@ -1272,7 +1273,37 @@ dt_pid_create_usdt_probes(dtrace_probedesc_t *pdp, dtrace_hdl_t *dtp, dt_pcb_t *
>  	free(globpat);
>  	globfree(&globbuf);
>  
> -	return err ? -1 : 0;
> +	/* If no errors, report success. */
> +	if (err == 0)
> +		return 0;
> +
> +	/* If provider description was blank, report success. */
> +	if (pdp->prv[0] == '\0')
> +		return 0;
> +
> +	/* Look to see if the provider description had a pid glob. */
> +	for (i = strlen(pdp->prv) - 1; i >= 0; i--) {
> +		/*
> +		 * If we hit a '*' before a nondigit, we have a pid glob.
> +		 * So, even though err==0, we declare success.
> +		 */
> +		if (pdp->prv[i] == '*')
> +			return 0;
> +
> +		/*
> +		 * If we hit a nondigit before a '*', we do not have a pid glob.
> +		 * Since a pid was specified explicitly, err==1 means an error.
> +		 */
> +		if (!isdigit(pdp->prv[i]))
> +			return -1;
> +	}
> +
> +	/*
> +	 * If the provider description was exclusively digits,
> +	 * it was not a legitimate USDT provider description.
> +	 * So it makes perfect sense not to return any probes.
> +	 */
> +	return 0;
>  }
>  
>  int
> -- 
> 2.43.5
> 

      reply	other threads:[~2025-01-06 19:13 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-02  4:27 [PATCH] Improve dt_pid_create_usdt_probes() error handling eugene.loh
2025-01-06 19:13 ` Kris Van Hees [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z3wrZ7dmS/vaVxF3@oracle.com \
    --to=kris.van.hees@oracle.com \
    --cc=dtrace-devel@oss.oracle.com \
    --cc=dtrace@lists.linux.dev \
    --cc=eugene.loh@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox