From: Kris Van Hees <kris.van.hees@oracle.com>
To: eugene.loh@oracle.com
Cc: dtrace@lists.linux.dev, dtrace-devel@oss.oracle.com
Subject: Re: [PATCH] Improve dt_pid_create_usdt_probes() error handling
Date: Mon, 6 Jan 2025 14:13:43 -0500 [thread overview]
Message-ID: <Z3wrZ7dmS/vaVxF3@oracle.com> (raw)
In-Reply-To: <20241202042741.12328-1-eugene.loh@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
... although I think that the problem is (in part) on the design side of
USDT handling, where we do a bit too much work at this point of probe
discovery, when we do not even know whether we will be using those probes.
But that is subject matter for a more extensive reworking of the code.
This patch seems sufficient to deal with the existing problem of DTrace
being affected by unruly processes.
On Sun, Dec 01, 2024 at 11:27:41PM -0500, eugene.loh@oracle.com wrote:
> From: Eugene Loh <eugene.loh@oracle.com>
>
> In one run of the test suite, roughly 90 consecutive tests failed.
> The exact circumstances are difficult to reconstruct, but basically
> they complained with messages like:
>
> dtrace: failed to compile script ...: failed to grab process 23132
>
> The same pid was always indicated. Then, the errors stopped.
>
> Apparently, the problem was trying "dtrace -s foo.d" where the script
> has a BEGIN (or END) probe. Then we get to
>
> cmd/dtrace.c compile_file
> -> libdtrace/dt_cc.c dt_program_compile
> -> libdtrace/dt_cc.c dt_compile
> -> libdtrace/dt_cc.c dt_compile_one_clause
> -> libdtrace/dt_cc.c dt_setcontext
> -> libdtrace/dt_pid.c dt_pid_create_usdt_probes
>
> We look for processes that might have USDT probes. Since the provider
> description is blank, we check every process in .../run/dtrace. It
> turned out, that pid 23132 could not be locked. This sent an error
> back up the call stack.
>
> We do not know why process 23132 could not be locked, but having dtrace
> fail under these circumstances (using BEGIN or END probe) seems severe.
>
> Change dt_pid_create_usdt_probes() error handling. Even if there is a
> problem with some USDT processes, report success anyhow if there was a
> glob pid specification. If, on the other hand, a pid was specifically
> requested, then a problem with that pid results in an error.
>
> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
> ---
> libdtrace/dt_pid.c | 35 +++++++++++++++++++++++++++++++++--
> 1 file changed, 33 insertions(+), 2 deletions(-)
>
> diff --git a/libdtrace/dt_pid.c b/libdtrace/dt_pid.c
> index e0a26d5aa..fd94a0706 100644
> --- a/libdtrace/dt_pid.c
> +++ b/libdtrace/dt_pid.c
> @@ -1252,7 +1252,8 @@ dt_pid_create_usdt_probes(dtrace_probedesc_t *pdp, dtrace_hdl_t *dtp, dt_pcb_t *
> DTRACE_PROC_SHORTLIVED) < 0) {
> dt_pid_error(dtp, pcb, NULL, D_PROC_GRAB,
> "failed to grab process %d", (int)pid);
> - return -1;
> + err = 1; // FIXME but do we want to set the error if we end up return 0?
> + continue;
> }
> dpr = dt_proc_lookup(dtp, pid);
> assert(dpr != NULL);
> @@ -1272,7 +1273,37 @@ dt_pid_create_usdt_probes(dtrace_probedesc_t *pdp, dtrace_hdl_t *dtp, dt_pcb_t *
> free(globpat);
> globfree(&globbuf);
>
> - return err ? -1 : 0;
> + /* If no errors, report success. */
> + if (err == 0)
> + return 0;
> +
> + /* If provider description was blank, report success. */
> + if (pdp->prv[0] == '\0')
> + return 0;
> +
> + /* Look to see if the provider description had a pid glob. */
> + for (i = strlen(pdp->prv) - 1; i >= 0; i--) {
> + /*
> + * If we hit a '*' before a nondigit, we have a pid glob.
> + * So, even though err==0, we declare success.
> + */
> + if (pdp->prv[i] == '*')
> + return 0;
> +
> + /*
> + * If we hit a nondigit before a '*', we do not have a pid glob.
> + * Since a pid was specified explicitly, err==1 means an error.
> + */
> + if (!isdigit(pdp->prv[i]))
> + return -1;
> + }
> +
> + /*
> + * If the provider description was exclusively digits,
> + * it was not a legitimate USDT provider description.
> + * So it makes perfect sense not to return any probes.
> + */
> + return 0;
> }
>
> int
> --
> 2.43.5
>
prev parent reply other threads:[~2025-01-06 19:13 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-02 4:27 [PATCH] Improve dt_pid_create_usdt_probes() error handling eugene.loh
2025-01-06 19:13 ` Kris Van Hees [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z3wrZ7dmS/vaVxF3@oracle.com \
--to=kris.van.hees@oracle.com \
--cc=dtrace-devel@oss.oracle.com \
--cc=dtrace@lists.linux.dev \
--cc=eugene.loh@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.