* [PATCH] Improve dt_pid_create_usdt_probes() error handling
@ 2024-12-02 4:27 eugene.loh
2025-01-06 19:13 ` Kris Van Hees
0 siblings, 1 reply; 2+ messages in thread
From: eugene.loh @ 2024-12-02 4:27 UTC (permalink / raw)
To: dtrace, dtrace-devel
From: Eugene Loh <eugene.loh@oracle.com>
In one run of the test suite, roughly 90 consecutive tests failed.
The exact circumstances are difficult to reconstruct, but basically
they complained with messages like:
dtrace: failed to compile script ...: failed to grab process 23132
The same pid was always indicated. Then, the errors stopped.
Apparently, the problem was trying "dtrace -s foo.d" where the script
has a BEGIN (or END) probe. Then we get to
cmd/dtrace.c compile_file
-> libdtrace/dt_cc.c dt_program_compile
-> libdtrace/dt_cc.c dt_compile
-> libdtrace/dt_cc.c dt_compile_one_clause
-> libdtrace/dt_cc.c dt_setcontext
-> libdtrace/dt_pid.c dt_pid_create_usdt_probes
We look for processes that might have USDT probes. Since the provider
description is blank, we check every process in .../run/dtrace. It
turned out, that pid 23132 could not be locked. This sent an error
back up the call stack.
We do not know why process 23132 could not be locked, but having dtrace
fail under these circumstances (using BEGIN or END probe) seems severe.
Change dt_pid_create_usdt_probes() error handling. Even if there is a
problem with some USDT processes, report success anyhow if there was a
glob pid specification. If, on the other hand, a pid was specifically
requested, then a problem with that pid results in an error.
Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
---
libdtrace/dt_pid.c | 35 +++++++++++++++++++++++++++++++++--
1 file changed, 33 insertions(+), 2 deletions(-)
diff --git a/libdtrace/dt_pid.c b/libdtrace/dt_pid.c
index e0a26d5aa..fd94a0706 100644
--- a/libdtrace/dt_pid.c
+++ b/libdtrace/dt_pid.c
@@ -1252,7 +1252,8 @@ dt_pid_create_usdt_probes(dtrace_probedesc_t *pdp, dtrace_hdl_t *dtp, dt_pcb_t *
DTRACE_PROC_SHORTLIVED) < 0) {
dt_pid_error(dtp, pcb, NULL, D_PROC_GRAB,
"failed to grab process %d", (int)pid);
- return -1;
+ err = 1; // FIXME but do we want to set the error if we end up return 0?
+ continue;
}
dpr = dt_proc_lookup(dtp, pid);
assert(dpr != NULL);
@@ -1272,7 +1273,37 @@ dt_pid_create_usdt_probes(dtrace_probedesc_t *pdp, dtrace_hdl_t *dtp, dt_pcb_t *
free(globpat);
globfree(&globbuf);
- return err ? -1 : 0;
+ /* If no errors, report success. */
+ if (err == 0)
+ return 0;
+
+ /* If provider description was blank, report success. */
+ if (pdp->prv[0] == '\0')
+ return 0;
+
+ /* Look to see if the provider description had a pid glob. */
+ for (i = strlen(pdp->prv) - 1; i >= 0; i--) {
+ /*
+ * If we hit a '*' before a nondigit, we have a pid glob.
+ * So, even though err==0, we declare success.
+ */
+ if (pdp->prv[i] == '*')
+ return 0;
+
+ /*
+ * If we hit a nondigit before a '*', we do not have a pid glob.
+ * Since a pid was specified explicitly, err==1 means an error.
+ */
+ if (!isdigit(pdp->prv[i]))
+ return -1;
+ }
+
+ /*
+ * If the provider description was exclusively digits,
+ * it was not a legitimate USDT provider description.
+ * So it makes perfect sense not to return any probes.
+ */
+ return 0;
}
int
--
2.43.5
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH] Improve dt_pid_create_usdt_probes() error handling
2024-12-02 4:27 [PATCH] Improve dt_pid_create_usdt_probes() error handling eugene.loh
@ 2025-01-06 19:13 ` Kris Van Hees
0 siblings, 0 replies; 2+ messages in thread
From: Kris Van Hees @ 2025-01-06 19:13 UTC (permalink / raw)
To: eugene.loh; +Cc: dtrace, dtrace-devel
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
... although I think that the problem is (in part) on the design side of
USDT handling, where we do a bit too much work at this point of probe
discovery, when we do not even know whether we will be using those probes.
But that is subject matter for a more extensive reworking of the code.
This patch seems sufficient to deal with the existing problem of DTrace
being affected by unruly processes.
On Sun, Dec 01, 2024 at 11:27:41PM -0500, eugene.loh@oracle.com wrote:
> From: Eugene Loh <eugene.loh@oracle.com>
>
> In one run of the test suite, roughly 90 consecutive tests failed.
> The exact circumstances are difficult to reconstruct, but basically
> they complained with messages like:
>
> dtrace: failed to compile script ...: failed to grab process 23132
>
> The same pid was always indicated. Then, the errors stopped.
>
> Apparently, the problem was trying "dtrace -s foo.d" where the script
> has a BEGIN (or END) probe. Then we get to
>
> cmd/dtrace.c compile_file
> -> libdtrace/dt_cc.c dt_program_compile
> -> libdtrace/dt_cc.c dt_compile
> -> libdtrace/dt_cc.c dt_compile_one_clause
> -> libdtrace/dt_cc.c dt_setcontext
> -> libdtrace/dt_pid.c dt_pid_create_usdt_probes
>
> We look for processes that might have USDT probes. Since the provider
> description is blank, we check every process in .../run/dtrace. It
> turned out, that pid 23132 could not be locked. This sent an error
> back up the call stack.
>
> We do not know why process 23132 could not be locked, but having dtrace
> fail under these circumstances (using BEGIN or END probe) seems severe.
>
> Change dt_pid_create_usdt_probes() error handling. Even if there is a
> problem with some USDT processes, report success anyhow if there was a
> glob pid specification. If, on the other hand, a pid was specifically
> requested, then a problem with that pid results in an error.
>
> Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
> ---
> libdtrace/dt_pid.c | 35 +++++++++++++++++++++++++++++++++--
> 1 file changed, 33 insertions(+), 2 deletions(-)
>
> diff --git a/libdtrace/dt_pid.c b/libdtrace/dt_pid.c
> index e0a26d5aa..fd94a0706 100644
> --- a/libdtrace/dt_pid.c
> +++ b/libdtrace/dt_pid.c
> @@ -1252,7 +1252,8 @@ dt_pid_create_usdt_probes(dtrace_probedesc_t *pdp, dtrace_hdl_t *dtp, dt_pcb_t *
> DTRACE_PROC_SHORTLIVED) < 0) {
> dt_pid_error(dtp, pcb, NULL, D_PROC_GRAB,
> "failed to grab process %d", (int)pid);
> - return -1;
> + err = 1; // FIXME but do we want to set the error if we end up return 0?
> + continue;
> }
> dpr = dt_proc_lookup(dtp, pid);
> assert(dpr != NULL);
> @@ -1272,7 +1273,37 @@ dt_pid_create_usdt_probes(dtrace_probedesc_t *pdp, dtrace_hdl_t *dtp, dt_pcb_t *
> free(globpat);
> globfree(&globbuf);
>
> - return err ? -1 : 0;
> + /* If no errors, report success. */
> + if (err == 0)
> + return 0;
> +
> + /* If provider description was blank, report success. */
> + if (pdp->prv[0] == '\0')
> + return 0;
> +
> + /* Look to see if the provider description had a pid glob. */
> + for (i = strlen(pdp->prv) - 1; i >= 0; i--) {
> + /*
> + * If we hit a '*' before a nondigit, we have a pid glob.
> + * So, even though err==0, we declare success.
> + */
> + if (pdp->prv[i] == '*')
> + return 0;
> +
> + /*
> + * If we hit a nondigit before a '*', we do not have a pid glob.
> + * Since a pid was specified explicitly, err==1 means an error.
> + */
> + if (!isdigit(pdp->prv[i]))
> + return -1;
> + }
> +
> + /*
> + * If the provider description was exclusively digits,
> + * it was not a legitimate USDT provider description.
> + * So it makes perfect sense not to return any probes.
> + */
> + return 0;
> }
>
> int
> --
> 2.43.5
>
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2025-01-06 19:13 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-02 4:27 [PATCH] Improve dt_pid_create_usdt_probes() error handling eugene.loh
2025-01-06 19:13 ` Kris Van Hees
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.