Re: [PATCH 06/15] tools subcmd: Robust fallback and existence checks for process reaping

From: sashiko-bot@kernel.org
To: "Ian Rogers" <irogers@google.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [PATCH 06/15] tools subcmd: Robust fallback and existence checks for process reaping
Date: Mon, 01 Jun 2026 06:28:55 +0000
Message-ID: <20260601062856.092B01F00893@smtp.kernel.org>
In-Reply-To: <20260601061401.1541457-7-irogers@google.com>

Thank you for your contribution! Sashiko AI review found 4 potential issues to consider:

New issues:
- [High] Early returns on error paths do not set `cmd->finished = 1`, so a later `finish_command()` can still call `waitpid()` with an invalid PID and reap arbitrary child processes.
- [Medium] The implementation contradicts the commit message: it reaps processes with `waitpid()` instead of preserving zombies with `kill(pid, 0)`, and it does not fix the PID reuse problem.

Pre-existing issues:
- [High] `tpebs_stop()` clears `tpebs_cmd.pid` to 0 before calling `finish_command()`, causing `waitpid(0)` to accidentally reap unrelated child processes.
- [High] `tpebs_send_record_cmd()` drops `tpebs_mtx` while polling, allowing concurrent entry into shutdown sequences.
--

commit 69e7ae6fd81118d00c12a9bae1bdf8a2eb52f3fd
Author: Ian Rogers <irogers@google.com>
tools subcmd: Robust fallback and existence checks for process reaping

This commit modifies the check_if_command_finished() function to handle
invalid PIDs gracefully and introduces a fallback mechanism when
/proc/pid/status is inaccessible.

> Transition the procfs status fopen() fallback from active waitpid() calls
> to robust kill(pid, 0) existence and ESRCH detection logic. This allows
> zombie processes to remain until explicitly reaped by finish_command(),
> eradicating the risk of infinite loops induced by kernel PID reuse across
> unrelated system daemons.

[Severity: Medium]
Does the implementation contradict the commit message?

The commit message states we transition to kill(pid, 0) existence and ESRCH
detection logic to allow zombie processes to remain.

However, looking at check_if_command_finished(), waitpid() with WNOHANG is
used instead, which actively reaps the process.

Additionally, the fopen("/proc/<pid>/status") check is retained as the primary
path. If an unrelated daemon reuses the PID, fopen() will succeed, the function
will read that daemon's running state, and it will return 0 indefinitely. Does
this mean the infinite loop the commit message claims to eradicate is still
possible?

> diff --git a/tools/lib/subcmd/run-command.c b/tools/lib/subcmd/run-command.c
> --- a/tools/lib/subcmd/run-command.c
> +++ b/tools/lib/subcmd/run-command.c
> @@ -233,7 +233,14 @@ int check_if_command_finished(struct child_process *cmd)
>  	char filename[6 + MAX_STRLEN_TYPE(typeof(cmd->pid)) + 7 + 1];
>  	char status_line[256];
>  	FILE *status_file;
> +#endif
> +
> +	if (cmd->finished)
> +		return 1;
> +	if (cmd->pid <= 0)
> +		return 1;

[Severity: High]
Does returning early here without setting cmd->finished = 1 cause issues later?

If check_if_command_finished() returns 1 but cmd->finished is still 0, a
subsequent call to finish_command() will eventually call wait_or_whine(),
which executes a blocking waitpid(cmd->pid, ...).

If cmd->pid is -1 due to a fork failure, executing waitpid(-1, ...) instructs
the kernel to blindly wait for any child process, potentially reaping parallel
tests and causing state corruption.
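
One possible shape for the fix, as a sketch only: record the verdict on every
early return so the later finish call never reaches a blocking wait. The names
struct demo_child, demo_check_finished(), demo_finish() and demo_wait_calls
are hypothetical stand-ins, and storing -1 as the result for an invalid PID is
an assumption rather than something the patch specifies:

```c
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical stand-in for the struct child_process fields used here. */
struct demo_child {
	pid_t pid;
	int finished;
	int finish_result;
};

/* Counts blocking waits so a caller can see whether one was attempted. */
static int demo_wait_calls;

static int demo_check_finished(struct demo_child *cmd)
{
	if (cmd->finished)
		return 1;
	if (cmd->pid <= 0) {
		/* Make the verdict sticky so demo_finish() never waits. */
		cmd->finished = 1;
		cmd->finish_result = -1;	/* assumed error value */
		return 1;
	}
	return 0;	/* real status probing is omitted from this sketch */
}

static int demo_finish(struct demo_child *cmd)
{
	int status;

	if (cmd->finished)
		return cmd->finish_result;

	demo_wait_calls++;
	if (waitpid(cmd->pid, &status, 0) != cmd->pid)
		cmd->finish_result = -1;
	else
		cmd->finish_result = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
	cmd->finished = 1;
	return cmd->finish_result;
}
```

With pid set to -1, demo_check_finished() returns 1 and demo_finish() returns
the stored result without ever calling waitpid().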

> +
> +#ifdef __linux__
>  	/*
>  	 * Check by reading /proc/<pid>/status as calling waitpid causes
>  	 * stdout/stderr to be closed and data lost.
> @@ -241,8 +248,48 @@ int check_if_command_finished(struct child_process *cmd)
>  	sprintf(filename, "/proc/%u/status", cmd->pid);
>  	status_file = fopen(filename, "r");
>  	if (status_file == NULL) {
> -		/* Open failed assume finish_command was called. */
> -		return true;
> +		int status;
> +		pid_t waiting;
> +
> +		/*
> +		 * fopen() can fail with ENOENT if the process has been reaped.
> +		 * It can also fail with EMFILE/ENFILE if RLIMIT_NOFILE is reached.
> +		 * In those cases, use waitpid(..., WNOHANG) to robustly check
> +		 * and reap the process if it has exited.
> +		 */
> +		if (errno == ENOENT)
> +			return 1;

[Severity: High]
Are we also missing cmd->finished = 1 here?

> +
> +		waiting = waitpid(cmd->pid, &status, WNOHANG);

[Severity: Medium]
This is the waitpid() call mentioned earlier that seems to contradict the
commit message's claim of using kill().

> +		if (waiting == cmd->pid) {
> +			int result;
> +			int code;
> +
> +			cmd->finished = 1;

[ ... ]

> +			cmd->finish_result = result;
> +			return 1;
> +		}
> +		if (waiting < 0 && (errno == ECHILD || errno == ESRCH))
> +			return 1;

[Severity: Medium]
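Can waitpid() actually fail with errno set to ESRCH for a valid child PID?
POSIX documents only ECHILD, EINTR and EINVAL for waitpid(), so this half of
the condition may be unreachable.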
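
A small sketch of the case this branch seems to target, waiting on a child
that has already been reaped. The helper errno_of_second_wait() is a
hypothetical name, and the sketch assumes a Linux/POSIX system:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the errno seen when waiting a second time
 * for a child that was already reaped, or -1 if the setup itself failed.
 */
static int errno_of_second_wait(void)
{
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(0);

	/* The first wait reaps the child. */
	if (waitpid(pid, NULL, 0) != pid)
		return -1;

	/* The pid no longer names a child of this process. */
	errno = 0;
	if (waitpid(pid, NULL, WNOHANG) != -1)
		return -1;
	return errno;
}
```

In this scenario the second waitpid() fails with ECHILD, which the existing
condition already covers.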

[Severity: High]
Are we missing cmd->finished = 1 here as well, similar to the other early
returns?

While reviewing these changes, I noticed a couple of related issues in the
subsystem:

tools/perf/util/intel-tpebs.c:tpebs_stop() {
        ...
        tpebs_cmd.pid = 0;
        mutex_unlock(tpebs_mtx_get());
        pthread_join(tpebs_reader_thread, NULL);
        mutex_lock(tpebs_mtx_get());
        close(control_fd[0]);
        ...
        ret = finish_command(&tpebs_cmd);
        ...
}

[Severity: High]
This isn't a bug introduced by this patch, but does setting tpebs_cmd.pid to 0
before calling finish_command() cause waitpid(0) to execute?

Passing PID 0 to waitpid() instructs the kernel to wait for any child process
whose process group ID equals the caller's. Could this intercept and reap
unrelated background child processes spawned by the perf tool, permanently
losing their exit statuses?
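
The effect can be reproduced in isolation. The sketch below is illustrative
only: struct steal_result and demo_waitpid_zero() are hypothetical names, and
it assumes the calling process has no other children in its process group:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo type; the names are illustrative only. */
struct steal_result {
	pid_t bystander;	/* pid of the unrelated child */
	pid_t reaped;		/* pid returned by waitpid(0, ...) */
	int exit_code;		/* exit code collected by waitpid(0, ...) */
	int owner_errno;	/* errno when the rightful owner waits later */
};

/* Returns 0 on success, -1 if any step behaved unexpectedly. */
static int demo_waitpid_zero(struct steal_result *res)
{
	int status;

	res->bystander = fork();
	if (res->bystander < 0)
		return -1;
	if (res->bystander == 0)
		_exit(7);

	/* pid 0 means: any child in the caller's process group. */
	res->reaped = waitpid(0, &status, 0);
	if (res->reaped < 0 || !WIFEXITED(status))
		return -1;
	res->exit_code = WEXITSTATUS(status);

	/* The code that spawned the bystander can no longer collect it. */
	errno = 0;
	if (waitpid(res->bystander, &status, 0) != -1)
		return -1;
	res->owner_errno = errno;
	return 0;
}
```

Here waitpid(0, ...) collects the bystander's exit code, and the later wait
by the code that actually owns that child fails with ECHILD.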

tools/perf/util/intel-tpebs.c:tpebs_send_record_cmd() {
        ...
        mutex_unlock(tpebs_mtx_get());

        /* Send perf record command.*/
        len = strlen(msg);
        ret = write(control_fd[1], msg, len);
        ...
}

[Severity: High]
This isn't a bug introduced by this patch, but does dropping tpebs_mtx while
polling in tpebs_send_record_cmd() break mutual exclusion?

If tpebs_stop() is called concurrently, it invokes
tpebs_send_record_cmd(EVLIST_CTL_CMD_STOP_TAG), dropping the lock during the
shutdown sequence. Does this allow another thread to enter tpebs_stop()
simultaneously, leading to undefined behavior like double pthread_join(),
double close(), or double calls to finish_command()?
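
One common way to keep a shutdown single-entry even though the lock is dropped
in the middle is to claim it with a flag while the lock is still held. This is
a sketch of that general pattern, not a proposed patch: demo_mtx,
demo_running, demo_stopping, demo_teardowns and demo_stop() are hypothetical
names that do not exist in intel-tpebs.c:

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical state; none of these names exist in intel-tpebs.c. */
static pthread_mutex_t demo_mtx = PTHREAD_MUTEX_INITIALIZER;
static bool demo_running = true;
static bool demo_stopping;
static int demo_teardowns;

/* Returns 1 if this caller performed the teardown, 0 if it was skipped. */
static int demo_stop(void)
{
	pthread_mutex_lock(&demo_mtx);
	if (!demo_running || demo_stopping) {
		/* Someone else already owns, or has finished, the shutdown. */
		pthread_mutex_unlock(&demo_mtx);
		return 0;
	}
	demo_stopping = true;	/* claim the shutdown before unlocking */
	pthread_mutex_unlock(&demo_mtx);

	/* Lock dropped: stand-in for the blocking write/poll/join steps. */

	pthread_mutex_lock(&demo_mtx);
	demo_teardowns++;	/* stand-in for close() and finish_command() */
	demo_running = false;
	demo_stopping = false;
	pthread_mutex_unlock(&demo_mtx);
	return 1;
}
```

A second caller returns immediately instead of repeating the teardown. If
callers must also wait until the teardown has completed, a condition variable
would be needed on top of this.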

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260601061401.1541457-1-irogers@google.com?part=6
