Re: [PATCH v1 06/14] perf test: Refactor parallel poll loop to drain all pipes simultaneously

From: sashiko-bot@kernel.org
To: "Ian Rogers" <irogers@google.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [PATCH v1 06/14] perf test: Refactor parallel poll loop to drain all pipes simultaneously
Date: Thu, 14 May 2026 14:27:41 +0000
Message-ID: <20260514142741.EC2ECC2BCC9@smtp.kernel.org>
In-Reply-To: <20260513230450.529380-7-irogers@google.com>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [Medium] The parallel test harness enters a 100% CPU busy loop if a child process closes its stderr pipe (reaching EOF) but continues executing.
- [Medium] If the test harness hits the open file limit (RLIMIT_NOFILE), it can loop forever, unable to reap finished children.
- [Medium] The error and signal cleanup path in `__cmd_test` relies on the legacy `finish_test()` function, causing memory leaks and false failure reporting.
- [Low] The patch introduces a malformed function declaration that breaks compilation (Syntax Error).
--

commit 5bdeb841dfa855250491399c245c42f015b0d82b
Author: Ian Rogers <irogers@google.com>

perf test: Refactor parallel poll loop to drain all pipes simultaneously

This commit refactors the perf test harness to poll and drain output pipes
from all running children simultaneously into per-child buffers. This prevents
child processes from blocking indefinitely when their pipe buffers fill up.

> diff --git a/tools/lib/subcmd/run-command.c b/tools/lib/subcmd/run-command.c
> index b7510f83209a0..60e7df3673169 100644
> --- a/tools/lib/subcmd/run-command.c
> +++ b/tools/lib/subcmd/run-command.c
> @@ -241,8 +241,8 @@ int check_if_command_finished(struct child_process *cmd)
>  	sprintf(filename, "/proc/%u/status", cmd->pid);
>  	status_file = fopen(filename, "r");
>  	if (status_file == NULL) {
> -		/* Open failed assume finish_command was called. */
> -		return true;
> +		/* Open failed. Only assume finished if process no longer exists. */
> +		return errno == ENOENT ? 1 : 0;
>  	}

If finish_tests_parallel() launches many tests concurrently and hits the open
file limit (RLIMIT_NOFILE), fopen() will fail with EMFILE.

Since this now returns 0, the test is assumed to still be running. Could this
lead to an infinite loop where tests are never marked as finished, and no file
descriptors are ever freed to break the deadlock?
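
One way to avoid depending on a free file descriptor here might be to ask the
kernel about the child directly instead of opening /proc/<pid>/status. The
sketch below is only an illustration under that assumption, not what the patch
or check_if_command_finished() does; child_has_exited() is a hypothetical name,
and the approach only works for direct children of the calling process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch: report whether a direct
 * child has exited, without reaping it and without opening any file.
 * Returns 1 if the child has exited, 0 if it is still running, and
 * -1 on error (for example ECHILD when pid is not a waitable child).
 */
static int child_has_exited(pid_t pid)
{
	siginfo_t info;

	assert(pid > 0);
	memset(&info, 0, sizeof(info));
	/* WNOWAIT leaves the child waitable, so a later waitpid() still works. */
	if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOHANG | WNOWAIT) < 0)
		return -1;
	/* With WNOHANG, si_pid stays 0 when the child has not changed state. */
	return info.si_pid == pid ? 1 : 0;
}
```

Because the check uses no descriptor, it cannot fail with EMFILE, so a harness
at the RLIMIT_NOFILE ceiling could still notice finished children and release
their pipes.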

> diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
> index a350db071674a..ad1b941731ca1 100644
> --- a/tools/perf/tests/builtin-test.c
> +++ b/tools/perf/tests/builtin-test.c
[ ... ]
> @@ -508,6 +511,187 @@ static void finish_test(struct child_test **child_tests, int running_test, int c
>  	zfree(&child_tests[running_test]);
>  }
>  
> +static int finish_tests_parallel(struct child_test **child_tests, size_t num_tests, int width)
> +{
> +	size_t next_to_print = 0;
> +	struct pollfd *pfds;
> +	size_t *pfd_indices;
> +	size_t num_pfds = 0;
> +static void drain_child_process_err(struct child_test *child)

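The opening of finish_tests_parallel() stops after its local declarations and
is followed directly by the definition of drain_child_process_err(). This looks
like a partial copy left over from a merge or refactor. Does the resulting
syntax error break the build?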

> +{
> +	char buf[512];
> +	ssize_t len;
> +
> +	while ((len = read(child->process.err, buf, sizeof(buf) - 1)) > 0) {
> +		buf[len] = '\0';
> +		strbuf_addstr(&child->err_output, buf);
> +	}
> +}
[ ... ]
> +		if (num_pfds == 0) {
> +			if (running_count > 0)
> +				usleep(10 * 1000);
> +		} else {
> +			int pret = poll(pfds, num_pfds, 100);
> +
> +			if (pret > 0) {
> +				for (p = 0; p < num_pfds; p++) {
> +					if (pfds[p].revents) {
> +						size_t idx = pfd_indices[p];
> +						struct child_test *child = child_tests[idx];
> +
> +						drain_child_process_err(child);
> +					}
> +				}
> +			}
> +		}

If a child process closes its stderr pipe (reaching EOF) but continues
executing, poll() reports POLLHUP in revents for that descriptor.
drain_child_process_err() will read 0 bytes, but it does not close
child->process.err or set it to -1.

Will the next iteration add this open file descriptor back to the poll() array,
causing poll() to return immediately and create a tight busy loop until the
child finally exits?
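
A possible shape for the drain step that avoids the repeated POLLHUP wakeups is
sketched below. It assumes the read end is non-blocking, and both
set_nonblocking() and drain_and_close_on_eof() are hypothetical names rather
than functions from the patch. A real version would append the data to the
per-child buffer instead of only counting bytes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: put a descriptor into non-blocking mode. */
static int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Hypothetical helper, not part of the patch: drain a non-blocking pipe
 * read end. On EOF the descriptor is closed and *fd is set to -1 so the
 * caller can leave it out of the next poll() set.
 * Returns the number of bytes consumed, or -1 on a read error.
 */
static ssize_t drain_and_close_on_eof(int *fd)
{
	char buf[512];
	ssize_t total = 0;

	assert(fd != NULL && *fd >= 0);
	for (;;) {
		ssize_t len = read(*fd, buf, sizeof(buf));

		if (len > 0) {
			total += len;	/* A real caller would store buf here. */
			continue;
		}
		if (len == 0) {
			/* EOF: every write end is closed, stop polling this fd. */
			close(*fd);
			*fd = -1;
			return total;
		}
		if (errno == EINTR)
			continue;
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			return total;	/* No data right now, pipe still open. */
		return -1;
	}
}
```

With the descriptor set to -1 after EOF, the loop that builds the pollfd array
can skip that child, and the existing usleep() branch covers the case where the
child is still running but has no open pipe left.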

[ ... ]
> @@ -670,8 +854,9 @@ static int __cmd_test(struct test_suite **suites, int argc, const char *argv[],
>  		}
>  		if (!sequential) {
>  			/* Parallel mode starts tests but doesn't finish them. Do that now. */
> -			for (size_t x = 0; x < num_tests; x++)
> -				finish_test(child_tests, x, num_tests, width);
> +			err = finish_tests_parallel(child_tests, num_tests, width);
> +			if (err)
> +				goto err_out;
>  		}
>  	}
>  err_out:

If finish_tests_parallel() encounters an error and jumps to err_out, the cleanup
path loops over all children and calls the legacy finish_test() function.

Because finish_test() uses a local string buffer, it does not release the
dynamically allocated child->err_output populated by finish_tests_parallel().
Does this leak memory?

Additionally, finish_test() unconditionally calls finish_command(). If a child
was already reaped in finish_tests_parallel() (where child->done is true),
could this second call lead to ECHILD errors and corrupted test results?
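
To make both concerns concrete, here is a rough sketch of a cleanup step that
skips the wait for children that were already reaped and always frees the
captured output. struct child_sketch and cleanup_child() are hypothetical
stand-ins for illustration only; the real struct child_test and its strbuf are
not reproduced here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for struct child_test; field names are illustrative. */
struct child_sketch {
	pid_t pid;
	bool done;		/* Set once the child has been reaped. */
	char *err_output;	/* Heap buffer with captured stderr, or NULL. */
};

/*
 * Hypothetical cleanup: wait for the child only if it has not been
 * reaped yet, then release the captured output in every case.
 * Returns 0 on success or a negative errno value from waitpid().
 */
static int cleanup_child(struct child_sketch *child)
{
	int ret = 0;

	assert(child != NULL);
	if (!child->done) {
		int status;
		pid_t r;

		do {
			r = waitpid(child->pid, &status, 0);
		} while (r < 0 && errno == EINTR);
		if (r < 0)
			ret = -errno;	/* -ECHILD if it was already reaped. */
		child->done = true;
	}
	free(child->err_output);
	child->err_output = NULL;
	return ret;
}
```

Waiting a second time on a pid that has already been reaped fails with ECHILD,
which is the situation the done check is meant to avoid.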

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260513230450.529380-1-irogers@google.com?part=6
