Re: [PATCH v2 07/12] perf tests: Fix flakiness in trace record and replay test

From: sashiko-bot@kernel.org
To: "Ian Rogers" <irogers@google.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [PATCH v2 07/12] perf tests: Fix flakiness in trace record and replay test
Date: Tue, 16 Jun 2026 06:27:42 +0000
Message-ID: <20260616062743.1F12D1F000E9@smtp.kernel.org>
In-Reply-To: <20260616061404.41929-8-irogers@google.com>

Thank you for your contribution! Sashiko AI review found 2 potential issues to consider:
- [Medium] The `perf_record_with_retry` helper declares a `local logfile` which shadows the caller's `logfile` variable, breaking permission checks in callers like `test_uid`.
- [Medium] The `perf_record_cleanup` function indiscriminately deletes all temporary log files globally, destroying test isolation for parallel runs.
--

commit 26fa367e5372dff196ce764853c2479a57a2ed92
Author: Ian Rogers <irogers@google.com>

perf tests: Fix flakiness in trace record and replay test
    
This commit updates the trace_record_replay.sh test to use a robust retry
loop for capturing the required nanosleep events. It generalizes the
perf_record_with_retry helper to support a custom command prefix via the
PERF_RECORD_CMD environment variable.

> diff --git a/tools/perf/tests/shell/lib/perf_record.sh b/tools/perf/tests/shell/lib/perf_record.sh
> index fe5721427e58b..2c705840d5547 100644
> --- a/tools/perf/tests/shell/lib/perf_record.sh
> +++ b/tools/perf/tests/shell/lib/perf_record.sh
> @@ -21,9 +21,14 @@ perf_record_with_retry() {
>    local duration
>    local first_run=true
>    local ret=1
> +  local cmd_prefix="perf record"
> +  if [ -n "${PERF_RECORD_CMD}" ]; then
> +    cmd_prefix="${PERF_RECORD_CMD}"
> +  fi
> +
>    for duration in 0.01 0.1 0.3 1.0 2.0; do
>      rm -f "${perfdata}".old
> -    perf record "$@" -o "${perfdata}" ${testprog_base} ${duration} > "$logfile" 2>&1
> +    ${cmd_prefix} "$@" -o "${perfdata}" ${testprog_base} ${duration} > "$logfile" 2>&1

[Severity: Medium]
Does this code shadow the caller's logfile variable, breaking permission
checks in callers like test_uid()?

In perf_record_with_retry(), a local logfile is declared:

tools/perf/tests/shell/lib/perf_record.sh:perf_record_with_retry() {
  ...
  local logfile
  logfile=$(mktemp /tmp/__perf_record_retry.XXXXXX.log)
  ...
}

But the caller test_uid() sets its own logfile path, which the helper never
writes to, and then greps that path to determine the failure reason:

tools/perf/tests/shell/record.sh:test_uid() {
  local logfile
  logfile="/tmp/__perf_record_retry.$(id -u).$BASHPID.log"
  local ret=0
  perf_record_with_retry "${perfdata}" "check_uid" "perf test -w thloop" \
    -aB --synth=no --uid "$(id -u)" || ret=$?
  if [ $ret -eq 2 ]; then
    if grep -q -E "libbpf.*EPERM|Access to performance monitoring|Permission denied|Failure to open any events" \
      "$logfile"
    ...
}

Because perf_record_with_retry() redirects the output to its own local
logfile, the file that test_uid() greps is never written. The grep cannot
match, so the permission check in test_uid() never triggers.
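
The shadowing can be reproduced outside the test suite with a minimal
sketch. Everything here is hypothetical and for illustration only:
helper_with_local() stands in for perf_record_with_retry(), caller() stands
in for test_uid(), and the /tmp/__shadow_demo paths are made up.

```shell
# Hypothetical stand-in for perf_record_with_retry(): it declares its own
# local logfile and writes the captured output there.
helper_with_local() {
  local logfile
  logfile=$(mktemp /tmp/__shadow_demo.XXXXXX)
  echo "Permission denied" > "$logfile"
  helper_path="$logfile"   # global, kept only so the demo can clean up
  return 2
}

# Hypothetical stand-in for test_uid(): it sets its own logfile, calls the
# helper, then greps its own path, which nothing has written to.
caller() {
  local logfile="/tmp/__shadow_demo.caller.$$"
  rm -f "$logfile"
  local ret=0
  helper_with_local || ret=$?
  if grep -q "Permission denied" "$logfile" 2>/dev/null; then
    echo "caller saw the error text (ret=$ret)"
  else
    echo "caller log missing (ret=$ret): $logfile"
  fi
  rm -f "$helper_path"
}

caller
```

The helper's local declaration hides the caller's variable only inside the
helper, so the caller's path is left untouched and the grep takes the else
branch. One possible fix, not necessarily the one the author would choose,
is to have the helper export the path it used through a differently named
variable.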

> diff --git a/tools/perf/tests/shell/trace_record_replay.sh b/tools/perf/tests/shell/trace_record_replay.sh
> index 88d30a03dcecb..f27e32b186978 100755
> --- a/tools/perf/tests/shell/trace_record_replay.sh
> +++ b/tools/perf/tests/shell/trace_record_replay.sh

[ ... ]

> @@ -6,16 +6,26 @@
>  
>  # shellcheck source=lib/probe.sh
>  . "$(dirname $0)"/lib/probe.sh
> +# shellcheck source=lib/perf_record.sh
> +. "$(dirname $0)"/lib/perf_record.sh
>  
>  skip_if_no_perf_trace || exit 2
>  [ "$(id -u)" = 0 ] || exit 2
>  
>  file=$(mktemp /tmp/temporary_file.XXXXX)
>  
> -perf trace record -o ${file} sleep 1 || exit 1
> -if ! perf trace -i ${file} 2>&1 | grep nanosleep; then
> +check_nanosleep() {
> +  perf trace -i "${file}" 2>&1 | grep -q nanosleep
> +}
> +
> +PERF_RECORD_CMD="perf trace record" perf_record_with_retry "${file}" "check_nanosleep" "sleep"
> +err=$?
> +
> +perf_record_cleanup

[Severity: Medium]
Could calling perf_record_cleanup() here clobber the log files of other tests
running concurrently?

The perf_record_cleanup() function performs a wildcard deletion:

tools/perf/tests/shell/lib/perf_record.sh:perf_record_cleanup() {
  rm -f /tmp/__perf_record_retry.*.log
}

The glob matches every __perf_record_retry log in /tmp, including logs that
belong to other tests running at the same time (e.g., via perf test -p).
Deleting them breaks test isolation and can cause races and flaky failures
across the test suite.
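
One way to keep the cleanup scoped is to track the logs a shell creates and
delete only those. This is a minimal sketch under that assumption, not the
existing helper: demo_make_log(), demo_cleanup(), demo_logs, demo_last_log,
and the /tmp/__demo_retry prefix are all hypothetical names.

```shell
# Hypothetical variant: remember each log this shell created and delete
# only those, instead of globbing over every matching file in /tmp.
demo_logs=""

demo_make_log() {
  # The path is returned through the global demo_last_log rather than via
  # $(demo_make_log), because a command substitution runs in a subshell and
  # the update to demo_logs would be lost.
  demo_last_log=$(mktemp /tmp/__demo_retry.XXXXXX)
  demo_logs="${demo_logs} ${demo_last_log}"
}

demo_cleanup() {
  # Unquoted on purpose: paths from this mktemp template contain no
  # whitespace, so word splitting yields one argument per log.
  # shellcheck disable=SC2086
  rm -f ${demo_logs}
  demo_logs=""
}
```

With this shape, a log created by another process from the same template
survives demo_cleanup(), because it was never added to this shell's list.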

> +rm -f ${file}
> +
> +if [ $err -ne 0 ]; then
>  	echo "Failed: cannot find *nanosleep syscall"
>  	exit 1
>  fi

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260616061404.41929-1-irogers@google.com?part=7
