Re: [PATCH v2 07/12] perf tests: Fix flakiness in trace record and replay test

From: sashiko-bot@kernel.org
To: "Ian Rogers" <irogers@google.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [PATCH v2 07/12] perf tests: Fix flakiness in trace record and replay test
Date: Tue, 16 Jun 2026 06:27:42 +0000
Message-ID: <20260616062743.1F12D1F000E9@smtp.kernel.org>
In-Reply-To: <20260616061404.41929-8-irogers@google.com>

Thank you for your contribution! Sashiko AI review found 2 potential issues to consider:
- [Medium] The `perf_record_with_retry` helper declares a `local logfile` which shadows the caller's `logfile` variable, breaking permission checks in callers like `test_uid`.
- [Medium] The `perf_record_cleanup` function indiscriminately deletes all temporary log files globally, destroying test isolation for parallel runs.
--

commit 26fa367e5372dff196ce764853c2479a57a2ed92
Author: Ian Rogers <irogers@google.com>

perf tests: Fix flakiness in trace record and replay test
    
This commit updates the trace_record_replay.sh test to use a robust retry
loop for capturing the required nanosleep events. It generalizes the
perf_record_with_retry helper to support a custom command prefix via the
PERF_RECORD_CMD environment variable.

> diff --git a/tools/perf/tests/shell/lib/perf_record.sh b/tools/perf/tests/shell/lib/perf_record.sh
> index fe5721427e58b..2c705840d5547 100644
> --- a/tools/perf/tests/shell/lib/perf_record.sh
> +++ b/tools/perf/tests/shell/lib/perf_record.sh
> @@ -21,9 +21,14 @@ perf_record_with_retry() {
>    local duration
>    local first_run=true
>    local ret=1
> +  local cmd_prefix="perf record"
> +  if [ -n "${PERF_RECORD_CMD}" ]; then
> +    cmd_prefix="${PERF_RECORD_CMD}"
> +  fi
> +
>    for duration in 0.01 0.1 0.3 1.0 2.0; do
>      rm -f "${perfdata}".old
> -    perf record "$@" -o "${perfdata}" ${testprog_base} ${duration} > "$logfile" 2>&1
> +    ${cmd_prefix} "$@" -o "${perfdata}" ${testprog_base} ${duration} > "$logfile" 2>&1

[Severity: Medium]
Does this code shadow the caller's logfile variable, breaking permission
checks in callers like test_uid()?

In perf_record_with_retry(), a local logfile is declared:

tools/perf/tests/shell/lib/perf_record.sh:perf_record_with_retry() {
  ...
  local logfile
  logfile=$(mktemp /tmp/__perf_record_retry.XXXXXX.log)
  ...
}

But the caller test_uid() attempts to verify the failure reason by reading
its own unwritten logfile:

tools/perf/tests/shell/record.sh:test_uid() {
  local logfile
  logfile="/tmp/__perf_record_retry.$(id -u).$BASHPID.log"
  local ret=0
  perf_record_with_retry "${perfdata}" "check_uid" "perf test -w thloop" \
    -aB --synth=no --uid "$(id -u)" || ret=$?
  if [ $ret -eq 2 ]; then
    if grep -q -E "libbpf.*EPERM|Access to performance monitoring|Permission denied|Failure to open any events" \
      "$logfile"
    ...
}

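Because perf_record_with_retry() redirects perf's output to its own local
logfile, the path held by test_uid() is never written. The grep in
test_uid() therefore cannot match, and the permission failure goes
undetected.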
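
The shadowing can be reproduced outside the perf tree. The following is a
minimal sketch, not the actual helpers: record_demo, caller_demo and the
RECORD_DEMO_LOG global are hypothetical names invented for illustration, and
publishing the path through a global is only one possible way for the caller
to find the file the helper wrote.

```bash
#!/bin/bash
# Hypothetical stand-in for perf_record_with_retry(); not the real helper.
record_demo() {
  local logfile                       # hides the caller's variable of the same name
  logfile=$(mktemp /tmp/__shadow_demo.XXXXXX)
  echo "Permission denied" > "$logfile"
  RECORD_DEMO_LOG="$logfile"          # assumption: expose the real path via a global
  return 2
}

# Hypothetical stand-in for test_uid().
caller_demo() {
  local logfile="/tmp/__shadow_demo.caller.$$"
  local ret=0
  record_demo || ret=$?
  # The caller's own path was never written, so this check cannot match.
  if grep -q "Permission denied" "$logfile" 2>/dev/null; then
    echo "caller path: matched"
  else
    echo "caller path: no match"
  fi
  # Reading the path the helper actually wrote does match.
  if grep -q "Permission denied" "$RECORD_DEMO_LOG"; then
    echo "helper path: matched"
  fi
  rm -f "$RECORD_DEMO_LOG"
}

caller_demo
```

The first check fails because the helper's local variable only exists inside
the helper; the caller still holds its own, unwritten path.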

> diff --git a/tools/perf/tests/shell/trace_record_replay.sh b/tools/perf/tests/shell/trace_record_replay.sh
> index 88d30a03dcecb..f27e32b186978 100755
> --- a/tools/perf/tests/shell/trace_record_replay.sh
> +++ b/tools/perf/tests/shell/trace_record_replay.sh

[ ... ]

> @@ -6,16 +6,26 @@
>  
>  # shellcheck source=lib/probe.sh
>  . "$(dirname $0)"/lib/probe.sh
> +# shellcheck source=lib/perf_record.sh
> +. "$(dirname $0)"/lib/perf_record.sh
>  
>  skip_if_no_perf_trace || exit 2
>  [ "$(id -u)" = 0 ] || exit 2
>  
>  file=$(mktemp /tmp/temporary_file.XXXXX)
>  
> -perf trace record -o ${file} sleep 1 || exit 1
> -if ! perf trace -i ${file} 2>&1 | grep nanosleep; then
> +check_nanosleep() {
> +  perf trace -i "${file}" 2>&1 | grep -q nanosleep
> +}
> +
> +PERF_RECORD_CMD="perf trace record" perf_record_with_retry "${file}" "check_nanosleep" "sleep"
> +err=$?
> +
> +perf_record_cleanup

[Severity: Medium]
Could calling perf_record_cleanup() here clobber the log files of other tests
running concurrently?

The perf_record_cleanup() function performs a wildcard deletion:

tools/perf/tests/shell/lib/perf_record.sh:perf_record_cleanup() {
  rm -f /tmp/__perf_record_retry.*.log
}

This will indiscriminately delete all temporary log files globally,
destroying test isolation and potentially leading to race conditions and
unpredictable flaky failures across the test suite (e.g., via perf test -p).
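
One way to keep cleanup scoped is to remember the files a given shell
created and remove only those. This is a rough sketch under that assumption,
not a proposed patch: retry_logs_demo, new_retry_log_demo,
cleanup_retry_logs_demo and NEW_LOG_DEMO are all hypothetical names.

```bash
#!/bin/bash
# Hypothetical sketch: track only the log files created by this shell.
retry_logs_demo=()

new_retry_log_demo() {
  # The path is returned in NEW_LOG_DEMO instead of being echoed, so the
  # array update happens in the calling shell and not in a subshell.
  NEW_LOG_DEMO=$(mktemp /tmp/__cleanup_demo.XXXXXX) || return 1
  retry_logs_demo+=("$NEW_LOG_DEMO")
}

cleanup_retry_logs_demo() {
  # Remove only the recorded files; no wildcard is expanded.
  if [ "${#retry_logs_demo[@]}" -gt 0 ]; then
    rm -f "${retry_logs_demo[@]}"
  fi
  retry_logs_demo=()
}

# Stands in for a log that belongs to a concurrently running test.
other_test_log=$(mktemp /tmp/__cleanup_demo.XXXXXX)

new_retry_log_demo
own_log=$NEW_LOG_DEMO
cleanup_retry_logs_demo

if [ ! -e "$own_log" ]; then echo "own log removed"; fi
if [ -e "$other_test_log" ]; then echo "other log kept"; fi
rm -f "$other_test_log"
```

Both files match the same name pattern, yet only the one recorded in the
array is deleted, which is the isolation property the wildcard rm lacks.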

> +rm -f ${file}
> +
> +if [ $err -ne 0 ]; then
>  	echo "Failed: cannot find *nanosleep syscall"
>  	exit 1
>  fi

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260616061404.41929-1-irogers@google.com?part=7
