[PATCH v3 08/13] perf tests: Fix flakiness in BPF counters test on hybrid systems

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Ian Rogers <irogers@google.com>
To: irogers@google.com, acme@kernel.org, namhyung@kernel.org
Cc: adrian.hunter@intel.com, james.clark@linaro.org,
	jolsa@kernel.org,  leo.yan@arm.com, linux-kernel@vger.kernel.org,
	 linux-perf-users@vger.kernel.org, mingo@redhat.com,
	peterz@infradead.org,  thomas.falcon@intel.com,
	tmricht@linux.ibm.com
Subject: [PATCH v3 08/13] perf tests: Fix flakiness in BPF counters test on hybrid systems
Date: Tue, 16 Jun 2026 09:48:13 -0700	[thread overview]
Message-ID: <20260616164819.370939-9-irogers@google.com> (raw)
In-Reply-To: <20260616164819.370939-1-irogers@google.com>

The `perf stat --bpf-counters test` fails intermittently on hybrid
architectures or systems with dynamic frequency scaling (DVFS). This
happens because the test workload (`sqrtloop`) runs for a fixed 1-second
duration, and the CPU frequency can scale dynamically between idle and
maximum frequency. As the first run runs on a cold CPU and the second run
runs on a warmed-up CPU (or vice versa), the number of instructions
executed in 1 second differs by up to 2.2x, violating the comparison
tolerance.

Also, when running as root, BPF tracepoints and scheduling programs
trigger frequently. Since standard `perf stat -e instructions` measures
both user and kernel space instructions, it counts BPF helper and program
execution overheads, whereas the BPF counters themselves do not self-
measure. This introduces a large kernel-space instruction count
discrepancy between standard and BPF counters.

Fix these issues by:
1. Switching the workload to a strictly deterministic, iteration-based
   workload: `awk 'BEGIN { for (i=0; i<10000000; i++) sum+=i }'`. We pin
the
   workload to a single random allowed CPU using `taskset -c $CPU` via a
bash array.
2. Restricting the counted event to user-space only (`instructions:u` or
`/u`).
3. Tightening the comparison tolerance from 20% to 15%.

These modifications isolate the measurements to user-space instructions of
the deterministic loop, which executes a virtually identical number of
instructions on both runs (with less than 0.001% variation), eliminating
Dynamic Frequency Scaling (DVFS), kernel scheduling noise, and BPF helper
self-measurement overheads.

Fixes: 2c0cb9f56020 ("perf test: Add a shell test for 'perf stat --bpf-counters' new option")
Assisted-by: Antigravity:gemini-3.1-pro
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/shell/stat_bpf_counters.sh | 28 +++++++++++++--------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/tools/perf/tests/shell/stat_bpf_counters.sh b/tools/perf/tests/shell/stat_bpf_counters.sh
index 35463358b273..11de77ee38ad 100755
--- a/tools/perf/tests/shell/stat_bpf_counters.sh
+++ b/tools/perf/tests/shell/stat_bpf_counters.sh
@@ -4,21 +4,26 @@

 set -e

-workload="perf test -w sqrtloop"
+# Get the first allowed CPU
+CPU=$(taskset -c -p $$ | awk -F': ' '{print $2}' | awk -F'[,-]' '{print $1}')
+if [ -z "$CPU" ]; then
+	CPU=0
+fi
+workload=(taskset -c "$CPU" awk 'BEGIN { for (i=0; i<10000000; i++) sum+=i }')

-# check whether $2 is within +/- 20% of $1
+# check whether $2 is within +/- 15% of $1
 compare_number()
 {
 	first_num=$1
 	second_num=$2

-	# upper bound is first_num * 120%
-	upper=$(expr $first_num + $first_num / 5 )
-	# lower bound is first_num * 80%
-	lower=$(expr $first_num - $first_num / 5 )
+	# upper bound is first_num * 115%
+	upper=$(expr $first_num + $first_num / 20 \* 3 )
+	# lower bound is first_num * 85%
+	lower=$(expr $first_num - $first_num / 20 \* 3 )

 	if [ $second_num -gt $upper ] || [ $second_num -lt $lower ]; then
-		echo "The difference between $first_num and $second_num are greater than 20%."
+		echo "The difference between $first_num and $second_num are greater than 15%."
 		exit 1
 	fi
 }
@@ -41,11 +46,12 @@ check_counts()
 test_bpf_counters()
 {
 	printf "Testing --bpf-counters "
-	base_instructions=$(perf stat --no-big-num -e instructions -- $workload 2>&1 | \
+	base_instructions=$(perf stat --no-big-num -e instructions:u -- "${workload[@]}" 2>&1 | \
 				awk -v i=0 -v c=0 '/instructions/ { \
 					if ($1 != "<not") { i++; c += $1 } \
 				} END { if (i > 0) printf "%.0f", c; else print "<not" }')
-	bpf_instructions=$(perf stat --no-big-num --bpf-counters -e instructions -- $workload  2>&1 | \
+	bpf_instructions=$(perf stat --no-big-num --bpf-counters -e instructions:u \
+				-- "${workload[@]}"  2>&1 | \
 				awk -v i=0 -v c=0 '/instructions/ { \
 					if ($1 != "<not") { i++; c += $1 } \
 				} END { if (i > 0) printf "%.0f", c; else print "<not" }')
@@ -57,7 +63,9 @@ test_bpf_counters()
 test_bpf_modifier()
 {
 	printf "Testing bpf event modifier "
-	stat_output=$(perf stat --no-big-num -e instructions/name=base_instructions/,instructions/name=bpf_instructions/b -- $workload 2>&1)
+	stat_output=$(perf stat --no-big-num \
+		-e instructions/name=base_instructions/u,instructions/name=bpf_instructions/bu \
+		-- "${workload[@]}" 2>&1)
 	base_instructions=$(echo "$stat_output"| \
 				awk -v i=0 -v c=0 '/base_instructions/ { \
 					if ($1 != "<not") { i++; c += $1 } \
-- 
2.54.0.1189.g8c84645362-goog

next prev parent reply	other threads:[~2026-06-16 16:48 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-16  1:27 [PATCH v1 00/12] perf tests: Enhancements, speedups, and flakiness fixes Ian Rogers
2026-06-16  1:27 ` [PATCH v1 01/12] perf parse-events: Restrict core PMU bypass to --cputype option Ian Rogers
2026-06-16  1:44   ` sashiko-bot
2026-06-16  1:27 ` [PATCH v1 02/12] perf test: Truncate test description to fit terminal width Ian Rogers
2026-06-16  1:38   ` sashiko-bot
2026-06-16  1:27 ` [PATCH v1 03/12] perf tests workloads: Support sub-second durations in noploop and thloop Ian Rogers
2026-06-16  1:35   ` sashiko-bot
2026-06-16  1:27 ` [PATCH v1 04/12] perf tests: Add robust record retry helper and use subsecond workloads Ian Rogers
2026-06-16  1:38   ` sashiko-bot
2026-06-16  1:27 ` [PATCH v1 05/12] perf tests: Skip metrics validation if system-wide recording lacks permission Ian Rogers
2026-06-16  1:41   ` sashiko-bot
2026-06-16  1:27 ` [PATCH v1 06/12] perf tests: Fix Python JIT dump profiling test failure Ian Rogers
2026-06-16  1:39   ` sashiko-bot
2026-06-16  1:27 ` [PATCH v1 07/12] perf tests: Fix flakiness in trace record and replay test Ian Rogers
2026-06-16  1:42   ` sashiko-bot
2026-06-16  1:27 ` [PATCH v1 08/12] perf tests: Fix flakiness in BPF counters test on hybrid systems Ian Rogers
2026-06-16  1:35   ` sashiko-bot
2026-06-16  1:27 ` [PATCH v1 09/12] perf tests: Fix flakiness in branch stack sampling tests Ian Rogers
2026-06-16  1:27 ` [PATCH v1 10/12] perf tests: Speed up off-cpu profiling tests Ian Rogers
2026-06-16  1:41   ` sashiko-bot
2026-06-16  1:27 ` [PATCH v1 11/12] perf tests: Speed up lock contention analysis shell test Ian Rogers
2026-06-16  1:27 ` [PATCH v1 12/12] perf tests: Speed up metrics checking shell tests Ian Rogers
2026-06-16  6:13 ` [PATCH v2 00/12] perf tests: Enhance robustness, speed up execution, and fix flakiness Ian Rogers
2026-06-16  6:13   ` [PATCH v2 01/12] perf parse-events: Restrict core PMU bypass to --cputype option Ian Rogers
2026-06-16  6:31     ` sashiko-bot
2026-06-16 15:14     ` Arnaldo Carvalho de Melo
2026-06-16 15:17     ` Arnaldo Carvalho de Melo
2026-06-16  6:13   ` [PATCH v2 02/12] perf test: Truncate test description to fit terminal width Ian Rogers
2026-06-16  6:24     ` sashiko-bot
2026-06-16 15:25     ` Arnaldo Carvalho de Melo
2026-06-16  6:13   ` [PATCH v2 03/12] perf tests workloads: Support sub-second durations in noploop and thloop Ian Rogers
2026-06-16  6:22     ` sashiko-bot
2026-06-16  6:13   ` [PATCH v2 04/12] perf tests: Add robust record retry helper and use subsecond workloads Ian Rogers
2026-06-16  6:27     ` sashiko-bot
2026-06-16  6:13   ` [PATCH v2 05/12] perf tests: Skip metrics validation if system-wide recording lacks permission Ian Rogers
2026-06-16  6:13   ` [PATCH v2 06/12] perf tests: Fix Python JIT dump profiling test failure Ian Rogers
2026-06-16  6:27     ` sashiko-bot
2026-06-16  6:13   ` [PATCH v2 07/12] perf tests: Fix flakiness in trace record and replay test Ian Rogers
2026-06-16  6:27     ` sashiko-bot
2026-06-16  6:14   ` [PATCH v2 08/12] perf tests: Fix flakiness in BPF counters test on hybrid systems Ian Rogers
2026-06-16  6:14   ` [PATCH v2 09/12] perf tests: Fix flakiness in branch stack sampling tests Ian Rogers
2026-06-16  6:14   ` [PATCH v2 10/12] perf tests: Speed up off-cpu profiling tests Ian Rogers
2026-06-16  6:25     ` sashiko-bot
2026-06-16  6:14   ` [PATCH v2 11/12] perf tests: Speed up lock contention analysis shell test Ian Rogers
2026-06-16  6:14   ` [PATCH v2 12/12] perf tests: Speed up metrics checking shell tests Ian Rogers
2026-06-16 16:48   ` [PATCH v3 00/13] perf tests: Robustness and performance improvements Ian Rogers
2026-06-16 16:48     ` [PATCH v3 01/13] perf parse-events: Restrict core PMU bypass to --cputype option Ian Rogers
2026-06-16 16:48     ` [PATCH v3 02/13] perf test: Truncate test description to fit terminal width Ian Rogers
2026-06-16 16:48     ` [PATCH v3 03/13] perf tests workloads: Support sub-second durations in noploop and thloop Ian Rogers
2026-06-16 16:48     ` [PATCH v3 04/13] perf tests: Add robust record retry helper and use subsecond workloads Ian Rogers
2026-06-16 16:48     ` [PATCH v3 05/13] perf tests: Skip metrics validation if system-wide recording lacks permission Ian Rogers
2026-06-16 16:48     ` [PATCH v3 06/13] perf tests: Fix Python JIT dump profiling test failure Ian Rogers
2026-06-16 16:48     ` [PATCH v3 07/13] perf tests: Fix flakiness in trace record and replay test Ian Rogers
2026-06-16 16:48     ` Ian Rogers [this message]
2026-06-16 16:48     ` [PATCH v3 09/13] perf tests: Fix flakiness in branch stack sampling tests Ian Rogers
2026-06-16 16:48     ` [PATCH v3 10/13] perf tests: Speed up off-cpu profiling tests Ian Rogers
2026-06-16 16:48     ` [PATCH v3 11/13] perf tests: Speed up lock contention analysis shell test Ian Rogers
2026-06-16 16:48     ` [PATCH v3 12/13] perf tests: Speed up metrics checking shell tests Ian Rogers
2026-06-16 16:48     ` [PATCH v3 13/13] perf tests: Include error output for skipped tests in JUnit XML Ian Rogers

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:35463358b27 dfblob:11de77ee38a )
 OR (
bs:"[PATCH v3 08/13] perf tests: Fix flakiness in BPF counters test on hybrid systems" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260616164819.370939-9-irogers@google.com \
    --to=irogers@google.com \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=james.clark@linaro.org \
    --cc=jolsa@kernel.org \
    --cc=leo.yan@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=thomas.falcon@intel.com \
    --cc=tmricht@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.