From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6F4293AF64D for ; Tue, 23 Jun 2026 21:28:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1782250141; cv=none; b=gOx53ixPr0DJjc/0/jqgZpcMTQ7kLajN426FybEW4f6mJE4mkg7v2WPftoPJR3CGXOqrt/FJdAV1VdhFHGVmYJXZ00zmHR6xcT+2dJBmwQLTwpdTzsJQB/mzgfSJXLOgP/PT5fZ1LV6o6GtKSNlQnoHNfYWsCmKSMi2ZBHyIezA= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1782250141; c=relaxed/simple; bh=o+ABm3dkiCUhLoTt96yKe7Xc1iDg0sm5q0uDAutw5/o=; h=From:Subject:To:Cc:In-Reply-To:References:Content-Type:Date: Message-Id; b=W55U6sdgVxzlCNamb5egHT6fm5jwcFjTbP22gOY0JfhJQvByqrvzLW5kyXNMYSIsVcuBdzbW2o/PFbwuHmzHi7e+q+dhs+7MS9/CLifFKzpRp9j+GTN542zzn536EsXhUlhi43tNR1DkuvnpzrFwmmGs47o0n4DtBTnMya01n4c= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=TLUTRA9k; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="TLUTRA9k" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A63D31F000E9; Tue, 23 Jun 2026 21:28:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1782250139; bh=oNILxZGHc2b8wEOzil8RP7rCTro9MjFJwbjttJ373LQ=; h=From:Subject:Reply-To:To:Cc:In-Reply-To:References:Date; b=TLUTRA9k+7z4GfXCA/eWbZB1RZZJTrrbMIfWtbfmuwdYD+imXumqLMJ0vQ0kDuC1Q +ny9Qky0thNZ4BJFcgWaQ1IKgN0F7DCRR3IowIBpZ4tflyGN8e0Th/+2XJ9VMuakfi Mc2Iy6VQ4T2G7A3xMNIuv3VjQHL7b9KYiGHNv185X9kE9GWK2R6CAUoRRtY7YEyG1l ZsGopT3PjCGD8kDS6ILWQtUSyW9vk7kfNwlsxHfY5QL8uMePuUlVgsFi2Spvia3OWb 5/hd3PXlrmlMrLGjDz52X9o76/O+zv9DqGxBSyMaFru5xy8I7RZXat39QACXhIX3bO R0pWEauX5pxnQ== From: sashiko-bot@kernel.org Subject: Re: [PATCH v4 04/13] perf tests: Add robust record retry helper and use subsecond workloads Reply-To: sashiko-reviews@lists.linux.dev To: "Ian Rogers" Cc: linux-perf-users@vger.kernel.org In-Reply-To: <20260623012758.2291858-5-irogers@google.com> References: <20260616164819.370939-1-irogers@google.com> <20260623012758.2291858-1-irogers@google.com> <20260623012758.2291858-5-irogers@google.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Date: Tue, 23 Jun 2026 21:28:59 +0000 Message-Id: <20260623212859.A63D31F000E9@smtp.kernel.org> Precedence: bulk X-Mailing-List: linux-perf-users@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Thank you for your contribution! Sashiko AI review found 3 potential issue(= s) to consider: - [Medium] Commit message inaccuracy and incomplete refactoring regarding `= pipe_test.sh` and `stat_all_pfm.sh`. - [Medium] Synchronous command execution in `test_kvm_record_report` lacks = error checking, causing fatal script aborts under `set -e`. - [Medium] `perf_record_with_retry` permanently leaks and disables the glob= al `set -e` (errexit) flag when invoked within an OR list. -- commit d89e4e460291f1a110feac7d87fda40226859958 Author: Ian Rogers perf tests: Add robust record retry helper and use subsecond workloads Introduce perf_record_with_retry and perf_record_cleanup in a shared library tests/shell/lib/perf_record.sh to prevent record test failures caused by transient recording or workload delays. > Update record.sh, record_lbr.sh, pipe_test.sh, kvm.sh, and > stat_all_pfm.sh to use this robust record retry logic. These tests now > start with very short durations (e.g. 0.01 seconds) and scale up if the > initial recording failed to capture samples, significantly improving test > execution speed on success while remaining resilient to slow systems. [Severity: Medium] This isn't a bug, but the commit message states that stat_all_pfm.sh is upd= ated=20 to use the robust retry logic, yet stat_all_pfm.sh isn't modified in this p= atch. > diff --git a/tools/perf/tests/shell/kvm.sh b/tools/perf/tests/shell/kvm.sh > index f88e859025c42..a5396f8e6fe5c 100755 > --- a/tools/perf/tests/shell/kvm.sh > +++ b/tools/perf/tests/shell/kvm.sh [ ... ] > @@ -60,22 +71,26 @@ test_kvm_stat() { > test_kvm_record_report() { > echo "Testing perf kvm record/report" > =20 > - echo "Recording kvm profile for pid ${qemu_pid}..." [ ... ] > + local duration > + local success=3Dfalse > + for duration in 1 2 4 8; do > + echo "Recording kvm profile for pid ${qemu_pid} (duration ${duration}s= )..." > + rm -f "${perfdata}" "${perfdata}".old > + > + perf kvm --host record -p "${qemu_pid}" -o "${perfdata}" \ > + -e cpu-clock sleep ${duration} [Severity: Medium] Does this unguarded execution bypass the retry loop if it fails? Since the kvm.sh script operates with `set -e` enabled globally, if=20 `perf kvm record` returns a non-zero exit code here, the script will immedi= ately=20 abort entirely. Should this be guarded, perhaps by appending `|| true` to prevent the fatal= =20 abort so the retry loop can continue? > + > + if [ -e "${perfdata}" ] && \ > + perf kvm -i "${perfdata}" report --stdio 2>&1 | grep -q "Event count"= ; then > + success=3Dtrue > + break > + fi > + echo "No samples or report failed, retrying..." > + done > diff --git a/tools/perf/tests/shell/lib/perf_record.sh b/tools/perf/tests= /shell/lib/perf_record.sh > new file mode 100644 > index 0000000000000..e137fa75370de > --- /dev/null > +++ b/tools/perf/tests/shell/lib/perf_record.sh > @@ -0,0 +1,53 @@ > +# SPDX-License-Identifier: GPL-2.0 > + > +PERF_RECORD_LOGS=3D() > + > +perf_record_with_retry() { [ ... ] > + # Save the e flag state and disable it > + local save_e > + if [[ $- =3D=3D *e* ]]; then > + save_e=3D"set -e" > + else > + save_e=3D"set +e" > + fi [Severity: Medium] Does this logic permanently disable the `set -e` flag in the caller when=20 invoked within an OR list? In bash, executing a function on the left side of an `||` operator temporar= ily=20 suspends `errexit` and explicitly removes the `e` flag from `$-`. For example, when `record.sh` invokes this function: perf_record_with_retry ... || ret=3D$? `[[ $- =3D=3D *e* ]]` will evaluate to false, causing the function to save = the state=20 as `set +e`. When the function finishes, it evaluates this saved state,=20 permanently disabling `set -e` globally for the remainder of the calling sc= ript=20 and silently masking all subsequent command failures. > + set +e > + > + local duration > + local first_run=3Dtrue > + local ret=3D1 > diff --git a/tools/perf/tests/shell/pipe_test.sh b/tools/perf/tests/shell= /pipe_test.sh > index e459aa99a9515..ce68d850c9838 100755 > --- a/tools/perf/tests/shell/pipe_test.sh > +++ b/tools/perf/tests/shell/pipe_test.sh > @@ -12,8 +12,8 @@ skip_test_missing_symbol ${sym} > =20 > data=3D$(mktemp /tmp/perf.data.XXXXXX) > data2=3D$(mktemp /tmp/perf.data2.XXXXXX) > -prog=3D"perf test -w noploop" > -[ "$(uname -m)" =3D "s390x" ] && prog=3D"$prog 3" > +prog=3D"perf test -w noploop 0.1" > +[ "$(uname -m)" =3D "s390x" ] && prog=3D"perf test -w noploop 3" [Severity: Medium] Will this hardcoded shorter duration make the test more brittle? The commit message indicates this script was updated to use the robust retr= y=20 logic, but it appears to only use a hardcoded 0.1s duration here. Relying o= n a=20 single 0.1s run without retries might lead to transient failures on slow sy= stems. > err=3D0 > =20 > set -e --=20 Sashiko AI review =C2=B7 https://sashiko.dev/#/patchset/20260623012758.2291= 858-1-irogers@google.com?part=3D4