* [PATCH v1] perf test: Be more tolerant of metricgroup failures
@ 2024-04-03 16:48 Ian Rogers
2024-04-04 18:08 ` Namhyung Kim
[not found] ` <CAGa8GX5MhP3FUhafN5QivHaE3Fg+p5MgvTq3SW6MQy4NeZ1sYQ@mail.gmail.com>
0 siblings, 2 replies; 5+ messages in thread
From: Ian Rogers @ 2024-04-03 16:48 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Athira Rajeev, linux-perf-users,
linux-kernel
Previously "set -e" meant any non-zero exit code from perf stat would
cause a test failure. As a non-zero exit happens when there aren't
sufficient permissions, check for this case and make the exit code
2/skip for it.
Signed-off-by: Ian Rogers <irogers@google.com>
---
.../perf/tests/shell/stat_all_metricgroups.sh | 28 +++++++++++++++----
1 file changed, 22 insertions(+), 6 deletions(-)
diff --git a/tools/perf/tests/shell/stat_all_metricgroups.sh b/tools/perf/tests/shell/stat_all_metricgroups.sh
index 55ef9c9ded2d..d6db192b9f18 100755
--- a/tools/perf/tests/shell/stat_all_metricgroups.sh
+++ b/tools/perf/tests/shell/stat_all_metricgroups.sh
@@ -1,9 +1,7 @@
-#!/bin/sh
+#!/bin/bash
# perf all metricgroups test
# SPDX-License-Identifier: GPL-2.0
-set -e
-
ParanoidAndNotRoot()
{
[ "$(id -u)" != 0 ] && [ "$(cat /proc/sys/kernel/perf_event_paranoid)" -gt $1 ]
@@ -14,11 +12,29 @@ if ParanoidAndNotRoot 0
then
system_wide_flag=""
fi
-
+err=0
for m in $(perf list --raw-dump metricgroups)
do
echo "Testing $m"
- perf stat -M "$m" $system_wide_flag sleep 0.01
+ result=$(perf stat -M "$m" $system_wide_flag sleep 0.01 2>&1)
+ result_err=$?
+ if [[ $result_err -gt 0 ]]
+ then
+ if [[ "$result" =~ \
+ "Access to performance monitoring and observability operations is limited" ]]
+ then
+ echo "Permission failure"
+ echo $result
+ if [[ $err -eq 0 ]]
+ then
+ err=2 # Skip
+ fi
+ else
+ echo "Metric group $m failed"
+ echo $result
+ err=1 # Fail
+ fi
+ fi
done
-exit 0
+exit $err
--
2.44.0.478.gd926399ef9-goog
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH v1] perf test: Be more tolerant of metricgroup failures 2024-04-03 16:48 [PATCH v1] perf test: Be more tolerant of metricgroup failures Ian Rogers @ 2024-04-04 18:08 ` Namhyung Kim [not found] ` <CAGa8GX5MhP3FUhafN5QivHaE3Fg+p5MgvTq3SW6MQy4NeZ1sYQ@mail.gmail.com> 1 sibling, 0 replies; 5+ messages in thread From: Namhyung Kim @ 2024-04-04 18:08 UTC (permalink / raw) To: Ian Rogers Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter, Athira Rajeev, linux-perf-users, linux-kernel Hi Ian, On Wed, Apr 3, 2024 at 9:48 AM Ian Rogers <irogers@google.com> wrote: > > Previously "set -e" meant any non-zero exit code from perf stat would > cause a test failure. As a non-zero exit happens when there aren't > sufficient permissions, check for this case and make the exit code > 2/skip for it. > > Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Thanks, Namhyung > --- > .../perf/tests/shell/stat_all_metricgroups.sh | 28 +++++++++++++++---- > 1 file changed, 22 insertions(+), 6 deletions(-) > > diff --git a/tools/perf/tests/shell/stat_all_metricgroups.sh b/tools/perf/tests/shell/stat_all_metricgroups.sh > index 55ef9c9ded2d..d6db192b9f18 100755 > --- a/tools/perf/tests/shell/stat_all_metricgroups.sh > +++ b/tools/perf/tests/shell/stat_all_metricgroups.sh > @@ -1,9 +1,7 @@ > -#!/bin/sh > +#!/bin/bash > # perf all metricgroups test > # SPDX-License-Identifier: GPL-2.0 > > -set -e > - > ParanoidAndNotRoot() > { > [ "$(id -u)" != 0 ] && [ "$(cat /proc/sys/kernel/perf_event_paranoid)" -gt $1 ] > @@ -14,11 +12,29 @@ if ParanoidAndNotRoot 0 > then > system_wide_flag="" > fi > - > +err=0 > for m in $(perf list --raw-dump metricgroups) > do > echo "Testing $m" > - perf stat -M "$m" $system_wide_flag sleep 0.01 > + result=$(perf stat -M "$m" $system_wide_flag sleep 0.01 2>&1) > + result_err=$? > + if [[ $result_err -gt 0 ]] > + then > + if [[ "$result" =~ \ > + "Access to performance monitoring and observability operations is limited" ]] > + then > + echo "Permission failure" > + echo $result > + if [[ $err -eq 0 ]] > + then > + err=2 # Skip > + fi > + else > + echo "Metric group $m failed" > + echo $result > + err=1 # Fail > + fi > + fi > done > > -exit 0 > +exit $err > -- > 2.44.0.478.gd926399ef9-goog > ^ permalink raw reply [flat|nested] 5+ messages in thread
[parent not found: <CAGa8GX5MhP3FUhafN5QivHaE3Fg+p5MgvTq3SW6MQy4NeZ1sYQ@mail.gmail.com>]
* Re: [PATCH v1] perf test: Be more tolerant of metricgroup failures [not found] ` <CAGa8GX5MhP3FUhafN5QivHaE3Fg+p5MgvTq3SW6MQy4NeZ1sYQ@mail.gmail.com> @ 2024-04-22 15:42 ` Ian Rogers 2024-04-25 18:58 ` Arnaldo Carvalho de Melo 2024-04-29 14:49 ` Veronika Molnarova 0 siblings, 2 replies; 5+ messages in thread From: Ian Rogers @ 2024-04-22 15:42 UTC (permalink / raw) To: Veronika Molnarova Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter, Athira Rajeev, linux-perf-users, linux-kernel On Mon, Apr 22, 2024 at 4:51 AM Veronika Molnarova <vmolnaro@redhat.com> wrote: > > Hi Ian, > > On Wed, Apr 3, 2024 at 6:48 PM Ian Rogers <irogers@google.com> wrote: >> >> Previously "set -e" meant any non-zero exit code from perf stat would >> cause a test failure. As a non-zero exit happens when there aren't >> sufficient permissions, check for this case and make the exit code >> 2/skip for it. >> >> Signed-off-by: Ian Rogers <irogers@google.com> >> --- >> .../perf/tests/shell/stat_all_metricgroups.sh | 28 +++++++++++++++---- >> 1 file changed, 22 insertions(+), 6 deletions(-) >> >> diff --git a/tools/perf/tests/shell/stat_all_metricgroups.sh b/tools/perf/tests/shell/stat_all_metricgroups.sh >> index 55ef9c9ded2d..d6db192b9f18 100755 >> --- a/tools/perf/tests/shell/stat_all_metricgroups.sh >> +++ b/tools/perf/tests/shell/stat_all_metricgroups.sh >> @@ -1,9 +1,7 @@ >> -#!/bin/sh >> +#!/bin/bash >> # perf all metricgroups test >> # SPDX-License-Identifier: GPL-2.0 >> >> -set -e >> - >> ParanoidAndNotRoot() >> { >> [ "$(id -u)" != 0 ] && [ "$(cat /proc/sys/kernel/perf_event_paranoid)" -gt $1 ] >> @@ -14,11 +12,29 @@ if ParanoidAndNotRoot 0 >> then >> system_wide_flag="" >> fi >> - >> +err=0 >> for m in $(perf list --raw-dump metricgroups) >> do >> echo "Testing $m" >> - perf stat -M "$m" $system_wide_flag sleep 0.01 >> + result=$(perf stat -M "$m" $system_wide_flag sleep 0.01 2>&1) >> + result_err=$? >> + if [[ $result_err -gt 0 ]] >> + then >> + if [[ "$result" =~ \ >> + "Access to performance monitoring and observability operations is limited" ]] >> + then >> + echo "Permission failure" >> + echo $result >> + if [[ $err -eq 0 ]] >> + then >> + err=2 # Skip >> + fi >> + else >> + echo "Metric group $m failed" >> + echo $result >> + err=1 # Fail >> + fi >> + fi >> done >> >> -exit 0 >> +exit $err >> -- >> 2.44.0.478.gd926399ef9-goog >> >> > > The patch looks good and thanks for taking care of it. > > Just wanted to check what is the desired outcome for metric groups > with events that are invalid in per-thread mode causing the test to fail. > > ``` > $ ./stat_all_metricgroups.sh > Testing smi > Metric group smi failed > Error: Invalid event (msr/smi/u) in per-thread mode, enable system wide with '-a'. > ``` > > Wouldn't it be better if in these cases the test would result in skip instead of fail? Hi Veronika, I agree that fail isn't best here. I'm wondering: - why doesn't msr/smi/ support per-thread mode? Can't we save/restore the count on a context switch? The implementation is here: https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/arch/x86/events/msr.c?h=perf-tools-next#n234 There's clearly something going on as pperf appears to have other restrictions: https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/arch/x86/events/msr.c?h=perf-tools-next#n115 I'm wondering if aggregation is working right if these counters are more than per hyperthread (I'm guessing why the restrictions exist). - the tool error message is doing pretty good. In the test I guess we can spot the per-thread error and turn the fail to a skip. It's a shame to bucket things as skip, but it seems easier than listing metrics in the test or spotting particular events. I can do a v2 to add this. Thanks, Ian > Thanks, > Veronika > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v1] perf test: Be more tolerant of metricgroup failures 2024-04-22 15:42 ` Ian Rogers @ 2024-04-25 18:58 ` Arnaldo Carvalho de Melo 2024-04-29 14:49 ` Veronika Molnarova 1 sibling, 0 replies; 5+ messages in thread From: Arnaldo Carvalho de Melo @ 2024-04-25 18:58 UTC (permalink / raw) To: Ian Rogers Cc: Veronika Molnarova, Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter, Athira Rajeev, linux-perf-users, linux-kernel On Mon, Apr 22, 2024 at 08:42:17AM -0700, Ian Rogers wrote: > On Mon, Apr 22, 2024 at 4:51 AM Veronika Molnarova <vmolnaro@redhat.com> wrote: > > > > Hi Ian, > > > > On Wed, Apr 3, 2024 at 6:48 PM Ian Rogers <irogers@google.com> wrote: > >> > >> Previously "set -e" meant any non-zero exit code from perf stat would > >> cause a test failure. As a non-zero exit happens when there aren't > >> sufficient permissions, check for this case and make the exit code > >> 2/skip for it. > >> > >> Signed-off-by: Ian Rogers <irogers@google.com> > >> --- > >> .../perf/tests/shell/stat_all_metricgroups.sh | 28 +++++++++++++++---- > >> 1 file changed, 22 insertions(+), 6 deletions(-) > >> > >> diff --git a/tools/perf/tests/shell/stat_all_metricgroups.sh b/tools/perf/tests/shell/stat_all_metricgroups.sh > >> index 55ef9c9ded2d..d6db192b9f18 100755 > >> --- a/tools/perf/tests/shell/stat_all_metricgroups.sh > >> +++ b/tools/perf/tests/shell/stat_all_metricgroups.sh > >> @@ -1,9 +1,7 @@ > >> -#!/bin/sh > >> +#!/bin/bash > >> # perf all metricgroups test > >> # SPDX-License-Identifier: GPL-2.0 > >> > >> -set -e > >> - > >> ParanoidAndNotRoot() > >> { > >> [ "$(id -u)" != 0 ] && [ "$(cat /proc/sys/kernel/perf_event_paranoid)" -gt $1 ] > >> @@ -14,11 +12,29 @@ if ParanoidAndNotRoot 0 > >> then > >> system_wide_flag="" > >> fi > >> - > >> +err=0 > >> for m in $(perf list --raw-dump metricgroups) > >> do > >> echo "Testing $m" > >> - perf stat -M "$m" $system_wide_flag sleep 0.01 > >> + result=$(perf stat -M "$m" $system_wide_flag sleep 0.01 2>&1) > >> + result_err=$? > >> + if [[ $result_err -gt 0 ]] > >> + then > >> + if [[ "$result" =~ \ > >> + "Access to performance monitoring and observability operations is limited" ]] > >> + then > >> + echo "Permission failure" > >> + echo $result > >> + if [[ $err -eq 0 ]] > >> + then > >> + err=2 # Skip > >> + fi > >> + else > >> + echo "Metric group $m failed" > >> + echo $result > >> + err=1 # Fail > >> + fi > >> + fi > >> done > >> > >> -exit 0 > >> +exit $err > >> -- > >> 2.44.0.478.gd926399ef9-goog > >> > >> > > > > The patch looks good and thanks for taking care of it. > > > > Just wanted to check what is the desired outcome for metric groups > > with events that are invalid in per-thread mode causing the test to fail. > > > > ``` > > $ ./stat_all_metricgroups.sh > > Testing smi > > Metric group smi failed > > Error: Invalid event (msr/smi/u) in per-thread mode, enable system wide with '-a'. > > ``` > > > > Wouldn't it be better if in these cases the test would result in skip instead of fail? > > Hi Veronika, > > I agree that fail isn't best here. I'm wondering: > > - why doesn't msr/smi/ support per-thread mode? Can't we save/restore > the count on a context switch? The implementation is here: > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/arch/x86/events/msr.c?h=perf-tools-next#n234 > There's clearly something going on as pperf appears to have other restrictions: > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/arch/x86/events/msr.c?h=perf-tools-next#n115 > I'm wondering if aggregation is working right if these counters are > more than per hyperthread (I'm guessing why the restrictions exist). > > - the tool error message is doing pretty good. In the test I guess we > can spot the per-thread error and turn the fail to a skip. It's a > shame to bucket things as skip, but it seems easier than listing > metrics in the test or spotting particular events. > > I can do a v2 to add this. Ok, I'll wait for v2 then. - Arnaldo ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v1] perf test: Be more tolerant of metricgroup failures 2024-04-22 15:42 ` Ian Rogers 2024-04-25 18:58 ` Arnaldo Carvalho de Melo @ 2024-04-29 14:49 ` Veronika Molnarova 1 sibling, 0 replies; 5+ messages in thread From: Veronika Molnarova @ 2024-04-29 14:49 UTC (permalink / raw) To: Ian Rogers Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter, Athira Rajeev, linux-perf-users, linux-kernel On 4/22/24 17:42, Ian Rogers wrote: > On Mon, Apr 22, 2024 at 4:51 AM Veronika Molnarova <vmolnaro@redhat.com> wrote: >> >> Hi Ian, >> >> On Wed, Apr 3, 2024 at 6:48 PM Ian Rogers <irogers@google.com> wrote: >>> >>> Previously "set -e" meant any non-zero exit code from perf stat would >>> cause a test failure. As a non-zero exit happens when there aren't >>> sufficient permissions, check for this case and make the exit code >>> 2/skip for it. >>> >>> Signed-off-by: Ian Rogers <irogers@google.com> >>> --- >>> .../perf/tests/shell/stat_all_metricgroups.sh | 28 +++++++++++++++---- >>> 1 file changed, 22 insertions(+), 6 deletions(-) >>> >>> diff --git a/tools/perf/tests/shell/stat_all_metricgroups.sh b/tools/perf/tests/shell/stat_all_metricgroups.sh >>> index 55ef9c9ded2d..d6db192b9f18 100755 >>> --- a/tools/perf/tests/shell/stat_all_metricgroups.sh >>> +++ b/tools/perf/tests/shell/stat_all_metricgroups.sh >>> @@ -1,9 +1,7 @@ >>> -#!/bin/sh >>> +#!/bin/bash >>> # perf all metricgroups test >>> # SPDX-License-Identifier: GPL-2.0 >>> >>> -set -e >>> - >>> ParanoidAndNotRoot() >>> { >>> [ "$(id -u)" != 0 ] && [ "$(cat /proc/sys/kernel/perf_event_paranoid)" -gt $1 ] >>> @@ -14,11 +12,29 @@ if ParanoidAndNotRoot 0 >>> then >>> system_wide_flag="" >>> fi >>> - >>> +err=0 >>> for m in $(perf list --raw-dump metricgroups) >>> do >>> echo "Testing $m" >>> - perf stat -M "$m" $system_wide_flag sleep 0.01 >>> + result=$(perf stat -M "$m" $system_wide_flag sleep 0.01 2>&1) >>> + result_err=$? >>> + if [[ $result_err -gt 0 ]] >>> + then >>> + if [[ "$result" =~ \ >>> + "Access to performance monitoring and observability operations is limited" ]] >>> + then >>> + echo "Permission failure" >>> + echo $result >>> + if [[ $err -eq 0 ]] >>> + then >>> + err=2 # Skip >>> + fi >>> + else >>> + echo "Metric group $m failed" >>> + echo $result >>> + err=1 # Fail >>> + fi >>> + fi >>> done >>> >>> -exit 0 >>> +exit $err >>> -- >>> 2.44.0.478.gd926399ef9-goog >>> >>> >> >> The patch looks good and thanks for taking care of it. >> >> Just wanted to check what is the desired outcome for metric groups >> with events that are invalid in per-thread mode causing the test to fail. >> >> ``` >> $ ./stat_all_metricgroups.sh >> Testing smi >> Metric group smi failed >> Error: Invalid event (msr/smi/u) in per-thread mode, enable system wide with '-a'. >> ``` >> >> Wouldn't it be better if in these cases the test would result in skip instead of fail? > > Hi Veronika, > > I agree that fail isn't best here. I'm wondering: > > - why doesn't msr/smi/ support per-thread mode? Can't we save/restore > the count on a context switch? The implementation is here: > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/arch/x86/events/msr.c?h=perf-tools-next#n234 > There's clearly something going on as pperf appears to have other restrictions: > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/arch/x86/events/msr.c?h=perf-tools-next#n115 > I'm wondering if aggregation is working right if these counters are > more than per hyperthread (I'm guessing why the restrictions exist). > Hi Ian, I am not really sure why the msr/smi/ does not support per-thread mode but I encountered multiple instances of events that aren't supported in per-thread mode during testing. Generally, we cannot be sure that there won't be such event, whether the msr/smi event should be able to handle per-thread mode is another question. Thanks for checking it out and would be great if you could add it to v2. Thanks, Veronika > - the tool error message is doing pretty good. In the test I guess we > can spot the per-thread error and turn the fail to a skip. It's a > shame to bucket things as skip, but it seems easier than listing > metrics in the test or spotting particular events. > > I can do a v2 to add this. > > Thanks, > Ian > >> Thanks, >> Veronika >> > ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-04-29 14:49 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-03 16:48 [PATCH v1] perf test: Be more tolerant of metricgroup failures Ian Rogers
2024-04-04 18:08 ` Namhyung Kim
[not found] ` <CAGa8GX5MhP3FUhafN5QivHaE3Fg+p5MgvTq3SW6MQy4NeZ1sYQ@mail.gmail.com>
2024-04-22 15:42 ` Ian Rogers
2024-04-25 18:58 ` Arnaldo Carvalho de Melo
2024-04-29 14:49 ` Veronika Molnarova
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).