* [PATCH] perf: test: Speed up running brstack test
From: Rob Herring (Arm) @ 2024-12-13 23:13 UTC
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Liang, Kan
Cc: James Clark, Anshuman Khandual, linux-perf-users, linux-kernel
From: James Clark <james.clark@arm.com>
The brstack test runs quite slowly in software models. Part of the reason
is "xargs -n1" is quite inefficient in replacing spaces with newlines.
While that's not noticeable on normal machines, it is on software models.
Use "tr -s ' ' '\n'" instead which can do the same transformation, but is
much faster. For comparison on an M1 Macbook Pro:
$ time seq -s ' ' 10000 | xargs -n1 > /dev/null
real 0m2.729s
user 0m2.009s
sys 0m0.914s
$ time seq -s ' ' 10000 | tr -s ' ' '\n' | grep '.' > /dev/null
real 0m0.002s
user 0m0.001s
sys 0m0.001s
The "grep '.'" is also needed to remove any remaining blank lines.
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
[robh: Drop changing loop iterations on arm64. Squash blank line fix and redo commit msg]
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
Originally part of this series[1], but I've dropped any Arm specifics,
and it stands on its own. No reason this needs to wait on Arm BRBE
support (which I'm working on now). I don't expect to have other changes
to this test related to BRBE anymore.
[1] https://lore.kernel.org/all/20240613061731.3109448-8-anshuman.khandual@arm.com/
tools/perf/tests/shell/test_brstack.sh | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/tests/shell/test_brstack.sh b/tools/perf/tests/shell/test_brstack.sh
index 5f14d0cb013f..e01df7581393 100755
--- a/tools/perf/tests/shell/test_brstack.sh
+++ b/tools/perf/tests/shell/test_brstack.sh
@@ -30,7 +30,7 @@ test_user_branches() {
echo "Testing user branch stack sampling"
perf record -o $TMPDIR/perf.data --branch-filter any,save_type,u -- ${TESTPROG} > /dev/null 2>&1
- perf script -i $TMPDIR/perf.data --fields brstacksym | xargs -n1 > $TMPDIR/perf.script
+ perf script -i $TMPDIR/perf.data --fields brstacksym | tr -s ' ' '\n' > $TMPDIR/perf.script
# example of branch entries:
# brstack_foo+0x14/brstack_bar+0x40/P/-/-/0/CALL
@@ -59,7 +59,7 @@ test_filter() {
echo "Testing branch stack filtering permutation ($test_filter_filter,$test_filter_expect)"
perf record -o $TMPDIR/perf.data --branch-filter $test_filter_filter,save_type,u -- ${TESTPROG} > /dev/null 2>&1
- perf script -i $TMPDIR/perf.data --fields brstack | xargs -n1 > $TMPDIR/perf.script
+ perf script -i $TMPDIR/perf.data --fields brstack | tr -s ' ' '\n' | grep '.' > $TMPDIR/perf.script
# fail if we find any branch type that doesn't match any of the expected ones
# also consider UNKNOWN branch types (-)
--
2.45.2
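As an illustration of the transformation described in the commit message, here is a minimal sketch comparing the two pipelines on made-up input. The function names (sample_input, split_xargs, split_tr) and the sample branch entries are hypothetical and not part of the patch; real "perf script" output may differ in spacing. It shows that "tr -s ' ' '\n'" can leave a blank first line when the input starts with a space, and that "grep '.'" removes it.

```shell
#!/bin/sh
# Hypothetical sample resembling "perf script --fields brstack" output:
# entries separated by runs of spaces, with leading and trailing padding.
sample_input() {
    printf '  foo+0x14/bar+0x40/P  baz+0x8/qux+0x0/M  \n  foo+0x20/bar+0x4/P \n'
}

# Old approach: xargs with no command runs echo once per token,
# so one process is spawned for every branch entry.
split_xargs() {
    xargs -n1
}

# New approach: turn every space into a newline, squeeze runs of
# newlines into one, then drop the empty lines that are left.
split_tr() {
    tr -s ' ' '\n' | grep '.'
}

echo "xargs -n1:"
sample_input | split_xargs

echo "tr -s | grep:"
sample_input | split_tr

# Without grep, the leading spaces become a single empty first line.
echo "tr -s only (note the blank first line):"
sample_input | tr -s ' ' '\n'
```

Both split functions print the same three entries, one per line. The difference is that the tr pipeline uses two long-lived processes, while "xargs -n1" starts a new echo for each token, which is the cost that becomes visible on slow software models.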
* Re: [PATCH] perf: test: Speed up running brstack test
From: James Clark @ 2024-12-17 14:41 UTC
To: Rob Herring (Arm)
Cc: Anshuman Khandual, linux-perf-users, linux-kernel, Peter Zijlstra,
Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang, Kan
On 13/12/2024 11:13 pm, Rob Herring (Arm) wrote:
> From: James Clark <james.clark@arm.com>
>
> The brstack test runs quite slowly in software models. Part of the reason
> is "xargs -n1" is quite inefficient in replacing spaces with newlines.
> While that's not noticeable on normal machines, it is on software models.
> Use "tr -s ' ' '\n'" instead which can do the same transformation, but is
> much faster. For comparison on an M1 Macbook Pro:
>
> $ time seq -s ' ' 10000 | xargs -n1 > /dev/null
>
> real 0m2.729s
> user 0m2.009s
> sys 0m0.914s
> $ time seq -s ' ' 10000 | tr -s ' ' '\n' | grep '.' > /dev/null
>
> real 0m0.002s
> user 0m0.001s
> sys 0m0.001s
>
> The "grep '.'" is also needed to remove any remaining blank lines.
>
> Signed-off-by: James Clark <james.clark@arm.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> [robh: Drop changing loop iterations on arm64. Squash blank line fix and redo commit msg]
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> ---
> Originally part of this series[1], but I've dropped any Arm specifics,
> and it stands on its own. No reason this needs to wait on Arm BRBE
> support (which I'm working on now). I don't expect to have other changes
> to this test related to BRBE anymore.
>
> [1] https://lore.kernel.org/all/20240613061731.3109448-8-anshuman.khandual@arm.com/
>
Reviewed-by: James Clark <james.clark@linaro.org>
> tools/perf/tests/shell/test_brstack.sh | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/tests/shell/test_brstack.sh b/tools/perf/tests/shell/test_brstack.sh
> index 5f14d0cb013f..e01df7581393 100755
> --- a/tools/perf/tests/shell/test_brstack.sh
> +++ b/tools/perf/tests/shell/test_brstack.sh
> @@ -30,7 +30,7 @@ test_user_branches() {
> echo "Testing user branch stack sampling"
>
> perf record -o $TMPDIR/perf.data --branch-filter any,save_type,u -- ${TESTPROG} > /dev/null 2>&1
> - perf script -i $TMPDIR/perf.data --fields brstacksym | xargs -n1 > $TMPDIR/perf.script
> + perf script -i $TMPDIR/perf.data --fields brstacksym | tr -s ' ' '\n' > $TMPDIR/perf.script
>
> # example of branch entries:
> # brstack_foo+0x14/brstack_bar+0x40/P/-/-/0/CALL
> @@ -59,7 +59,7 @@ test_filter() {
> echo "Testing branch stack filtering permutation ($test_filter_filter,$test_filter_expect)"
>
> perf record -o $TMPDIR/perf.data --branch-filter $test_filter_filter,save_type,u -- ${TESTPROG} > /dev/null 2>&1
> - perf script -i $TMPDIR/perf.data --fields brstack | xargs -n1 > $TMPDIR/perf.script
> + perf script -i $TMPDIR/perf.data --fields brstack | tr -s ' ' '\n' | grep '.' > $TMPDIR/perf.script
>
> # fail if we find any branch type that doesn't match any of the expected ones
> # also consider UNKNOWN branch types (-)
* Re: [PATCH] perf: test: Speed up running brstack test
From: Anshuman Khandual @ 2024-12-18 3:32 UTC
To: Rob Herring (Arm), Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang, Kan
Cc: James Clark, linux-perf-users, linux-kernel
On 12/14/24 04:43, Rob Herring (Arm) wrote:
> From: James Clark <james.clark@arm.com>
>
> The brstack test runs quite slowly in software models. Part of the reason
> is "xargs -n1" is quite inefficient in replacing spaces with newlines.
> While that's not noticeable on normal machines, it is on software models.
> Use "tr -s ' ' '\n'" instead which can do the same transformation, but is
> much faster. For comparison on an M1 Macbook Pro:
>
> $ time seq -s ' ' 10000 | xargs -n1 > /dev/null
>
> real 0m2.729s
> user 0m2.009s
> sys 0m0.914s
> $ time seq -s ' ' 10000 | tr -s ' ' '\n' | grep '.' > /dev/null
>
> real 0m0.002s
> user 0m0.001s
> sys 0m0.001s
>
> The "grep '.'" is also needed to remove any remaining blank lines.
>
> Signed-off-by: James Clark <james.clark@arm.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> [robh: Drop changing loop iterations on arm64. Squash blank line fix and redo commit msg]
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
> Originally part of this series[1], but I've dropped any Arm specifics,
> and it stands on its own. No reason this needs to wait on Arm BRBE
> support (which I'm working on now). I don't expect to have other changes
> to this test related to BRBE anymore.
>
> [1] https://lore.kernel.org/all/20240613061731.3109448-8-anshuman.khandual@arm.com/
>
> tools/perf/tests/shell/test_brstack.sh | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/tests/shell/test_brstack.sh b/tools/perf/tests/shell/test_brstack.sh
> index 5f14d0cb013f..e01df7581393 100755
> --- a/tools/perf/tests/shell/test_brstack.sh
> +++ b/tools/perf/tests/shell/test_brstack.sh
> @@ -30,7 +30,7 @@ test_user_branches() {
> echo "Testing user branch stack sampling"
>
> perf record -o $TMPDIR/perf.data --branch-filter any,save_type,u -- ${TESTPROG} > /dev/null 2>&1
> - perf script -i $TMPDIR/perf.data --fields brstacksym | xargs -n1 > $TMPDIR/perf.script
> + perf script -i $TMPDIR/perf.data --fields brstacksym | tr -s ' ' '\n' > $TMPDIR/perf.script
>
> # example of branch entries:
> # brstack_foo+0x14/brstack_bar+0x40/P/-/-/0/CALL
> @@ -59,7 +59,7 @@ test_filter() {
> echo "Testing branch stack filtering permutation ($test_filter_filter,$test_filter_expect)"
>
> perf record -o $TMPDIR/perf.data --branch-filter $test_filter_filter,save_type,u -- ${TESTPROG} > /dev/null 2>&1
> - perf script -i $TMPDIR/perf.data --fields brstack | xargs -n1 > $TMPDIR/perf.script
> + perf script -i $TMPDIR/perf.data --fields brstack | tr -s ' ' '\n' | grep '.' > $TMPDIR/perf.script
>
> # fail if we find any branch type that doesn't match any of the expected ones
> # also consider UNKNOWN branch types (-)
* Re: [PATCH] perf: test: Speed up running brstack test
From: Namhyung Kim @ 2024-12-19 5:48 UTC
To: Rob Herring (Arm)
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Liang, Kan, James Clark, Anshuman Khandual,
linux-perf-users, linux-kernel
On Fri, Dec 13, 2024 at 05:13:12PM -0600, Rob Herring (Arm) wrote:
> From: James Clark <james.clark@arm.com>
>
> The brstack test runs quite slowly in software models. Part of the reason
> is "xargs -n1" is quite inefficient in replacing spaces with newlines.
> While that's not noticeable on normal machines, it is on software models.
> Use "tr -s ' ' '\n'" instead which can do the same transformation, but is
> much faster. For comparison on an M1 Macbook Pro:
>
> $ time seq -s ' ' 10000 | xargs -n1 > /dev/null
>
> real 0m2.729s
> user 0m2.009s
> sys 0m0.914s
> $ time seq -s ' ' 10000 | tr -s ' ' '\n' | grep '.' > /dev/null
>
> real 0m0.002s
> user 0m0.001s
> sys 0m0.001s
>
> The "grep '.'" is also needed to remove any remaining blank lines.
>
> Signed-off-by: James Clark <james.clark@arm.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> [robh: Drop changing loop iterations on arm64. Squash blank line fix and redo commit msg]
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Thanks,
Namhyung
> ---
> Originally part of this series[1], but I've dropped any Arm specifics,
> and it stands on its own. No reason this needs to wait on Arm BRBE
> support (which I'm working on now). I don't expect to have other changes
> to this test related to BRBE anymore.
>
> [1] https://lore.kernel.org/all/20240613061731.3109448-8-anshuman.khandual@arm.com/
>
> tools/perf/tests/shell/test_brstack.sh | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/tests/shell/test_brstack.sh b/tools/perf/tests/shell/test_brstack.sh
> index 5f14d0cb013f..e01df7581393 100755
> --- a/tools/perf/tests/shell/test_brstack.sh
> +++ b/tools/perf/tests/shell/test_brstack.sh
> @@ -30,7 +30,7 @@ test_user_branches() {
> echo "Testing user branch stack sampling"
>
> perf record -o $TMPDIR/perf.data --branch-filter any,save_type,u -- ${TESTPROG} > /dev/null 2>&1
> - perf script -i $TMPDIR/perf.data --fields brstacksym | xargs -n1 > $TMPDIR/perf.script
> + perf script -i $TMPDIR/perf.data --fields brstacksym | tr -s ' ' '\n' > $TMPDIR/perf.script
>
> # example of branch entries:
> # brstack_foo+0x14/brstack_bar+0x40/P/-/-/0/CALL
> @@ -59,7 +59,7 @@ test_filter() {
> echo "Testing branch stack filtering permutation ($test_filter_filter,$test_filter_expect)"
>
> perf record -o $TMPDIR/perf.data --branch-filter $test_filter_filter,save_type,u -- ${TESTPROG} > /dev/null 2>&1
> - perf script -i $TMPDIR/perf.data --fields brstack | xargs -n1 > $TMPDIR/perf.script
> + perf script -i $TMPDIR/perf.data --fields brstack | tr -s ' ' '\n' | grep '.' > $TMPDIR/perf.script
>
> # fail if we find any branch type that doesn't match any of the expected ones
> # also consider UNKNOWN branch types (-)
> --
> 2.45.2
>
* Re: [PATCH] perf: test: Speed up running brstack test
From: Rob Herring @ 2025-01-13 14:25 UTC
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Liang, Kan
Cc: James Clark, Anshuman Khandual, linux-perf-users, linux-kernel
On Fri, Dec 13, 2024 at 5:19 PM Rob Herring (Arm) <robh@kernel.org> wrote:
>
> From: James Clark <james.clark@arm.com>
>
> The brstack test runs quite slowly in software models. Part of the reason
> is "xargs -n1" is quite inefficient in replacing spaces with newlines.
> While that's not noticeable on normal machines, it is on software models.
> Use "tr -s ' ' '\n'" instead which can do the same transformation, but is
> much faster. For comparison on an M1 Macbook Pro:
>
> $ time seq -s ' ' 10000 | xargs -n1 > /dev/null
>
> real 0m2.729s
> user 0m2.009s
> sys 0m0.914s
> $ time seq -s ' ' 10000 | tr -s ' ' '\n' | grep '.' > /dev/null
>
> real 0m0.002s
> user 0m0.001s
> sys 0m0.001s
>
> The "grep '.'" is also needed to remove any remaining blank lines.
>
> Signed-off-by: James Clark <james.clark@arm.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> [robh: Drop changing loop iterations on arm64. Squash blank line fix and redo commit msg]
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> ---
> Originally part of this series[1], but I've dropped any Arm specifics,
> and it stands on its own. No reason this needs to wait on Arm BRBE
> support (which I'm working on now). I don't expect to have other changes
> to this test related to BRBE anymore.
>
> [1] https://lore.kernel.org/all/20240613061731.3109448-8-anshuman.khandual@arm.com/
>
> tools/perf/tests/shell/test_brstack.sh | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
Ping!
>
> diff --git a/tools/perf/tests/shell/test_brstack.sh b/tools/perf/tests/shell/test_brstack.sh
> index 5f14d0cb013f..e01df7581393 100755
> --- a/tools/perf/tests/shell/test_brstack.sh
> +++ b/tools/perf/tests/shell/test_brstack.sh
> @@ -30,7 +30,7 @@ test_user_branches() {
> echo "Testing user branch stack sampling"
>
> perf record -o $TMPDIR/perf.data --branch-filter any,save_type,u -- ${TESTPROG} > /dev/null 2>&1
> - perf script -i $TMPDIR/perf.data --fields brstacksym | xargs -n1 > $TMPDIR/perf.script
> + perf script -i $TMPDIR/perf.data --fields brstacksym | tr -s ' ' '\n' > $TMPDIR/perf.script
>
> # example of branch entries:
> # brstack_foo+0x14/brstack_bar+0x40/P/-/-/0/CALL
> @@ -59,7 +59,7 @@ test_filter() {
> echo "Testing branch stack filtering permutation ($test_filter_filter,$test_filter_expect)"
>
> perf record -o $TMPDIR/perf.data --branch-filter $test_filter_filter,save_type,u -- ${TESTPROG} > /dev/null 2>&1
> - perf script -i $TMPDIR/perf.data --fields brstack | xargs -n1 > $TMPDIR/perf.script
> + perf script -i $TMPDIR/perf.data --fields brstack | tr -s ' ' '\n' | grep '.' > $TMPDIR/perf.script
>
> # fail if we find any branch type that doesn't match any of the expected ones
> # also consider UNKNOWN branch types (-)
> --
> 2.45.2
>
* Re: [PATCH] perf: test: Speed up running brstack test
From: Arnaldo Carvalho de Melo @ 2025-01-13 14:59 UTC
To: Rob Herring
Cc: Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang, Kan, James Clark, Anshuman Khandual, linux-perf-users,
linux-kernel
On Mon, Jan 13, 2025 at 08:25:45AM -0600, Rob Herring wrote:
> On Fri, Dec 13, 2024 at 5:19 PM Rob Herring (Arm) <robh@kernel.org> wrote:
> >
> > From: James Clark <james.clark@arm.com>
> >
> > The brstack test runs quite slowly in software models. Part of the reason
> > is "xargs -n1" is quite inefficient in replacing spaces with newlines.
> > While that's not noticeable on normal machines, it is on software models.
> > Use "tr -s ' ' '\n'" instead which can do the same transformation, but is
> > much faster. For comparison on an M1 Macbook Pro:
> >
> > $ time seq -s ' ' 10000 | xargs -n1 > /dev/null
> >
> > real 0m2.729s
> > user 0m2.009s
> > sys 0m0.914s
> > $ time seq -s ' ' 10000 | tr -s ' ' '\n' | grep '.' > /dev/null
> >
> > real 0m0.002s
> > user 0m0.001s
> > sys 0m0.001s
> >
> > The "grep '.'" is also needed to remove any remaining blank lines.
> >
> > Signed-off-by: James Clark <james.clark@arm.com>
> > Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> > [robh: Drop changing loop iterations on arm64. Squash blank line fix and redo commit msg]
> > Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> > ---
> > Originally part of this series[1], but I've dropped any Arm specifics,
> > and it stands on its own. No reason this needs to wait on Arm BRBE
> > support (which I'm working on now). I don't expect to have other changes
> > to this test related to BRBE anymore.
> >
> > [1] https://lore.kernel.org/all/20240613061731.3109448-8-anshuman.khandual@arm.com/
> >
> > tools/perf/tests/shell/test_brstack.sh | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
>
> Ping!
Thanks, applied.
- Arnaldo