* [PATCH v1 1/2] perf callchain: Fix stitch LBR memory leaks
@ 2024-08-08 5:46 Ian Rogers
2024-08-08 5:46 ` [PATCH v1 2/2] perf test: Add set of perf record LBR tests Ian Rogers
2024-08-08 15:59 ` [PATCH v1 1/2] perf callchain: Fix stitch LBR memory leaks Liang, Kan
0 siblings, 2 replies; 7+ messages in thread
From: Ian Rogers @ 2024-08-08 5:46 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Anne Macedo, Changbin Du,
Andi Kleen, linux-kernel, linux-perf-users
The callchain_cursor_node has a map_symbol whose maps and map
variables are reference counted. Ensure these values use a _get
routine to increment the refernece counts and use map_symbol__exit to
release the reference counts. Do similar for thread's prev_lbr_cursor,
but save the size of the prev_lbr_cursor array so that it may be
iterated.
Ensure that when stitch_nodes are placed on the free list the
map_symbols are exited. Fix resolve_lbr_callchain_sample by replacing
list_replace_init to list_splice_init, so the whole list is moved and
nodes aren't leaked.
A reproduction of the memory leaks is possible with a leak sanitizer
build in the perf report command of:
```
$ perf record -e cycles --call-graph lbr perf test -w thloop
$ perf report --stitch-lbr
```
Fixes: ff165628d726 ("perf callchain: Stitch LBR call stack")
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/machine.c | 17 +++++++++++++++--
tools/perf/util/thread.c | 4 ++++
tools/perf/util/thread.h | 1 +
3 files changed, 20 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 8477edefc299..706be5e4a076 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2270,8 +2270,12 @@ static void save_lbr_cursor_node(struct thread *thread,
cursor->curr = cursor->first;
else
cursor->curr = cursor->curr->next;
+
+ map_symbol__exit(&lbr_stitch->prev_lbr_cursor[idx].ms);
memcpy(&lbr_stitch->prev_lbr_cursor[idx], cursor->curr,
sizeof(struct callchain_cursor_node));
+ lbr_stitch->prev_lbr_cursor[idx].ms.maps = maps__get(cursor->curr->ms.maps);
+ lbr_stitch->prev_lbr_cursor[idx].ms.map = map__get(cursor->curr->ms.map);
lbr_stitch->prev_lbr_cursor[idx].valid = true;
cursor->pos++;
@@ -2482,6 +2486,9 @@ static bool has_stitched_lbr(struct thread *thread,
memcpy(&stitch_node->cursor, &lbr_stitch->prev_lbr_cursor[i],
sizeof(struct callchain_cursor_node));
+ stitch_node->cursor.ms.maps = maps__get(lbr_stitch->prev_lbr_cursor[i].ms.maps);
+ stitch_node->cursor.ms.map = map__get(lbr_stitch->prev_lbr_cursor[i].ms.map);
+
if (callee)
list_add(&stitch_node->node, &lbr_stitch->lists);
else
@@ -2505,6 +2512,8 @@ static bool alloc_lbr_stitch(struct thread *thread, unsigned int max_lbr)
if (!thread__lbr_stitch(thread)->prev_lbr_cursor)
goto free_lbr_stitch;
+ thread__lbr_stitch(thread)->prev_lbr_cursor_size = max_lbr + 1;
+
INIT_LIST_HEAD(&thread__lbr_stitch(thread)->lists);
INIT_LIST_HEAD(&thread__lbr_stitch(thread)->free_lists);
@@ -2560,8 +2569,12 @@ static int resolve_lbr_callchain_sample(struct thread *thread,
max_lbr, callee);
if (!stitched_lbr && !list_empty(&lbr_stitch->lists)) {
- list_replace_init(&lbr_stitch->lists,
- &lbr_stitch->free_lists);
+ struct stitch_list *stitch_node;
+
+ list_for_each_entry(stitch_node, &lbr_stitch->lists, node)
+ map_symbol__exit(&stitch_node->cursor.ms);
+
+ list_splice_init(&lbr_stitch->lists, &lbr_stitch->free_lists);
}
memcpy(&lbr_stitch->prev_sample, sample, sizeof(*sample));
}
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 87c59aa9fe38..0ffdd52d86d7 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -476,6 +476,7 @@ void thread__free_stitch_list(struct thread *thread)
return;
list_for_each_entry_safe(pos, tmp, &lbr_stitch->lists, node) {
+ map_symbol__exit(&pos->cursor.ms);
list_del_init(&pos->node);
free(pos);
}
@@ -485,6 +486,9 @@ void thread__free_stitch_list(struct thread *thread)
free(pos);
}
+ for (unsigned int i = 0 ; i < lbr_stitch->prev_lbr_cursor_size; i++)
+ map_symbol__exit(&lbr_stitch->prev_lbr_cursor[i].ms);
+
zfree(&lbr_stitch->prev_lbr_cursor);
free(thread__lbr_stitch(thread));
thread__set_lbr_stitch(thread, NULL);
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 8b4a3c69bad1..6cbf6eb2812e 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -26,6 +26,7 @@ struct lbr_stitch {
struct list_head free_lists;
struct perf_sample prev_sample;
struct callchain_cursor_node *prev_lbr_cursor;
+ unsigned int prev_lbr_cursor_size;
};
DECLARE_RC_STRUCT(thread) {
--
2.46.0.rc2.264.g509ed76dc8-goog
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v1 2/2] perf test: Add set of perf record LBR tests
2024-08-08 5:46 [PATCH v1 1/2] perf callchain: Fix stitch LBR memory leaks Ian Rogers
@ 2024-08-08 5:46 ` Ian Rogers
2024-08-08 14:34 ` Arnaldo Carvalho de Melo
2024-08-08 15:59 ` [PATCH v1 1/2] perf callchain: Fix stitch LBR memory leaks Liang, Kan
1 sibling, 1 reply; 7+ messages in thread
From: Ian Rogers @ 2024-08-08 5:46 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Anne Macedo, Changbin Du,
Andi Kleen, linux-kernel, linux-perf-users
Adds coverage for LBR operations and LBR callgraph.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/tests/shell/record_lbr.sh | 161 +++++++++++++++++++++++++++
1 file changed, 161 insertions(+)
create mode 100755 tools/perf/tests/shell/record_lbr.sh
diff --git a/tools/perf/tests/shell/record_lbr.sh b/tools/perf/tests/shell/record_lbr.sh
new file mode 100755
index 000000000000..baf168d0ddbb
--- /dev/null
+++ b/tools/perf/tests/shell/record_lbr.sh
@@ -0,0 +1,161 @@
+#!/bin/bash
+# perf record LBR tests
+# SPDX-License-Identifier: GPL-2.0
+
+set -e
+
+if [ ! -f /sys/devices/cpu/caps/branches ]
+then
+ echo "Skip: only x86 CPUs support LBR"
+ exit 2
+fi
+
+err=0
+perfdata=$(mktemp /tmp/__perf_test.perf.data.XXXXX)
+
+cleanup() {
+ rm -rf "${perfdata}"
+ rm -rf "${perfdata}".old
+ rm -rf "${perfdata}".txt
+
+ trap - EXIT TERM INT
+}
+
+trap_cleanup() {
+ cleanup
+ exit 1
+}
+trap trap_cleanup EXIT TERM INT
+
+
+lbr_callgraph_test() {
+ test="LBR callgraph"
+
+ echo "$test"
+ if ! perf record -e cycles --call-graph lbr -o "${perfdata}" perf test -w thloop
+ then
+ echo "$test [Failed support missing]"
+ if [ $err -eq 0 ]
+ then
+ err=2
+ fi
+ return
+ fi
+
+ if ! perf report --stitch-lbr -i "${perfdata}" > "${perfdata}".txt
+ then
+ cat "${perfdata}".txt
+ echo "$test [Failed in perf report]"
+ err=1
+ return
+ fi
+
+ echo "$test [Success]"
+}
+
+lbr_test() {
+ local branch_flags=$1
+ local test="LBR $2 test"
+ local threshold=$3
+ local out
+ local sam_nr
+ local bs_nr
+ local zero_nr
+ local r
+
+ echo "$test"
+ if ! perf record -e cycles $branch_flags -o "${perfdata}" perf test -w thloop
+ then
+ echo "$test [Failed support missing]"
+ perf record -e cycles $branch_flags -o "${perfdata}" perf test -w thloop || true
+ if [ $err -eq 0 ]
+ then
+ err=2
+ fi
+ return
+ fi
+
+ out=$(perf report -D -i "${perfdata}" 2> /dev/null | grep -A1 'PERF_RECORD_SAMPLE')
+ sam_nr=$(echo "$out" | grep -c 'PERF_RECORD_SAMPLE' || true)
+ if [ $sam_nr -eq 0 ]
+ then
+ echo "$test [Failed no samples captured]"
+ err=1
+ return
+ fi
+ echo "$test: $sam_nr samples"
+
+ bs_nr=$(echo "$out" | grep -c 'branch stack: nr:' || true)
+ if [ $sam_nr -ne $bs_nr ]
+ then
+ echo "$test [Failed samples missing branch stacks]"
+ err=1
+ return
+ fi
+
+ zero_nr=$(echo "$out" | grep -c 'branch stack: nr:0' || true)
+ r=$(($zero_nr * 100 / $bs_nr))
+ if [ $r -gt $threshold ]; then
+ echo "$test [Failed empty br stack ratio exceed $threshold%: $r%]"
+ err=1
+ return
+ fi
+
+ echo "$test [Success]"
+}
+
+parallel_lbr_test() {
+ err=0
+ perfdata=$(mktemp /tmp/__perf_test.perf.data.XXXXX)
+ lbr_test "$1" "$2" "$3"
+ cleanup
+ exit $err
+}
+
+lbr_callgraph_test
+
+# Sequential
+lbr_test "-b" "any branch" 2
+lbr_test "-j any_call" "any call" 2
+lbr_test "-j any_ret" "any ret" 2
+lbr_test "-j ind_call" "any indirect call" 2
+lbr_test "-j ind_jmp" "any indirect jump" 100
+lbr_test "-j call" "direct calls" 2
+lbr_test "-j ind_call,u" "any indirect user call" 100
+lbr_test "-a -b" "system wide any branch" 2
+lbr_test "-a -j any_call" "system wide any call" 2
+
+# Parallel
+parallel_lbr_test "-b" "parallel any branch" 100 &
+pid1=$!
+parallel_lbr_test "-j any_call" "parallel any call" 100 &
+pid2=$!
+parallel_lbr_test "-j any_ret" "parallel any ret" 100 &
+pid3=$!
+parallel_lbr_test "-j ind_call" "parallel any indirect call" 100 &
+pid4=$!
+parallel_lbr_test "-j ind_jmp" "parallel any indirect jump" 100 &
+pid5=$!
+parallel_lbr_test "-j call" "parallel direct calls" 100 &
+pid6=$!
+parallel_lbr_test "-j ind_call,u" "parallel any indirect user call" 100 &
+pid7=$!
+parallel_lbr_test "-a -b" "parallel system wide any branch" 100 &
+pid8=$!
+parallel_lbr_test "-a -j any_call" "parallel system wide any call" 100 &
+pid9=$!
+
+for pid in $pid1 $pid2 $pid3 $pid4 $pid5 $pid6 $pid7 $pid8 $pid9
+do
+ set +e
+ wait $pid
+ child_err=$?
+ set -e
+ if ([ $err -eq 2 ] && [ $child_err -eq 1 ]) || [ $err -eq 0 ]
+ then
+ err=$child_err
+ fi
+done
+
+cleanup
+exit $err
--
2.46.0.rc2.264.g509ed76dc8-goog
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v1 2/2] perf test: Add set of perf record LBR tests
2024-08-08 5:46 ` [PATCH v1 2/2] perf test: Add set of perf record LBR tests Ian Rogers
@ 2024-08-08 14:34 ` Arnaldo Carvalho de Melo
2024-08-08 16:00 ` Ian Rogers
2024-08-08 16:01 ` Liang, Kan
0 siblings, 2 replies; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-08-08 14:34 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
Anne Macedo, Changbin Du, Andi Kleen, linux-kernel,
linux-perf-users
On Wed, Aug 07, 2024 at 10:46:44PM -0700, Ian Rogers wrote:
> Adds coverage for LBR operations and LBR callgraph.
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
> tools/perf/tests/shell/record_lbr.sh | 161 +++++++++++++++++++++++++++
> 1 file changed, 161 insertions(+)
> create mode 100755 tools/perf/tests/shell/record_lbr.sh
>
> diff --git a/tools/perf/tests/shell/record_lbr.sh b/tools/perf/tests/shell/record_lbr.sh
> new file mode 100755
> index 000000000000..baf168d0ddbb
> --- /dev/null
> +++ b/tools/perf/tests/shell/record_lbr.sh
> @@ -0,0 +1,161 @@
> +#!/bin/bash
> +# perf record LBR tests
> +# SPDX-License-Identifier: GPL-2.0
> +
> +set -e
> +
> +if [ ! -f /sys/devices/cpu/caps/branches ]
> +then
> + echo "Skip: only x86 CPUs support LBR"
> + exit 2
> +fi
root@x1:~# ls -la /sys/devices/cpu*/caps/branches
-r--r--r--. 1 root root 4096 Jun 4 16:07 /sys/devices/cpu_atom/caps/branches
-r--r--r--. 1 root root 4096 Aug 8 11:21 /sys/devices/cpu_core/caps/branches
root@x1:~#
I'm getting:
root@x1:~# perf test -vvvv LBR
97: perf record LBR tests:
--- start ---
test child forked, pid 2033388
Skip: only x86 CPUs support LBR
---- end(-2) ----
97: perf record LBR tests : Skip
root@x1:~#
So I added the following follow-up patch, ack?
- Arnaldo
From f6ea332dd25cc62b9493b2d40040c2fb35253d9e Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Thu, 8 Aug 2024 11:26:13 -0300
Subject: [PATCH 1/1] perf test shell lbr: Support hybrid x86 systems too
Running on a:
root@x1:~# grep 'model name' -m1 /proc/cpuinfo
model name : 13th Gen Intel(R) Core(TM) i7-1365U
root@x1:~#
It skips all the tests with:
root@x1:~# perf test -vvvv LBR
97: perf record LBR tests:
--- start ---
test child forked, pid 2033388
Skip: only x86 CPUs support LBR
---- end(-2) ----
97: perf record LBR tests : Skip
root@x1:~#
Because the test checks for the /sys/devices/cpu/caps/branches file,
that isn't present as we have instead:
root@x1:~# ls -la /sys/devices/cpu*/caps/branches
-r--r--r--. 1 root root 4096 Aug 8 11:22 /sys/devices/cpu_atom/caps/branches
-r--r--r--. 1 root root 4096 Aug 8 11:21 /sys/devices/cpu_core/caps/branches
root@x1:~#
If we check as well for one of those,
/sys/devices/cpu_core/caps/branches, then we don't skip the tests and
all are run on these x86 Intel Hybrid systems as well, passing all of
them:
root@x1:~# perf test -vvvv LBR
97: perf record LBR tests:
--- start ---
test child forked, pid 2034956
LBR callgraph
[ perf record: Woken up 5 times to write data ]
[ perf record: Captured and wrote 1.812 MB /tmp/__perf_test.perf.data.B2HvQ (8114 samples) ]
LBR callgraph [Success]
LBR any branch test
[ perf record: Woken up 25 times to write data ]
[ perf record: Captured and wrote 6.382 MB /tmp/__perf_test.perf.data.B2HvQ (8071 samples) ]
LBR any branch test: 8071 samples
LBR any branch test [Success]
LBR any call test
[ perf record: Woken up 23 times to write data ]
[ perf record: Captured and wrote 6.208 MB /tmp/__perf_test.perf.data.B2HvQ (8092 samples) ]
LBR any call test: 8092 samples
LBR any call test [Success]
LBR any ret test
[ perf record: Woken up 24 times to write data ]
[ perf record: Captured and wrote 6.396 MB /tmp/__perf_test.perf.data.B2HvQ (8093 samples) ]
LBR any ret test: 8093 samples
LBR any ret test [Success]
LBR any indirect call test
[ perf record: Woken up 25 times to write data ]
[ perf record: Captured and wrote 6.344 MB /tmp/__perf_test.perf.data.B2HvQ (8067 samples) ]
LBR any indirect call test: 8067 samples
LBR any indirect call test [Success]
LBR any indirect jump test
[ perf record: Woken up 12 times to write data ]
[ perf record: Captured and wrote 3.073 MB /tmp/__perf_test.perf.data.B2HvQ (8061 samples) ]
LBR any indirect jump test: 8061 samples
LBR any indirect jump test [Success]
LBR direct calls test
[ perf record: Woken up 25 times to write data ]
[ perf record: Captured and wrote 6.380 MB /tmp/__perf_test.perf.data.B2HvQ (8076 samples) ]
LBR direct calls test: 8076 samples
LBR direct calls test [Success]
LBR any indirect user call test
[ perf record: Woken up 5 times to write data ]
[ perf record: Captured and wrote 1.597 MB /tmp/__perf_test.perf.data.B2HvQ (8079 samples) ]
LBR any indirect user call test: 8079 samples
LBR any indirect user call test [Success]
LBR system wide any branch test
[ perf record: Woken up 26 times to write data ]
[ perf record: Captured and wrote 9.088 MB /tmp/__perf_test.perf.data.B2HvQ (9209 samples) ]
LBR system wide any branch test: 9209 samples
LBR system wide any branch test [Success]
LBR system wide any call test
[ perf record: Woken up 25 times to write data ]
[ perf record: Captured and wrote 8.945 MB /tmp/__perf_test.perf.data.B2HvQ (9333 samples) ]
LBR system wide any call test: 9333 samples
LBR system wide any call test [Success]
LBR parallel any branch test
LBR parallel any call test
LBR parallel any ret test
LBR parallel any indirect call test
LBR parallel any indirect jump test
LBR parallel direct calls test
LBR parallel system wide any branch test
LBR parallel any indirect user call test
LBR parallel system wide any call test
[ perf record: Woken up 9 times to write data ]
[ perf record: Woken up 51 times to write data ]
[ perf record: Woken up 1 times to write data ]
[ perf record: Woken up 5 times to write data ]
[ perf record: Woken up 559 times to write data ]
[ perf record: Woken up 14 times to write data ]
[ perf record: Woken up 17 times to write data ]
[ perf record: Woken up 1 times to write data ]
[ perf record: Woken up 11 times to write data ]
[ perf record: Captured and wrote 0.150 MB /tmp/__perf_test.perf.data.lANpR (1909 samples) ]
[ perf record: Captured and wrote 2.371 MB /tmp/__perf_test.perf.data.Olum8 (3033 samples) ]
[ perf record: Captured and wrote 1.230 MB /tmp/__perf_test.perf.data.njfJ8 (1742 samples) ]
[ perf record: Captured and wrote 5.554 MB /tmp/__perf_test.perf.data.4ZTrj (29662 samples) ]
[ perf record: Captured and wrote 19.906 MB /tmp/__perf_test.perf.data.dlGQt (29576 samples) ]
[ perf record: Captured and wrote 0.289 MB /tmp/__perf_test.perf.data.CAT7y (4311 samples) ]
[ perf record: Captured and wrote 3.129 MB /tmp/__perf_test.perf.data.diuKG (3971 samples) ]
LBR parallel any indirect user call test: 1909 samples
[ perf record: Captured and wrote 4.858 MB /tmp/__perf_test.perf.data.sVjtN (6130 samples) ]
LBR parallel any indirect user call test [Success]
[ perf record: Captured and wrote 3.669 MB /tmp/__perf_test.perf.data.AJtNI (4827 samples) ]
LBR parallel any indirect jump test: 4311 samples
LBR parallel any indirect jump test [Success]
LBR parallel direct calls test: 3033 samples
LBR parallel direct calls test [Success]
LBR parallel any indirect call test: 1742 samples
LBR parallel any indirect call test [Success]
LBR parallel any call test: 4827 samples
LBR parallel any call test [Success]
LBR parallel any branch test: 6130 samples
LBR parallel any branch test [Success]
LBR parallel system wide any branch test: 29662 samples
LBR parallel any ret test: 3971 samples
LBR parallel any ret test [Success]
LBR parallel system wide any branch test [Success]
LBR parallel system wide any call test: 29576 samples
LBR parallel system wide any call test [Success]
---- end(0) ----
97: perf record LBR tests : Ok
root@x1:~#
Cc: Adrian Hunter <adrian.hunter@intel.com>,
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>,
Cc: Changbin Du <changbin.du@huawei.com>,
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>,
Cc: Jiri Olsa <jolsa@kernel.org>,
Cc: Kan Liang <kan.liang@linux.intel.com>,
Cc: Mark Rutland <mark.rutland@arm.com>,
Cc: Namhyung Kim <namhyung@kernel.org>,
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/tests/shell/record_lbr.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/tests/shell/record_lbr.sh b/tools/perf/tests/shell/record_lbr.sh
index baf168d0ddbbf447..ead87b731898d70b 100755
--- a/tools/perf/tests/shell/record_lbr.sh
+++ b/tools/perf/tests/shell/record_lbr.sh
@@ -4,7 +4,7 @@
set -e
-if [ ! -f /sys/devices/cpu/caps/branches ]
+if [ ! -f /sys/devices/cpu/caps/branches -a ! -f /sys/devices/cpu_core/caps/branches ]
then
echo "Skip: only x86 CPUs support LBR"
exit 2
--
2.44.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v1 1/2] perf callchain: Fix stitch LBR memory leaks
2024-08-08 5:46 [PATCH v1 1/2] perf callchain: Fix stitch LBR memory leaks Ian Rogers
2024-08-08 5:46 ` [PATCH v1 2/2] perf test: Add set of perf record LBR tests Ian Rogers
@ 2024-08-08 15:59 ` Liang, Kan
1 sibling, 0 replies; 7+ messages in thread
From: Liang, Kan @ 2024-08-08 15:59 UTC (permalink / raw)
To: Ian Rogers, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Anne Macedo, Changbin Du, Andi Kleen, linux-kernel,
linux-perf-users
On 2024-08-08 1:46 a.m., Ian Rogers wrote:
> The callchain_cursor_node has a map_symbol whose maps and map
> variables are reference counted. Ensure these values use a _get
> routine to increment the refernece counts and use map_symbol__exit to
> release the reference counts. Do similar for thread's prev_lbr_cursor,
> but save the size of the prev_lbr_cursor array so that it may be
> iterated.
>
> Ensure that when stitch_nodes are placed on the free list the
> map_symbols are exited. Fix resolve_lbr_callchain_sample by replacing
> list_replace_init to list_splice_init, so the whole list is moved and
> nodes aren't leaked.
>
> A reproduction of the memory leaks is possible with a leak sanitizer
> build in the perf report command of:
> ```
> $ perf record -e cycles --call-graph lbr perf test -w thloop
> $ perf report --stitch-lbr
> ```
>
> Fixes: ff165628d726 ("perf callchain: Stitch LBR call stack")
> Signed-off-by: Ian Rogers <irogers@google.com>
Thanks Ian.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Thanks,
Kan
> ---
> tools/perf/util/machine.c | 17 +++++++++++++++--
> tools/perf/util/thread.c | 4 ++++
> tools/perf/util/thread.h | 1 +
> 3 files changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 8477edefc299..706be5e4a076 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -2270,8 +2270,12 @@ static void save_lbr_cursor_node(struct thread *thread,
> cursor->curr = cursor->first;
> else
> cursor->curr = cursor->curr->next;
> +
> + map_symbol__exit(&lbr_stitch->prev_lbr_cursor[idx].ms);
> memcpy(&lbr_stitch->prev_lbr_cursor[idx], cursor->curr,
> sizeof(struct callchain_cursor_node));
> + lbr_stitch->prev_lbr_cursor[idx].ms.maps = maps__get(cursor->curr->ms.maps);
> + lbr_stitch->prev_lbr_cursor[idx].ms.map = map__get(cursor->curr->ms.map);
>
> lbr_stitch->prev_lbr_cursor[idx].valid = true;
> cursor->pos++;
> @@ -2482,6 +2486,9 @@ static bool has_stitched_lbr(struct thread *thread,
> memcpy(&stitch_node->cursor, &lbr_stitch->prev_lbr_cursor[i],
> sizeof(struct callchain_cursor_node));
>
> + stitch_node->cursor.ms.maps = maps__get(lbr_stitch->prev_lbr_cursor[i].ms.maps);
> + stitch_node->cursor.ms.map = map__get(lbr_stitch->prev_lbr_cursor[i].ms.map);
> +
> if (callee)
> list_add(&stitch_node->node, &lbr_stitch->lists);
> else
> @@ -2505,6 +2512,8 @@ static bool alloc_lbr_stitch(struct thread *thread, unsigned int max_lbr)
> if (!thread__lbr_stitch(thread)->prev_lbr_cursor)
> goto free_lbr_stitch;
>
> + thread__lbr_stitch(thread)->prev_lbr_cursor_size = max_lbr + 1;
> +
> INIT_LIST_HEAD(&thread__lbr_stitch(thread)->lists);
> INIT_LIST_HEAD(&thread__lbr_stitch(thread)->free_lists);
>
> @@ -2560,8 +2569,12 @@ static int resolve_lbr_callchain_sample(struct thread *thread,
> max_lbr, callee);
>
> if (!stitched_lbr && !list_empty(&lbr_stitch->lists)) {
> - list_replace_init(&lbr_stitch->lists,
> - &lbr_stitch->free_lists);
> + struct stitch_list *stitch_node;
> +
> + list_for_each_entry(stitch_node, &lbr_stitch->lists, node)
> + map_symbol__exit(&stitch_node->cursor.ms);
> +
> + list_splice_init(&lbr_stitch->lists, &lbr_stitch->free_lists);
> }
> memcpy(&lbr_stitch->prev_sample, sample, sizeof(*sample));
> }
> diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
> index 87c59aa9fe38..0ffdd52d86d7 100644
> --- a/tools/perf/util/thread.c
> +++ b/tools/perf/util/thread.c
> @@ -476,6 +476,7 @@ void thread__free_stitch_list(struct thread *thread)
> return;
>
> list_for_each_entry_safe(pos, tmp, &lbr_stitch->lists, node) {
> + map_symbol__exit(&pos->cursor.ms);
> list_del_init(&pos->node);
> free(pos);
> }
> @@ -485,6 +486,9 @@ void thread__free_stitch_list(struct thread *thread)
> free(pos);
> }
>
> + for (unsigned int i = 0 ; i < lbr_stitch->prev_lbr_cursor_size; i++)
> + map_symbol__exit(&lbr_stitch->prev_lbr_cursor[i].ms);
> +
> zfree(&lbr_stitch->prev_lbr_cursor);
> free(thread__lbr_stitch(thread));
> thread__set_lbr_stitch(thread, NULL);
> diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
> index 8b4a3c69bad1..6cbf6eb2812e 100644
> --- a/tools/perf/util/thread.h
> +++ b/tools/perf/util/thread.h
> @@ -26,6 +26,7 @@ struct lbr_stitch {
> struct list_head free_lists;
> struct perf_sample prev_sample;
> struct callchain_cursor_node *prev_lbr_cursor;
> + unsigned int prev_lbr_cursor_size;
> };
>
> DECLARE_RC_STRUCT(thread) {
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v1 2/2] perf test: Add set of perf record LBR tests
2024-08-08 14:34 ` Arnaldo Carvalho de Melo
@ 2024-08-08 16:00 ` Ian Rogers
2024-08-08 20:28 ` Arnaldo Carvalho de Melo
2024-08-08 16:01 ` Liang, Kan
1 sibling, 1 reply; 7+ messages in thread
From: Ian Rogers @ 2024-08-08 16:00 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
Anne Macedo, Changbin Du, Andi Kleen, linux-kernel,
linux-perf-users
On Thu, Aug 8, 2024 at 7:34 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
[snip]
> -if [ ! -f /sys/devices/cpu/caps/branches ]
> +if [ ! -f /sys/devices/cpu/caps/branches -a ! -f /sys/devices/cpu_core/caps/branches ]
Adding this lgtm, thanks for testing.
Ian
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v1 2/2] perf test: Add set of perf record LBR tests
2024-08-08 14:34 ` Arnaldo Carvalho de Melo
2024-08-08 16:00 ` Ian Rogers
@ 2024-08-08 16:01 ` Liang, Kan
1 sibling, 0 replies; 7+ messages in thread
From: Liang, Kan @ 2024-08-08 16:01 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Anne Macedo,
Changbin Du, Andi Kleen, linux-kernel, linux-perf-users
On 2024-08-08 10:34 a.m., Arnaldo Carvalho de Melo wrote:
> On Wed, Aug 07, 2024 at 10:46:44PM -0700, Ian Rogers wrote:
>> Adds coverage for LBR operations and LBR callgraph.
>>
>> Signed-off-by: Ian Rogers <irogers@google.com>
>> ---
>> tools/perf/tests/shell/record_lbr.sh | 161 +++++++++++++++++++++++++++
>> 1 file changed, 161 insertions(+)
>> create mode 100755 tools/perf/tests/shell/record_lbr.sh
>>
>> diff --git a/tools/perf/tests/shell/record_lbr.sh b/tools/perf/tests/shell/record_lbr.sh
>> new file mode 100755
>> index 000000000000..baf168d0ddbb
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/record_lbr.sh
>> @@ -0,0 +1,161 @@
>> +#!/bin/bash
>> +# perf record LBR tests
>> +# SPDX-License-Identifier: GPL-2.0
>> +
>> +set -e
>> +
>> +if [ ! -f /sys/devices/cpu/caps/branches ]
>> +then
>> + echo "Skip: only x86 CPUs support LBR"
>> + exit 2
>> +fi
>
> root@x1:~# ls -la /sys/devices/cpu*/caps/branches
> -r--r--r--. 1 root root 4096 Jun 4 16:07 /sys/devices/cpu_atom/caps/branches
> -r--r--r--. 1 root root 4096 Aug 8 11:21 /sys/devices/cpu_core/caps/branches
> root@x1:~#
>
> I'm getting:
>
> root@x1:~# perf test -vvvv LBR
> 97: perf record LBR tests:
> --- start ---
> test child forked, pid 2033388
> Skip: only x86 CPUs support LBR
> ---- end(-2) ----
> 97: perf record LBR tests : Skip
> root@x1:~#
>
> So I added the following follow-up patch, ack?
Thanks Arnaldo. We need the test works on hybrid as well.
For both original patch and this patch,
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Thanks,
Kan
>
> - Arnaldo
>
> From f6ea332dd25cc62b9493b2d40040c2fb35253d9e Mon Sep 17 00:00:00 2001
> From: Arnaldo Carvalho de Melo <acme@redhat.com>
> Date: Thu, 8 Aug 2024 11:26:13 -0300
> Subject: [PATCH 1/1] perf test shell lbr: Support hybrid x86 systems too
>
> Running on a:
>
> root@x1:~# grep 'model name' -m1 /proc/cpuinfo
> model name : 13th Gen Intel(R) Core(TM) i7-1365U
> root@x1:~#
>
> It skips all the tests with:
>
> root@x1:~# perf test -vvvv LBR
> 97: perf record LBR tests:
> --- start ---
> test child forked, pid 2033388
> Skip: only x86 CPUs support LBR
> ---- end(-2) ----
> 97: perf record LBR tests : Skip
> root@x1:~#
>
> Because the test checks for the /sys/devices/cpu/caps/branches file,
> that isn't present as we have instead:
>
> root@x1:~# ls -la /sys/devices/cpu*/caps/branches
> -r--r--r--. 1 root root 4096 Aug 8 11:22 /sys/devices/cpu_atom/caps/branches
> -r--r--r--. 1 root root 4096 Aug 8 11:21 /sys/devices/cpu_core/caps/branches
> root@x1:~#
>
> If we check as well for one of those,
> /sys/devices/cpu_core/caps/branches, then we don't skip the tests and
> all are run on these x86 Intel Hybrid systems as well, passing all of
> them:
>
> root@x1:~# perf test -vvvv LBR
> 97: perf record LBR tests:
> --- start ---
> test child forked, pid 2034956
> LBR callgraph
> [ perf record: Woken up 5 times to write data ]
> [ perf record: Captured and wrote 1.812 MB /tmp/__perf_test.perf.data.B2HvQ (8114 samples) ]
> LBR callgraph [Success]
> LBR any branch test
> [ perf record: Woken up 25 times to write data ]
> [ perf record: Captured and wrote 6.382 MB /tmp/__perf_test.perf.data.B2HvQ (8071 samples) ]
> LBR any branch test: 8071 samples
> LBR any branch test [Success]
> LBR any call test
> [ perf record: Woken up 23 times to write data ]
> [ perf record: Captured and wrote 6.208 MB /tmp/__perf_test.perf.data.B2HvQ (8092 samples) ]
> LBR any call test: 8092 samples
> LBR any call test [Success]
> LBR any ret test
> [ perf record: Woken up 24 times to write data ]
> [ perf record: Captured and wrote 6.396 MB /tmp/__perf_test.perf.data.B2HvQ (8093 samples) ]
> LBR any ret test: 8093 samples
> LBR any ret test [Success]
> LBR any indirect call test
> [ perf record: Woken up 25 times to write data ]
> [ perf record: Captured and wrote 6.344 MB /tmp/__perf_test.perf.data.B2HvQ (8067 samples) ]
> LBR any indirect call test: 8067 samples
> LBR any indirect call test [Success]
> LBR any indirect jump test
> [ perf record: Woken up 12 times to write data ]
> [ perf record: Captured and wrote 3.073 MB /tmp/__perf_test.perf.data.B2HvQ (8061 samples) ]
> LBR any indirect jump test: 8061 samples
> LBR any indirect jump test [Success]
> LBR direct calls test
> [ perf record: Woken up 25 times to write data ]
> [ perf record: Captured and wrote 6.380 MB /tmp/__perf_test.perf.data.B2HvQ (8076 samples) ]
> LBR direct calls test: 8076 samples
> LBR direct calls test [Success]
> LBR any indirect user call test
> [ perf record: Woken up 5 times to write data ]
> [ perf record: Captured and wrote 1.597 MB /tmp/__perf_test.perf.data.B2HvQ (8079 samples) ]
> LBR any indirect user call test: 8079 samples
> LBR any indirect user call test [Success]
> LBR system wide any branch test
> [ perf record: Woken up 26 times to write data ]
> [ perf record: Captured and wrote 9.088 MB /tmp/__perf_test.perf.data.B2HvQ (9209 samples) ]
> LBR system wide any branch test: 9209 samples
> LBR system wide any branch test [Success]
> LBR system wide any call test
> [ perf record: Woken up 25 times to write data ]
> [ perf record: Captured and wrote 8.945 MB /tmp/__perf_test.perf.data.B2HvQ (9333 samples) ]
> LBR system wide any call test: 9333 samples
> LBR system wide any call test [Success]
> LBR parallel any branch test
> LBR parallel any call test
> LBR parallel any ret test
> LBR parallel any indirect call test
> LBR parallel any indirect jump test
> LBR parallel direct calls test
> LBR parallel system wide any branch test
> LBR parallel any indirect user call test
> LBR parallel system wide any call test
> [ perf record: Woken up 9 times to write data ]
> [ perf record: Woken up 51 times to write data ]
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Woken up 5 times to write data ]
> [ perf record: Woken up 559 times to write data ]
> [ perf record: Woken up 14 times to write data ]
> [ perf record: Woken up 17 times to write data ]
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Woken up 11 times to write data ]
> [ perf record: Captured and wrote 0.150 MB /tmp/__perf_test.perf.data.lANpR (1909 samples) ]
> [ perf record: Captured and wrote 2.371 MB /tmp/__perf_test.perf.data.Olum8 (3033 samples) ]
> [ perf record: Captured and wrote 1.230 MB /tmp/__perf_test.perf.data.njfJ8 (1742 samples) ]
> [ perf record: Captured and wrote 5.554 MB /tmp/__perf_test.perf.data.4ZTrj (29662 samples) ]
> [ perf record: Captured and wrote 19.906 MB /tmp/__perf_test.perf.data.dlGQt (29576 samples) ]
> [ perf record: Captured and wrote 0.289 MB /tmp/__perf_test.perf.data.CAT7y (4311 samples) ]
> [ perf record: Captured and wrote 3.129 MB /tmp/__perf_test.perf.data.diuKG (3971 samples) ]
> LBR parallel any indirect user call test: 1909 samples
> [ perf record: Captured and wrote 4.858 MB /tmp/__perf_test.perf.data.sVjtN (6130 samples) ]
> LBR parallel any indirect user call test [Success]
> [ perf record: Captured and wrote 3.669 MB /tmp/__perf_test.perf.data.AJtNI (4827 samples) ]
> LBR parallel any indirect jump test: 4311 samples
> LBR parallel any indirect jump test [Success]
> LBR parallel direct calls test: 3033 samples
> LBR parallel direct calls test [Success]
> LBR parallel any indirect call test: 1742 samples
> LBR parallel any indirect call test [Success]
> LBR parallel any call test: 4827 samples
> LBR parallel any call test [Success]
> LBR parallel any branch test: 6130 samples
> LBR parallel any branch test [Success]
> LBR parallel system wide any branch test: 29662 samples
> LBR parallel any ret test: 3971 samples
> LBR parallel any ret test [Success]
> LBR parallel system wide any branch test [Success]
> LBR parallel system wide any call test: 29576 samples
> LBR parallel system wide any call test [Success]
> ---- end(0) ----
> 97: perf record LBR tests : Ok
> root@x1:~#
>
> Cc: Adrian Hunter <adrian.hunter@intel.com>,
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>,
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Anne Macedo <retpolanne@posteo.net>,
> Cc: Changbin Du <changbin.du@huawei.com>,
> Cc: Ian Rogers <irogers@google.com>
> Cc: Ingo Molnar <mingo@redhat.com>,
> Cc: Jiri Olsa <jolsa@kernel.org>,
> Cc: Kan Liang <kan.liang@linux.intel.com>,
> Cc: Mark Rutland <mark.rutland@arm.com>,
> Cc: Namhyung Kim <namhyung@kernel.org>,
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> ---
> tools/perf/tests/shell/record_lbr.sh | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/tests/shell/record_lbr.sh b/tools/perf/tests/shell/record_lbr.sh
> index baf168d0ddbbf447..ead87b731898d70b 100755
> --- a/tools/perf/tests/shell/record_lbr.sh
> +++ b/tools/perf/tests/shell/record_lbr.sh
> @@ -4,7 +4,7 @@
>
> set -e
>
> -if [ ! -f /sys/devices/cpu/caps/branches ]
> +if [ ! -f /sys/devices/cpu/caps/branches -a ! -f /sys/devices/cpu_core/caps/branches ]
> then
> echo "Skip: only x86 CPUs support LBR"
> exit 2
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v1 2/2] perf test: Add set of perf record LBR tests
2024-08-08 16:00 ` Ian Rogers
@ 2024-08-08 20:28 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-08-08 20:28 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
Anne Macedo, Changbin Du, Andi Kleen, linux-kernel,
linux-perf-users
On Thu, Aug 08, 2024 at 09:00:49AM -0700, Ian Rogers wrote:
> On Thu, Aug 8, 2024 at 7:34 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> [snip]
> > -if [ ! -f /sys/devices/cpu/caps/branches ]
> > +if [ ! -f /sys/devices/cpu/caps/branches -a ! -f /sys/devices/cpu_core/caps/branches ]
>
> Adding this lgtm, thanks for testing.
Changed to the one below to address a ShellCheck warning/suggestion.
diff --git a/tools/perf/tests/shell/record_lbr.sh b/tools/perf/tests/shell/record_lbr.sh
index ead87b731898d70b..32314641217e6db9 100755
--- a/tools/perf/tests/shell/record_lbr.sh
+++ b/tools/perf/tests/shell/record_lbr.sh
@@ -4,7 +4,7 @@
set -e
-if [ ! -f /sys/devices/cpu/caps/branches -a ! -f /sys/devices/cpu_core/caps/branches ]
+if [ ! -f /sys/devices/cpu/caps/branches ] && [ ! -f /sys/devices/cpu_core/caps/branches ]
then
echo "Skip: only x86 CPUs support LBR"
exit 2
^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-08-08 20:28 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-08 5:46 [PATCH v1 1/2] perf callchain: Fix stitch LBR memory leaks Ian Rogers
2024-08-08 5:46 ` [PATCH v1 2/2] perf test: Add set of perf record LBR tests Ian Rogers
2024-08-08 14:34 ` Arnaldo Carvalho de Melo
2024-08-08 16:00 ` Ian Rogers
2024-08-08 20:28 ` Arnaldo Carvalho de Melo
2024-08-08 16:01 ` Liang, Kan
2024-08-08 15:59 ` [PATCH v1 1/2] perf callchain: Fix stitch LBR memory leaks Liang, Kan
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox