From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
Thomas Richter <tmricht@linux.vnet.ibm.com>,
Heiko Carstens <heiko.carstens@de.ibm.com>,
Martin Schwidefsky <schwidefsky@de.ibm.com>,
Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: [PATCH 13/36] perf annotate: Fix unnecessary memory allocation for s390x
Date: Wed, 6 Dec 2017 11:40:52 -0300 [thread overview]
Message-ID: <20171206144115.15097-14-acme@kernel.org> (raw)
In-Reply-To: <20171206144115.15097-1-acme@kernel.org>
From: Thomas Richter <tmricht@linux.vnet.ibm.com>
This patch fixes a bug introduced with commit d9f8dfa9baf9 ("perf
annotate s390: Implement jump types for perf annotate").
'perf annotate' displays annotated assembler output by reading output of
command objdump and parsing the disassembled lines. For each shown
mnemonic this function sequence is executed:
disasm_line__new()
|
+--> disasm_line__init_ins()
|
+--> ins__find()
|
+--> arch->associate_instruction_ops()
The s390x specific function assigned to function pointer
associate_instruction_ops refers to function s390__associate_ins_ops().
This function checks for supported mnemonics and assigns a NULL pointer
to unsupported mnemonics. However even the NULL pointer is added to the
architecture dependend instruction array.
This leads to an extremely large architecture instruction array
(due to array resize logic in function arch__grow_instructions()).
Depending on the objdump output being parsed the array can end up
with several ten-thousand elements.
This patch checks if a mnemonic is supported and only adds supported
ones into the architecture instruction array. The array does not contain
elements with NULL pointers anymore.
Before the patch (With some debug printf output):
[root@s35lp76 perf]# time ./perf annotate --stdio > /tmp/xxxbb
real 8m49.679s
user 7m13.008s
sys 0m1.649s
[root@s35lp76 perf]# fgrep '__ins__find sorted:1 nr_instructions:'
/tmp/xxxbb | tail -1
__ins__find sorted:1 nr_instructions:87433 ins:0x341583c0
[root@s35lp76 perf]#
The number of different s390x branch/jump/call/return instructions
entered into the array is 87433.
After the patch (With some printf debug output:)
[root@s35lp76 perf]# time ./perf annotate --stdio > /tmp/xxxaa
real 1m24.553s
user 0m0.587s
sys 0m1.530s
[root@s35lp76 perf]# fgrep '__ins__find sorted:1 nr_instructions:'
/tmp/xxxaa | tail -1
__ins__find sorted:1 nr_instructions:56 ins:0x3f406570
[root@s35lp76 perf]#
The number of different s390x branch/jump/call/return instructions
entered into the array is 56 which is sensible.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20171124094637.55558-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/arch/s390/annotate/instructions.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/perf/arch/s390/annotate/instructions.c b/tools/perf/arch/s390/annotate/instructions.c
index e0e466c650df..8c72b44444cb 100644
--- a/tools/perf/arch/s390/annotate/instructions.c
+++ b/tools/perf/arch/s390/annotate/instructions.c
@@ -18,7 +18,8 @@ static struct ins_ops *s390__associate_ins_ops(struct arch *arch, const char *na
if (!strcmp(name, "br"))
ops = &ret_ops;
- arch__associate_ins_ops(arch, name, ops);
+ if (ops)
+ arch__associate_ins_ops(arch, name, ops);
return ops;
}
--
2.13.6
next prev parent reply other threads:[~2017-12-06 14:40 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20171206144115.15097-1-acme@kernel.org>
2017-12-06 14:40 ` [PATCH 01/36] tools headers: Follow the upstream UAPI header version 100% differ from the kernel Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 02/36] perf test: Disable test cases 19 and 20 on s390x Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 03/36] perf record: Synthesize unit/scale/... in event update Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 04/36] perf record: Synthesize thread map and cpu map Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 05/36] perf script: Allow computing 'perf stat' style metrics Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 06/36] perf buildid-cache: Document for Node.js USDT Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 07/36] perf report: Fix -D output for user metadata events Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 08/36] perf intel-pt: Improve build messages for files that differ from the kernel Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 09/36] Documentation: Add Arnaldo Melo to list of enforcement statement endorsers Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 10/36] perf bench futex: Use cpumaps Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 11/36] tools build feature: Check if pthread_barrier_t is available Arnaldo Carvalho de Melo
2017-12-06 21:31 ` Philippe Ombredanne
2017-12-07 11:24 ` Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 12/36] perf bench futex: Sync waker threads Arnaldo Carvalho de Melo
2017-12-06 14:40 ` Arnaldo Carvalho de Melo [this message]
2017-12-06 14:40 ` [PATCH 14/36] perf annotate: Fix objdump comment parsing for Intel mov dissassembly Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 15/36] perf rblist: Create rblist__exit() function Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 16/36] perf stat: Add rbtree node_delete op Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 17/36] perf thread_map: Add method to map all threads in the system Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 18/36] perf s390: Always build with -fPIC Arnaldo Carvalho de Melo
2017-12-07 8:09 ` Hendrik Brueckner
2017-12-06 14:40 ` [PATCH 19/36] perf pmu: Pass pmu as a parameter to get_cpuid_str() Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 20/36] perf tools arm64: Add support for get_cpuid_str function Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 21/36] perf pmu: Add helper function is_pmu_core to detect PMU CORE devices Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 22/36] perf vendor events arm64: Add ThunderX2 implementation defined pmu core events Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 23/36] perf pmu: Add check for valid cpuid in perf_pmu__find_map() Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 24/36] perf tools: Fix up build in hardnened environments Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 25/36] perf evlist: Remove 'overwrite' parameter from perf_evlist__mmap Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 26/36] perf evlist: Remove 'overwrite' parameter from perf_evlist__mmap_ex Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 27/36] perf evlist: Remove evlist->overwrite Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 28/36] perf mmap: Remove overwrite from arguments list of perf_mmap__push Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 29/36] perf mmap: Remove overwrite and check_messup from mmap read Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 30/36] perf c2c: Add a tip about cacheline events Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 31/36] perf vendor events: Use more flexible pattern matching for CPU identification for mapfile.csv Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 32/36] x86/asm: Allow again using asm.h when building for the 'bpf' clang target Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 33/36] perf report: Set browser mode right before setup_browser() Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 34/36] perf mmap: Fix perf backward recording Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 35/36] perf mmap: Don't discard prev in backward mode Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 36/36] perf tools: Rename 'backward' to 'overwrite' in evlist, mmap and record Arnaldo Carvalho de Melo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171206144115.15097-14-acme@kernel.org \
--to=acme@kernel.org \
--cc=acme@redhat.com \
--cc=heiko.carstens@de.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=schwidefsky@de.ibm.com \
--cc=tmricht@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).