From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0A889C3F2CD for ; Fri, 28 Feb 2020 16:32:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E1EE8246BA for ; Fri, 28 Feb 2020 16:32:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727095AbgB1Qcp (ORCPT ); Fri, 28 Feb 2020 11:32:45 -0500 Received: from mga14.intel.com ([192.55.52.115]:35576 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726901AbgB1QcY (ORCPT ); Fri, 28 Feb 2020 11:32:24 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 28 Feb 2020 08:32:24 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,496,1574150400"; d="scan'208";a="227593084" Received: from labuser-ice-lake-client-platform.jf.intel.com ([10.54.55.45]) by orsmga007.jf.intel.com with ESMTP; 28 Feb 2020 08:32:24 -0800 From: kan.liang@linux.intel.com To: acme@kernel.org, jolsa@redhat.com, peterz@infradead.org, mingo@redhat.com, linux-kernel@vger.kernel.org Cc: namhyung@kernel.org, adrian.hunter@intel.com, mathieu.poirier@linaro.org, ravi.bangoria@linux.ibm.com, alexey.budankov@linux.intel.com, vitaly.slobodskoy@intel.com, pavel.gerasimov@intel.com, mpe@ellerman.id.au, eranian@google.com, ak@linux.intel.com, Kan Liang Subject: [PATCH 08/12] perf report: Add option to enable the LBR stitching approach Date: Fri, 28 Feb 2020 08:30:07 -0800 Message-Id: <20200228163011.19358-9-kan.liang@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200228163011.19358-1-kan.liang@linux.intel.com> References: <20200228163011.19358-1-kan.liang@linux.intel.com> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Kan Liang With the LBR stitching approach, the reconstructed LBR call stack can break the HW limitation. However, it may reconstruct invalid call stacks in some cases, e.g. exception handing such as setjmp/longjmp. Also, it may impact the processing time especially when the number of samples with stitched LBRs are huge. Add an option to enable the approach. # To display the perf.data header info, please use # --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 6K of event 'cycles' # Event count (approx.): 6492797701 # # Children Self Command Shared Object Symbol # ........ ........ ............... .................. # ................................. # 99.99% 99.99% tchain_edit tchain_edit [.] f43 | ---main f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15 f16 f17 f18 f19 f20 f21 f22 f23 f24 f25 f26 f27 f28 f29 f30 f31 | --99.65%--f32 f33 f34 f35 f36 f37 f38 f39 f40 f41 f42 f43 Reviewed-by: Andi Kleen Signed-off-by: Kan Liang --- tools/perf/Documentation/perf-report.txt | 11 +++++++++++ tools/perf/builtin-report.c | 6 ++++++ 2 files changed, 17 insertions(+) diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt index db61f16ffa56..5e4155d2511c 100644 --- a/tools/perf/Documentation/perf-report.txt +++ b/tools/perf/Documentation/perf-report.txt @@ -482,6 +482,17 @@ include::itrace.txt[] This option extends the perf report to show reference callgraphs, which collected by reference event, in no callgraph event. +--stitch-lbr:: + Show callgraph with stitched LBRs, which may have more complete + callgraph. The perf.data file must have been obtained using + perf record --call-graph lbr. + Disabled by default. In common cases with call stack overflows, + it can recreate better call stacks than the default lbr call stack + output. But this approach is not full proof. There can be cases + where it creates incorrect call stacks from incorrect matches. + The known limitations include exception handing such as + setjmp/longjmp will have calls/returns not match. + --socket-filter:: Only report the samples on the processor socket that match with this filter diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c index 9483b3f0cae3..286f50d3fc65 100644 --- a/tools/perf/builtin-report.c +++ b/tools/perf/builtin-report.c @@ -84,6 +84,7 @@ struct report { bool header_only; bool nonany_branch_mode; bool group_set; + bool stitch_lbr; int max_stack; struct perf_read_values show_threads_values; struct annotation_options annotation_opts; @@ -267,6 +268,9 @@ static int process_sample_event(struct perf_tool *tool, return -1; } + if (rep->stitch_lbr) + al.thread->lbr_stitch_enable = true; + if (symbol_conf.hide_unresolved && al.sym == NULL) goto out_put; @@ -1241,6 +1245,8 @@ int cmd_report(int argc, const char **argv) "Show full source file name path for source lines"), OPT_BOOLEAN(0, "show-ref-call-graph", &symbol_conf.show_ref_callgraph, "Show callgraph from reference event"), + OPT_BOOLEAN(0, "stitch-lbr", &report.stitch_lbr, + "Enable LBR callgraph stitching approach"), OPT_INTEGER(0, "socket-filter", &report.socket_filter, "only show processor socket that match with this filter"), OPT_BOOLEAN(0, "raw-trace", &symbol_conf.raw_trace, -- 2.17.1