From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Stephane Eranian <eranian@google.com>
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org,
kim.phillips@amd.com, acme@redhat.com, jolsa@redhat.com,
songliubraving@fb.com
Subject: Re: [PATCH v6 12/12] perf report: add addr_from/addr_to sort dimensions
Date: Wed, 16 Feb 2022 11:21:12 -0300 [thread overview]
Message-ID: <Yg0IWIjgpRcb8bZX@kernel.org> (raw)
In-Reply-To: <20220208211637.2221872-13-eranian@google.com>
Em Tue, Feb 08, 2022 at 01:16:37PM -0800, Stephane Eranian escreveu:
> With the existing symbol_from/symbol_to, branches captured in the same
> function would be collapsed into a single function if the latencies associated
> with the each branch (cycles) were all the same. That is the case on Intel
> Broadwell, for instance. Since Intel Skylake, the latency is captured by
> hardware and therefore is used to disambiguate branches.
>
> Add addr_from/addr_to sort dimensions to sort branches based on their
> addresses and not the function there are in. The output is still the function
> name but the offset within the function is provided to uniquely identify each
> branch. These new sort dimensions also help with annotate because they create
> different entries in the histogram which, in turn, generates proper branch
> annotations.
>
> Here is an example using AMD's branch sampling:
This can be cherry picked from this patchset, I'll try to do it now, and
also add the above explanation for the new sort dimensions to:
tools/perf/Documentation/perf-report.txt
Please update the documentation in the future when adding new sort
dimensions.
Thanks,
- Arnaldo
> $ perf record -a -b -c 1000037 -e cpu/branch-brs/ test_prg
>
> $ perf report
> Samples: 6M of event 'cpu/branch-brs/', Event count (approx.): 6901276
> Overhead Command Source Shared Object Source Symbol Target Symbol Basic Block Cycle
> 99.65% test_prg test_prg [.] test_thread [.] test_thread -
> 0.02% test_prg [kernel.vmlinux] [k] asm_sysvec_apic_timer_interrupt [k] error_entry -
>
> $ perf report -F overhead,comm,dso,addr_from,addr_to
> Samples: 6M of event 'cpu/branch-brs/', Event count (approx.): 6901276
> Overhead Command Shared Object Source Address Target Address
> 4.22% test_prg test_prg [.] test_thread+0x3c [.] test_thread+0x4
> 4.13% test_prg test_prg [.] test_thread+0x4 [.] test_thread+0x3a
> 4.09% test_prg test_prg [.] test_thread+0x3a [.] test_thread+0x6
> 4.08% test_prg test_prg [.] test_thread+0x2 [.] test_thread+0x3c
> 4.06% test_prg test_prg [.] test_thread+0x3e [.] test_thread+0x2
> 3.87% test_prg test_prg [.] test_thread+0x6 [.] test_thread+0x38
> 3.84% test_prg test_prg [.] test_thread [.] test_thread+0x3e
> 3.76% test_prg test_prg [.] test_thread+0x1e [.] test_thread
> 3.76% test_prg test_prg [.] test_thread+0x38 [.] test_thread+0x8
> 3.56% test_prg test_prg [.] test_thread+0x22 [.] test_thread+0x1e
> 3.54% test_prg test_prg [.] test_thread+0x8 [.] test_thread+0x36
> 3.47% test_prg test_prg [.] test_thread+0x1c [.] test_thread+0x22
> 3.45% test_prg test_prg [.] test_thread+0x36 [.] test_thread+0xa
> 3.28% test_prg test_prg [.] test_thread+0x24 [.] test_thread+0x1c
> 3.25% test_prg test_prg [.] test_thread+0xa [.] test_thread+0x34
> 3.24% test_prg test_prg [.] test_thread+0x1a [.] test_thread+0x24
> 3.20% test_prg test_prg [.] test_thread+0x34 [.] test_thread+0xc
> 3.04% test_prg test_prg [.] test_thread+0x26 [.] test_thread+0x1a
> 3.01% test_prg test_prg [.] test_thread+0xc [.] test_thread+0x32
> 2.98% test_prg test_prg [.] test_thread+0x18 [.] test_thread+0x26
> 2.94% test_prg test_prg [.] test_thread+0x32 [.] test_thread+0xe
> 2.76% test_prg test_prg [.] test_thread+0x28 [.] test_thread+0x18
> 2.73% test_prg test_prg [.] test_thread+0xe [.] test_thread+0x30
> 2.67% test_prg test_prg [.] test_thread+0x30 [.] test_thread+0x10
> 2.67% test_prg test_prg [.] test_thread+0x16 [.] test_thread+0x28
> 2.46% test_prg test_prg [.] test_thread+0x10 [.] test_thread+0x2e
> 2.44% test_prg test_prg [.] test_thread+0x2a [.] test_thread+0x16
> 2.38% test_prg test_prg [.] test_thread+0x14 [.] test_thread+0x2a
> 2.32% test_prg test_prg [.] test_thread+0x2e [.] test_thread+0x12
> 2.28% test_prg test_prg [.] test_thread+0x12 [.] test_thread+0x2c
> 2.16% test_prg test_prg [.] test_thread+0x2c [.] test_thread+0x14
> 0.02% test_prg [kernel.vmlinux] [k] asm_sysvec_apic_ti+0x5 [k] error_entry
>
> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
> tools/perf/util/hist.c | 2 +
> tools/perf/util/hist.h | 2 +
> tools/perf/util/sort.c | 128 +++++++++++++++++++++++++++++++++++++++++
> tools/perf/util/sort.h | 2 +
> 4 files changed, 134 insertions(+)
>
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index 0a8033b09e28..1c085ab56534 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -124,6 +124,7 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
> } else {
> symlen = unresolved_col_width + 4 + 2;
> hists__new_col_len(hists, HISTC_SYMBOL_FROM, symlen);
> + hists__new_col_len(hists, HISTC_ADDR_FROM, symlen);
> hists__set_unres_dso_col_len(hists, HISTC_DSO_FROM);
> }
>
> @@ -138,6 +139,7 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
> } else {
> symlen = unresolved_col_width + 4 + 2;
> hists__new_col_len(hists, HISTC_SYMBOL_TO, symlen);
> + hists__new_col_len(hists, HISTC_ADDR_TO, symlen);
> hists__set_unres_dso_col_len(hists, HISTC_DSO_TO);
> }
>
> diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
> index 2a15e22fb89c..7ed4648d2fc2 100644
> --- a/tools/perf/util/hist.h
> +++ b/tools/perf/util/hist.h
> @@ -77,6 +77,8 @@ enum hist_column {
> HISTC_GLOBAL_INS_LAT,
> HISTC_LOCAL_P_STAGE_CYC,
> HISTC_GLOBAL_P_STAGE_CYC,
> + HISTC_ADDR_FROM,
> + HISTC_ADDR_TO,
> HISTC_NR_COLS, /* Last entry */
> };
>
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index 2da081ef532b..6d5588e80935 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -990,6 +990,128 @@ struct sort_entry sort_sym_to = {
> .se_width_idx = HISTC_SYMBOL_TO,
> };
>
> +static int _hist_entry__addr_snprintf(struct map_symbol *ms,
> + u64 ip, char level, char *bf, size_t size,
> + unsigned int width)
> +{
> + struct symbol *sym = ms->sym;
> + struct map *map = ms->map;
> + size_t ret = 0, offs;
> +
> + ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", level);
> + if (sym && map) {
> + if (sym->type == STT_OBJECT) {
> + ret += repsep_snprintf(bf + ret, size - ret, "%s", sym->name);
> + ret += repsep_snprintf(bf + ret, size - ret, "+0x%llx",
> + ip - map->unmap_ip(map, sym->start));
> + } else {
> + ret += repsep_snprintf(bf + ret, size - ret, "%.*s",
> + width - ret,
> + sym->name);
> + offs = ip - sym->start;
> + if (offs)
> + ret += repsep_snprintf(bf + ret, size - ret, "+0x%llx", offs);
> + }
> + } else {
> + size_t len = BITS_PER_LONG / 4;
> + ret += repsep_snprintf(bf + ret, size - ret, "%-#.*llx",
> + len, ip);
> + }
> +
> + return ret;
> +}
> +
> +static int hist_entry__addr_from_snprintf(struct hist_entry *he, char *bf,
> + size_t size, unsigned int width)
> +{
> + if (he->branch_info) {
> + struct addr_map_symbol *from = &he->branch_info->from;
> +
> + return _hist_entry__addr_snprintf(&from->ms, from->al_addr,
> + he->level, bf, size, width);
> + }
> +
> + return repsep_snprintf(bf, size, "%-*.*s", width, width, "N/A");
> +}
> +
> +static int hist_entry__addr_to_snprintf(struct hist_entry *he, char *bf,
> + size_t size, unsigned int width)
> +{
> + if (he->branch_info) {
> + struct addr_map_symbol *to = &he->branch_info->to;
> +
> + return _hist_entry__addr_snprintf(&to->ms, to->al_addr,
> + he->level, bf, size, width);
> + }
> +
> + return repsep_snprintf(bf, size, "%-*.*s", width, width, "N/A");
> +}
> +
> +static int64_t
> +sort__addr_from_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> + struct addr_map_symbol *from_l;
> + struct addr_map_symbol *from_r;
> + int64_t ret;
> +
> + if (!left->branch_info || !right->branch_info)
> + return cmp_null(left->branch_info, right->branch_info);
> +
> + from_l = &left->branch_info->from;
> + from_r = &right->branch_info->from;
> +
> + /*
> + * comparing symbol address alone is not enough since it's a
> + * relative address within a dso.
> + */
> + ret = _sort__dso_cmp(from_l->ms.map, from_r->ms.map);
> + if (ret != 0)
> + return ret;
> +
> + return _sort__addr_cmp(from_l->addr, from_r->addr);
> +}
> +
> +static int64_t
> +sort__addr_to_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> + struct addr_map_symbol *to_l;
> + struct addr_map_symbol *to_r;
> + int64_t ret;
> +
> + if (!left->branch_info || !right->branch_info)
> + return cmp_null(left->branch_info, right->branch_info);
> +
> + to_l = &left->branch_info->to;
> + to_r = &right->branch_info->to;
> +
> + /*
> + * comparing symbol address alone is not enough since it's a
> + * relative address within a dso.
> + */
> + ret = _sort__dso_cmp(to_l->ms.map, to_r->ms.map);
> + if (ret != 0)
> + return ret;
> +
> + return _sort__addr_cmp(to_l->addr, to_r->addr);
> +}
> +
> +struct sort_entry sort_addr_from = {
> + .se_header = "Source Address",
> + .se_cmp = sort__addr_from_cmp,
> + .se_snprintf = hist_entry__addr_from_snprintf,
> + .se_filter = hist_entry__sym_from_filter, /* shared with sym_from */
> + .se_width_idx = HISTC_ADDR_FROM,
> +};
> +
> +struct sort_entry sort_addr_to = {
> + .se_header = "Target Address",
> + .se_cmp = sort__addr_to_cmp,
> + .se_snprintf = hist_entry__addr_to_snprintf,
> + .se_filter = hist_entry__sym_to_filter, /* shared with sym_to */
> + .se_width_idx = HISTC_ADDR_TO,
> +};
> +
> +
> static int64_t
> sort__mispredict_cmp(struct hist_entry *left, struct hist_entry *right)
> {
> @@ -1893,6 +2015,8 @@ static struct sort_dimension bstack_sort_dimensions[] = {
> DIM(SORT_SRCLINE_FROM, "srcline_from", sort_srcline_from),
> DIM(SORT_SRCLINE_TO, "srcline_to", sort_srcline_to),
> DIM(SORT_SYM_IPC, "ipc_lbr", sort_sym_ipc),
> + DIM(SORT_ADDR_FROM, "addr_from", sort_addr_from),
> + DIM(SORT_ADDR_TO, "addr_to", sort_addr_to),
> };
>
> #undef DIM
> @@ -3126,6 +3250,10 @@ static bool get_elide(int idx, FILE *output)
> return __get_elide(symbol_conf.dso_from_list, "dso_from", output);
> case HISTC_DSO_TO:
> return __get_elide(symbol_conf.dso_to_list, "dso_to", output);
> + case HISTC_ADDR_FROM:
> + return __get_elide(symbol_conf.sym_from_list, "addr_from", output);
> + case HISTC_ADDR_TO:
> + return __get_elide(symbol_conf.sym_to_list, "addr_to", output);
> default:
> break;
> }
> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> index f994261888e1..2ddc00d1c464 100644
> --- a/tools/perf/util/sort.h
> +++ b/tools/perf/util/sort.h
> @@ -251,6 +251,8 @@ enum sort_type {
> SORT_SRCLINE_FROM,
> SORT_SRCLINE_TO,
> SORT_SYM_IPC,
> + SORT_ADDR_FROM,
> + SORT_ADDR_TO,
>
> /* memory mode specific sort keys */
> __SORT_MEMORY_MODE,
> --
> 2.35.0.263.gb82422642f-goog
--
- Arnaldo
prev parent reply other threads:[~2022-02-16 14:21 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-08 21:16 [PATCH v6 00/12] perf/x86/amd: Add AMD Fam19h Branch Sampling support Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 01/12] perf/core: add perf_clear_branch_entry_bitfields() helper Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 02/12] x86/cpufeatures: add AMD Fam19h Branch Sampling feature Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 03/12] perf/x86/amd: add AMD Fam19h Branch Sampling support Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 04/12] perf/x86/amd: add branch-brs helper event for Fam19h BRS Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 05/12] perf/x86/amd: enable branch sampling priv level filtering Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 06/12] perf/x86/amd: add AMD branch sampling period adjustment Stephane Eranian
2022-02-09 15:32 ` Peter Zijlstra
2022-03-04 15:45 ` Peter Zijlstra
2022-03-09 23:03 ` Stephane Eranian
2022-03-15 12:08 ` Peter Zijlstra
2022-03-17 17:11 ` Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 07/12] perf/x86/amd: make Zen3 branch sampling opt-in Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 08/12] ACPI: add perf low power callback Stephane Eranian
2022-02-09 9:32 ` kernel test robot
2022-02-09 9:32 ` kernel test robot
2022-02-09 14:28 ` kernel test robot
2022-02-08 21:16 ` [PATCH v6 09/12] perf/x86/amd: add idle hooks for branch sampling Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 10/12] perf tools: Improve IBS error handling Stephane Eranian
2022-02-09 15:47 ` Arnaldo Carvalho de Melo
2022-03-15 6:49 ` Ravi Bangoria
2022-03-15 2:01 ` Stephane Eranian
2022-03-15 6:23 ` Ravi Bangoria
2022-03-15 7:12 ` Stephane Eranian
2022-03-15 7:45 ` Ravi Bangoria
2022-03-16 0:03 ` Stephane Eranian
2022-03-16 11:07 ` Ravi Bangoria
2022-03-16 11:16 ` Ravi Bangoria
2022-02-08 21:16 ` [PATCH v6 11/12] perf tools: Improve error handling of AMD Branch Sampling Stephane Eranian
2022-02-16 14:17 ` Arnaldo Carvalho de Melo
2022-02-08 21:16 ` [PATCH v6 12/12] perf report: add addr_from/addr_to sort dimensions Stephane Eranian
2022-02-16 14:21 ` Arnaldo Carvalho de Melo [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Yg0IWIjgpRcb8bZX@kernel.org \
--to=acme@kernel.org \
--cc=acme@redhat.com \
--cc=eranian@google.com \
--cc=jolsa@redhat.com \
--cc=kim.phillips@amd.com \
--cc=linux-kernel@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=songliubraving@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.