From: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
To: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>,
jolsa@redhat.com, peterz@infradead.org, mingo@redhat.com,
linux-kernel@vger.kernel.org, namhyung@kernel.org,
adrian.hunter@intel.com, mathieu.poirier@linaro.org,
ravi.bangoria@linux.ibm.com, alexey.budankov@linux.intel.com,
vitaly.slobodskoy@intel.com, pavel.gerasimov@intel.com,
mpe@ellerman.id.au, eranian@google.com, ak@linux.intel.com
Subject: Re: [PATCH 02/12] perf tools: Support PERF_SAMPLE_BRANCH_HW_INDEX
Date: Thu, 5 Mar 2020 20:17:44 -0300 [thread overview]
Message-ID: <20200305231744.GB17483@kernel.org> (raw)
In-Reply-To: <74b01cee-12dc-2a95-9817-4f89a3fbd441@linux.intel.com>
Em Thu, Mar 05, 2020 at 04:02:10PM -0500, Liang, Kan escreveu:
>
>
> On 3/5/2020 3:25 PM, Arnaldo Carvalho de Melo wrote:
> > Em Fri, Feb 28, 2020 at 08:30:01AM -0800, kan.liang@linux.intel.com escreveu:
> > > From: Kan Liang <kan.liang@linux.intel.com>
> > >
> > > A new branch sample type PERF_SAMPLE_BRANCH_HW_INDEX has been introduced
> > > in latest kernel.
> > >
> > > Enable HW_INDEX by default in LBR call stack mode.
> > > If kernel doesn't support the sample type, switching it off.
> > >
> > > Add HW_INDEX in attr_fprintf as well. User can check whether the branch
> > > sample type is set via debug information or header.
> >
> > So while this works with a kernel where PERF_SAMPLE_BRANCH_HW_INDEX is
> > present and we get, from the committer notes I was putting together
> > while testing/applying this cset:
> >
> > First collect some samples with LBR callchains, system wide, for a few
> > seconds:
> >
> > # perf record --call-graph lbr -a sleep 5
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 0.625 MB perf.data (224 samples) ]
> > #
> >
> > Now lets use 'perf evlist -v' to look at the branch_sample_type:
> >
> > # perf evlist -v
> > cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: USER|CALL_STACK|NO_FLAGS|NO_CYCLES|HW_INDEX
> > #
> >
> > So the machine has the kernel feature, and it was correctly added to
> > perf_event_attr.branch_sample_type, for the default 'cycles' event.
> >
> > Cool, and look at that 'attr.precise_ip: 3' part, the kernel is OK with
> > having that together with attr.branch_sample_type with HW_INDEX set.
> >
> > The problem happens when I go test this in an older kernel, where the
> > kernel doesn't know about HW_INDEX, we get it disabled but then
> > precise_ip is set to zero in its detection , even if at the end we get
> > it to 3, as expected, which got me a bit confused, I'll investigate this
> > a bit more to try and avoid these extra probes for the max precise level
> > that fails in older kernels due to branch_sample_type having HW_INDEX
> > :-\
>
> It looks like this is an expected behavior for the event with maximum
> precise config for current perf tool.
>
> The related commits are as below:
> commit ID: 4e8a5c155137 ("perf evsel: Fix max perf_event_attr.precise_ip
> detection")
> commit ID: cd136189370c ("perf evsel: Do not rely on errno values for
> precise_ip fallback")
>
> Before handling any standard fallback (not just HW_INDEX), perf tool will
> try all the precise_ip value first.
Right, and since it uses HW_INDEX and the kernel doesn't support that,
that precise_ip max value detection will fail, it will go back to
whatever it was and try again later, then the fallback will remove the
HW_INDEX and it will work.
What I said is that if we don't set branch_sample_type when detecting
the max precise_ip, then that detection will work and we will not
needlessly go on testing with precise_ip = 3, 2, 1, 0.
My doubt was more about if HW_INDEX (or any other branch_sample_type)
had to be tested in conjunction with precise_ip, which I think isn't
the case from what I saw in my tests.
- Arnaldo
> Thanks,
> Kan
>
> >
> >
> > # perf record -vv --call-graph lbr -a sleep 5
> > <SNIP>
> > ------------------------------------------------------------
> > perf_event_attr:
> > size 120
> > { sample_period, sample_freq } 4000
> > sample_type IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK
> > read_format ID
> > disabled 1
> > inherit 1
> > mmap 1
> > comm 1
> > freq 1
> > task 1
> > precise_ip 3
> > sample_id_all 1
> > exclude_guest 1
> > mmap2 1
> > comm_exec 1
> > ksymbol 1
> > bpf_event 1
> > branch_sample_type USER|CALL_STACK|NO_FLAGS|NO_CYCLES|HW_INDEX
> > ------------------------------------------------------------
> > sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
> > sys_perf_event_open failed, error -95
> > decreasing precise_ip by one (2)
> > ------------------------------------------------------------
> > perf_event_attr:
> > size 120
> > { sample_period, sample_freq } 4000
> > sample_type IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK
> > read_format ID
> > disabled 1
> > inherit 1
> > mmap 1
> > comm 1
> > freq 1
> > task 1
> > precise_ip 2
> > sample_id_all 1
> > exclude_guest 1
> > mmap2 1
> > comm_exec 1
> > ksymbol 1
> > bpf_event 1
> > branch_sample_type USER|CALL_STACK|NO_FLAGS|NO_CYCLES|HW_INDEX
> > ------------------------------------------------------------
> > sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
> > sys_perf_event_open failed, error -95
> > decreasing precise_ip by one (1)
> > ------------------------------------------------------------
> > perf_event_attr:
> > size 120
> > { sample_period, sample_freq } 4000
> > sample_type IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK
> > read_format ID
> > disabled 1
> > inherit 1
> > mmap 1
> > comm 1
> > freq 1
> > task 1
> > precise_ip 1
> > sample_id_all 1
> > exclude_guest 1
> > mmap2 1
> > comm_exec 1
> > ksymbol 1
> > bpf_event 1
> > branch_sample_type USER|CALL_STACK|NO_FLAGS|NO_CYCLES|HW_INDEX
> > ------------------------------------------------------------
> > sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
> > sys_perf_event_open failed, error -95
> > decreasing precise_ip by one (0)
> > ------------------------------------------------------------
> > perf_event_attr:
> > size 120
> > { sample_period, sample_freq } 4000
> > sample_type IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK
> > read_format ID
> > disabled 1
> > inherit 1
> > mmap 1
> > comm 1
> > freq 1
> > task 1
> > sample_id_all 1
> > exclude_guest 1
> > mmap2 1
> > comm_exec 1
> > ksymbol 1
> > bpf_event 1
> > branch_sample_type USER|CALL_STACK|NO_FLAGS|NO_CYCLES|HW_INDEX
> > ------------------------------------------------------------
> > sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
> > sys_perf_event_open failed, error -22
> > switching off branch HW index support
> > ------------------------------------------------------------
> > perf_event_attr:
> > size 120
> > { sample_period, sample_freq } 4000
> > sample_type IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK
> > read_format ID
> > disabled 1
> > inherit 1
> > mmap 1
> > comm 1
> > freq 1
> > task 1
> > precise_ip 3
> > sample_id_all 1
> > exclude_guest 1
> > mmap2 1
> > comm_exec 1
> > ksymbol 1
> > bpf_event 1
> > branch_sample_type USER|CALL_STACK|NO_FLAGS|NO_CYCLES
> > ------------------------------------------------------------
> >
> >
> > > Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> > > ---
> > > tools/perf/util/evsel.c | 15 ++++++++++++---
> > > tools/perf/util/evsel.h | 1 +
> > > tools/perf/util/perf_event_attr_fprintf.c | 1 +
> > > 3 files changed, 14 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > > index 05883a45de5b..816d930d774e 100644
> > > --- a/tools/perf/util/evsel.c
> > > +++ b/tools/perf/util/evsel.c
> > > @@ -712,7 +712,8 @@ static void __perf_evsel__config_callchain(struct evsel *evsel,
> > > attr->branch_sample_type = PERF_SAMPLE_BRANCH_USER |
> > > PERF_SAMPLE_BRANCH_CALL_STACK |
> > > PERF_SAMPLE_BRANCH_NO_CYCLES |
> > > - PERF_SAMPLE_BRANCH_NO_FLAGS;
> > > + PERF_SAMPLE_BRANCH_NO_FLAGS |
> > > + PERF_SAMPLE_BRANCH_HW_INDEX;
> > > }
> > > } else
> > > pr_warning("Cannot use LBR callstack with branch stack. "
> > > @@ -763,7 +764,8 @@ perf_evsel__reset_callgraph(struct evsel *evsel,
> > > if (param->record_mode == CALLCHAIN_LBR) {
> > > perf_evsel__reset_sample_bit(evsel, BRANCH_STACK);
> > > attr->branch_sample_type &= ~(PERF_SAMPLE_BRANCH_USER |
> > > - PERF_SAMPLE_BRANCH_CALL_STACK);
> > > + PERF_SAMPLE_BRANCH_CALL_STACK |
> > > + PERF_SAMPLE_BRANCH_HW_INDEX);
> > > }
> > > if (param->record_mode == CALLCHAIN_DWARF) {
> > > perf_evsel__reset_sample_bit(evsel, REGS_USER);
> > > @@ -1673,6 +1675,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> > > evsel->core.attr.ksymbol = 0;
> > > if (perf_missing_features.bpf)
> > > evsel->core.attr.bpf_event = 0;
> > > + if (perf_missing_features.branch_hw_idx)
> > > + evsel->core.attr.branch_sample_type &= ~PERF_SAMPLE_BRANCH_HW_INDEX;
> > > retry_sample_id:
> > > if (perf_missing_features.sample_id_all)
> > > evsel->core.attr.sample_id_all = 0;
> > > @@ -1784,7 +1788,12 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> > > * Must probe features in the order they were added to the
> > > * perf_event_attr interface.
> > > */
> > > - if (!perf_missing_features.aux_output && evsel->core.attr.aux_output) {
> > > + if (!perf_missing_features.branch_hw_idx &&
> > > + (evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX)) {
> > > + perf_missing_features.branch_hw_idx = true;
> > > + pr_debug2("switching off branch HW index support\n");
> > > + goto fallback_missing_features;
> > > + } else if (!perf_missing_features.aux_output && evsel->core.attr.aux_output) {
> > > perf_missing_features.aux_output = true;
> > > pr_debug2_peo("Kernel has no attr.aux_output support, bailing out\n");
> > > goto out_close;
> > > diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> > > index 99a0cb60c556..33804740e2ca 100644
> > > --- a/tools/perf/util/evsel.h
> > > +++ b/tools/perf/util/evsel.h
> > > @@ -119,6 +119,7 @@ struct perf_missing_features {
> > > bool ksymbol;
> > > bool bpf;
> > > bool aux_output;
> > > + bool branch_hw_idx;
> > > };
> > > extern struct perf_missing_features perf_missing_features;
> > > diff --git a/tools/perf/util/perf_event_attr_fprintf.c b/tools/perf/util/perf_event_attr_fprintf.c
> > > index 651203126c71..355d3458d4e6 100644
> > > --- a/tools/perf/util/perf_event_attr_fprintf.c
> > > +++ b/tools/perf/util/perf_event_attr_fprintf.c
> > > @@ -50,6 +50,7 @@ static void __p_branch_sample_type(char *buf, size_t size, u64 value)
> > > bit_name(ABORT_TX), bit_name(IN_TX), bit_name(NO_TX),
> > > bit_name(COND), bit_name(CALL_STACK), bit_name(IND_JUMP),
> > > bit_name(CALL), bit_name(NO_FLAGS), bit_name(NO_CYCLES),
> > > + bit_name(HW_INDEX),
> > > { .name = NULL, }
> > > };
> > > #undef bit_name
> > > --
> > > 2.17.1
> > >
> >
--
- Arnaldo
next prev parent reply other threads:[~2020-03-05 23:17 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-28 16:29 [PATCH 00/12] Stitch LBR call stack (Perf Tools) kan.liang
2020-02-28 16:30 ` [PATCH 01/12] perf tools: Add hw_idx in struct branch_stack kan.liang
2020-03-04 13:49 ` Arnaldo Carvalho de Melo
2020-03-04 15:45 ` Arnaldo Carvalho de Melo
2020-03-04 16:07 ` Liang, Kan
2020-03-19 14:10 ` [tip: perf/core] tools headers UAPI: Update tools's copy of linux/perf_event.h tip-bot2 for Arnaldo Carvalho de Melo
2020-03-10 0:42 ` [PATCH 01/12] perf tools: Add hw_idx in struct branch_stack Arnaldo Carvalho de Melo
2020-03-10 12:53 ` Liang, Kan
2020-03-19 14:10 ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-02-28 16:30 ` [PATCH 02/12] perf tools: Support PERF_SAMPLE_BRANCH_HW_INDEX kan.liang
2020-03-05 20:25 ` Arnaldo Carvalho de Melo
2020-03-05 21:02 ` Liang, Kan
2020-03-05 23:17 ` Arnaldo Carvalho de Melo [this message]
2020-03-19 14:10 ` [tip: perf/core] perf evsel: " tip-bot2 for Kan Liang
2020-02-28 16:30 ` [PATCH 03/12] perf header: Add check for event attr kan.liang
2020-03-19 14:10 ` [tip: perf/core] perf header: Add check for unexpected use of reserved membrs in " tip-bot2 for Kan Liang
2020-02-28 16:30 ` [PATCH 04/12] perf pmu: Add support for PMU capabilities kan.liang
2020-02-28 16:30 ` [PATCH 05/12] perf header: Support CPU " kan.liang
2020-02-28 16:30 ` [PATCH 06/12] perf machine: Refine the function for LBR call stack reconstruction kan.liang
2020-02-28 16:30 ` [PATCH 07/12] perf tools: Stitch LBR call stack kan.liang
2020-02-28 16:30 ` [PATCH 08/12] perf report: Add option to enable the LBR stitching approach kan.liang
2020-02-28 16:30 ` [PATCH 09/12] perf script: " kan.liang
2020-02-28 16:30 ` [PATCH 10/12] perf top: " kan.liang
2020-02-28 16:30 ` [PATCH 11/12] perf c2c: " kan.liang
2020-02-28 16:30 ` [PATCH 12/12] perf hist: Add fast path for duplicate entries check approach kan.liang
2020-03-04 13:33 ` [PATCH 00/12] Stitch LBR call stack (Perf Tools) Arnaldo Carvalho de Melo
2020-03-06 9:39 ` Jiri Olsa
2020-03-06 19:13 ` Liang, Kan
2020-03-06 20:06 ` Arnaldo Carvalho de Melo
2020-03-09 13:27 ` Arnaldo Carvalho de Melo
2020-03-09 13:42 ` Liang, Kan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200305231744.GB17483@kernel.org \
--to=arnaldo.melo@gmail.com \
--cc=adrian.hunter@intel.com \
--cc=ak@linux.intel.com \
--cc=alexey.budankov@linux.intel.com \
--cc=eranian@google.com \
--cc=jolsa@redhat.com \
--cc=kan.liang@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.poirier@linaro.org \
--cc=mingo@redhat.com \
--cc=mpe@ellerman.id.au \
--cc=namhyung@kernel.org \
--cc=pavel.gerasimov@intel.com \
--cc=peterz@infradead.org \
--cc=ravi.bangoria@linux.ibm.com \
--cc=vitaly.slobodskoy@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.