From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Ian Rogers <irogers@google.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Ahmad Yasin <ahmad.yasin@intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Stephane Eranian <eranian@google.com>,
Andi Kleen <ak@linux.intel.com>,
Perry Taylor <perry.taylor@intel.com>,
Samantha Alt <samantha.alt@intel.com>,
Caleb Biggers <caleb.biggers@intel.com>,
Weilin Wang <weilin.wang@intel.com>,
Edward Baker <edward.baker@intel.com>,
Mark Rutland <mark.rutland@arm.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
Adrian Hunter <adrian.hunter@intel.com>,
Florian Fischer <florian.fischer@muhq.space>,
Rob Herring <robh@kernel.org>,
Zhengjun Xing <zhengjun.xing@linux.intel.com>,
John Garry <john.g.garry@oracle.com>,
Kajol Jain <kjain@linux.ibm.com>,
Sumanth Korikkar <sumanthk@linux.ibm.com>,
Thomas Richter <tmricht@linux.ibm.com>,
Tiezhu Yang <yangtiezhu@loongson.cn>,
Ravi Bangoria <ravi.bangoria@amd.com>,
Leo Yan <leo.yan@linaro.org>,
Yang Jihong <yangjihong1@huawei.com>,
James Clark <james.clark@arm.com>,
Suzuki Poulouse <suzuki.poulose@arm.com>,
Kang Minchul <tegongkang@gmail.com>,
Athira Rajeev <atrajeev@linux.vnet.ibm.com>,
linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 03/46] perf stat: Introduce skippable evsels
Date: Mon, 1 May 2023 10:56:11 -0400 [thread overview]
Message-ID: <d6784858-2a5f-7920-f1ac-d7ec9ed89605@linux.intel.com> (raw)
In-Reply-To: <20230429053506.1962559-4-irogers@google.com>
On 2023-04-29 1:34 a.m., Ian Rogers wrote:
> Perf stat with no arguments will use default events and metrics. These
> events may fail to open even with kernel and hypervisor disabled. When
> these fail then the permissions error appears even though they were
> implicitly selected. This is particularly a problem with the automatic
> selection of the TopdownL1 metric group on certain architectures like
> Skylake:
>
> '''
> $ perf stat true
> Error:
> Access to performance monitoring and observability operations is limited.
> Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open
> access to performance monitoring and observability operations for processes
> without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability.
> More information can be found at 'Perf events and tool security' document:
> https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
> perf_event_paranoid setting is 2:
> -1: Allow use of (almost) all events by all users
> Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
>> = 0: Disallow raw and ftrace function tracepoint access
>> = 1: Disallow CPU event access
>> = 2: Disallow kernel profiling
> To make the adjusted perf_event_paranoid setting permanent preserve it
> in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)
> '''
>
> This patch adds skippable evsels that when they fail to open won't
> cause termination and will appear as "<not supported>" in output. The
> TopdownL1 events, from the metric group, are marked as skippable. This
> turns the failure above to:
>
> '''
> $ perf stat perf bench internals synthesize
> Computing performance of single threaded perf event synthesis by
> synthesizing events on the perf process itself:
> Average synthesis took: 49.287 usec (+- 0.083 usec)
> Average num. events: 3.000 (+- 0.000)
> Average time per event 16.429 usec
> Average data synthesis took: 49.641 usec (+- 0.085 usec)
> Average num. events: 11.000 (+- 0.000)
> Average time per event 4.513 usec
>
> Performance counter stats for 'perf bench internals synthesize':
>
> 1,222.38 msec task-clock:u # 0.993 CPUs utilized
> 0 context-switches:u # 0.000 /sec
> 0 cpu-migrations:u # 0.000 /sec
> 162 page-faults:u # 132.529 /sec
> 774,445,184 cycles:u # 0.634 GHz (49.61%)
> 1,640,969,811 instructions:u # 2.12 insn per cycle (59.67%)
> 302,052,148 branches:u # 247.102 M/sec (59.69%)
> 1,807,718 branch-misses:u # 0.60% of all branches (59.68%)
> 5,218,927 CPU_CLK_UNHALTED.REF_XCLK:u # 4.269 M/sec
> # 17.3 % tma_frontend_bound
> # 56.4 % tma_retiring
> # nan % tma_backend_bound
> # nan % tma_bad_speculation (60.01%)
> 536,580,469 IDQ_UOPS_NOT_DELIVERED.CORE:u # 438.965 M/sec (60.33%)
> <not supported> INT_MISC.RECOVERY_CYCLES_ANY:u
> 5,223,936 CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE:u # 4.274 M/sec (40.31%)
> 774,127,250 CPU_CLK_UNHALTED.THREAD:u # 633.297 M/sec (50.34%)
> 1,746,579,518 UOPS_RETIRED.RETIRE_SLOTS:u # 1.429 G/sec (50.12%)
> 1,940,625,702 UOPS_ISSUED.ANY:u # 1.588 G/sec (49.70%)
>
> 1.231055525 seconds time elapsed
>
> 0.258327000 seconds user
> 0.965749000 seconds sys
Which branch is this patch series based on?
I still cannot get the same output as the examples.
I'm using the latest perf-tools-next (The latest commit ID is
5d27a645f609 ("perf tracepoint: Fix memory leak in is_valid_tracepoint()")).
I only applied patch 2 and patch 3, since the patch 1 is already merged.
It's a single socket Cascade Lake. with kernel 5.19-8.
$ uname -r
5.19.8-100.fc35.x86_64
As you can see, all the topdown related events are displayed twice.
With root permission,
$ sudo ./perf stat perf bench internals synthesize
# Running 'internals/synthesize' benchmark:
Computing performance of single threaded perf event synthesis by
synthesizing events on the perf process itself:
Average synthesis took: 91.487 usec (+- 0.050 usec)
Average num. events: 47.000 (+- 0.000)
Average time per event 1.947 usec
Average data synthesis took: 97.720 usec (+- 0.059 usec)
Average num. events: 245.000 (+- 0.000)
Average time per event 0.399 usec
Performance counter stats for 'perf bench internals synthesize':
2,077.81 msec task-clock # 0.998 CPUs
utilized
466 context-switches # 224.274 /sec
4 cpu-migrations # 1.925 /sec
775 page-faults # 372.988 /sec
9,561,957,326 cycles # 4.602 GHz
(31.17%)
24,466,854,021 instructions # 2.56 insn
per cycle (37.42%)
5,547,892,196 branches # 2.670
G/sec (37.48%)
37,880,526 branch-misses # 0.68% of
all branches (37.52%)
49,576,109 CPU_CLK_UNHALTED.REF_XCLK # 23.860 M/sec
# 59.9 % tma_retiring
# 4.6 %
tma_bad_speculation (37.47%)
228,406,003 INT_MISC.RECOVERY_CYCLES_ANY # 109.926
M/sec (37.52%)
49,591,815 CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE # 23.867
M/sec (24.99%)
9,553,472,893 CPU_CLK_UNHALTED.THREAD # 4.598
G/sec (31.25%)
22,893,372,651 UOPS_RETIRED.RETIRE_SLOTS # 11.018
G/sec (31.23%)
24,180,375,299 UOPS_ISSUED.ANY # 11.637
G/sec (31.25%)
49,562,300 CPU_CLK_UNHALTED.REF_XCLK # 23.853 M/sec
# 28.1 %
tma_frontend_bound
# 7.2 %
tma_backend_bound (31.24%)
10,735,205,084 IDQ_UOPS_NOT_DELIVERED.CORE # 5.167
G/sec (31.30%)
228,798,426 INT_MISC.RECOVERY_CYCLES_ANY # 110.115
M/sec (25.04%)
49,559,962 CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE # 23.852
M/sec (25.00%)
9,538,354,333 CPU_CLK_UNHALTED.THREAD # 4.591
G/sec (31.29%)
24,207,967,071 UOPS_ISSUED.ANY # 11.651
G/sec (31.24%)
2.082670856 seconds time elapsed
0.812763000 seconds user
1.252387000 seconds sys
With non-root, nothing is counted for the topdownL1 events.
$ ./perf stat perf bench internals synthesize
# Running 'internals/synthesize' benchmark:
Computing performance of single threaded perf event synthesis by
synthesizing events on the perf process itself:
Average synthesis took: 91.852 usec (+- 0.139 usec)
Average num. events: 47.000 (+- 0.000)
Average time per event 1.954 usec
Average data synthesis took: 96.230 usec (+- 0.046 usec)
Average num. events: 245.000 (+- 0.000)
Average time per event 0.393 usec
Performance counter stats for 'perf bench internals synthesize':
2,051.95 msec task-clock:u # 0.997 CPUs
utilized
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
765 page-faults:u # 372.816 /sec
3,601,662,523 cycles:u # 1.755 GHz
(16.72%)
9,241,811,003 instructions:u # 2.57 insn
per cycle (33.43%)
2,238,848,485 branches:u # 1.091
G/sec (50.06%)
19,966,181 branch-misses:u # 0.89% of
all branches (66.77%)
<not counted> CPU_CLK_UNHALTED.REF_XCLK:u
<not supported> INT_MISC.RECOVERY_CYCLES_ANY:u
<not counted> CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE:u
<not counted> CPU_CLK_UNHALTED.THREAD:u
<not counted> UOPS_RETIRED.RETIRE_SLOTS:u
<not counted> UOPS_ISSUED.ANY:u
<not counted> CPU_CLK_UNHALTED.REF_XCLK:u
<not counted> IDQ_UOPS_NOT_DELIVERED.CORE:u
<not supported> INT_MISC.RECOVERY_CYCLES_ANY:u
<not counted> CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE:u
<not counted> CPU_CLK_UNHALTED.THREAD:u
<not counted> UOPS_ISSUED.ANY:u
2.057691297 seconds time elapsed
0.766640000 seconds user
1.275170000 seconds sys
Thanks,
Kan
next prev parent reply other threads:[~2023-05-01 14:56 UTC|newest]
Thread overview: 65+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-29 5:34 [PATCH v3 00/46] Fix perf on Intel hybrid CPUs Ian Rogers
2023-04-29 5:34 ` [PATCH v3 01/46] perf stat: Disable TopdownL1 on hybrid Ian Rogers
2023-04-29 5:34 ` [PATCH v3 02/46] perf metric: Change divide by zero and !support events behavior Ian Rogers
2023-04-29 5:34 ` [PATCH v3 03/46] perf stat: Introduce skippable evsels Ian Rogers
2023-05-01 14:56 ` Liang, Kan [this message]
2023-05-01 15:29 ` Ian Rogers
2023-05-01 20:25 ` Liang, Kan
2023-05-01 20:48 ` Ian Rogers
2023-05-01 23:34 ` Liang, Kan
2023-04-29 5:34 ` [PATCH v3 05/46] perf parse-events: Don't reorder ungrouped events by pmu Ian Rogers
2023-04-29 5:34 ` [PATCH v3 06/46] perf vendor events intel: Add alderlake metric constraints Ian Rogers
2023-04-29 5:34 ` [PATCH v3 07/46] perf vendor events intel: Add icelake " Ian Rogers
2023-04-29 5:34 ` [PATCH v3 08/46] perf vendor events intel: Add icelakex " Ian Rogers
2023-04-29 5:34 ` [PATCH v3 09/46] perf vendor events intel: Add sapphirerapids " Ian Rogers
2023-04-29 5:34 ` [PATCH v3 10/46] perf vendor events intel: Add tigerlake " Ian Rogers
2023-04-29 5:34 ` [PATCH v3 11/46] perf stat: Avoid segv on counter->name Ian Rogers
2023-04-29 5:34 ` [PATCH v3 12/46] perf test: Test more sysfs events Ian Rogers
2023-05-02 10:27 ` Ravi Bangoria
2023-05-02 15:16 ` Ian Rogers
2023-05-02 15:29 ` Ian Rogers
2023-04-29 5:34 ` [PATCH v3 13/46] perf test: Use valid for PMU tests Ian Rogers
2023-04-29 5:34 ` [PATCH v3 14/46] perf test: Mask config then test Ian Rogers
2023-05-02 10:44 ` Ravi Bangoria
2023-05-02 16:19 ` Ian Rogers
2023-04-29 5:34 ` [PATCH v3 15/46] perf test: Test more with config_cache Ian Rogers
2023-04-29 5:34 ` [PATCH v3 16/46] perf test: Roundtrip name, don't assume 1 event per name Ian Rogers
2023-04-29 5:34 ` [PATCH v3 17/46] perf parse-events: Set attr.type to PMU type early Ian Rogers
2023-04-29 5:34 ` [PATCH v3 18/46] perf parse-events: Set pmu_name whenever a pmu is given Ian Rogers
2023-04-29 5:34 ` [PATCH v3 19/46] perf print-events: Avoid unnecessary strlist Ian Rogers
2023-04-29 5:34 ` [PATCH v3 20/46] perf parse-events: Avoid scanning PMUs before parsing Ian Rogers
2023-04-29 5:34 ` [PATCH v3 21/46] perf evsel: Modify group pmu name for software events Ian Rogers
2023-04-29 5:34 ` [PATCH v3 22/46] perf test: Move x86 hybrid tests to arch/x86 Ian Rogers
2023-04-29 5:34 ` [PATCH v3 23/46] perf test x86 hybrid: Update test expectations Ian Rogers
2023-04-29 5:34 ` [PATCH v3 24/46] perf test x86 hybrid: Add hybrid extended type checks Ian Rogers
2023-04-29 5:34 ` [PATCH v3 25/46] perf parse-events: Support PMUs for legacy cache events Ian Rogers
2023-04-29 5:34 ` [PATCH v3 26/46] perf parse-events: Wildcard " Ian Rogers
2023-04-29 5:34 ` [PATCH v3 27/46] perf print-events: Print legacy cache events for each PMU Ian Rogers
2023-05-02 10:48 ` Ravi Bangoria
2023-05-02 17:40 ` Ian Rogers
2023-04-29 5:34 ` [PATCH v3 28/46] perf parse-events: Support wildcards on raw events Ian Rogers
2023-04-29 5:34 ` [PATCH v3 29/46] perf parse-events: Remove now unused hybrid logic Ian Rogers
2023-04-29 5:34 ` [PATCH v3 30/46] perf parse-events: Minor type safety cleanup Ian Rogers
2023-04-29 5:34 ` [PATCH v3 31/46] perf parse-events: Add pmu filter Ian Rogers
2023-04-29 5:34 ` [PATCH v3 32/46] perf stat: Make cputype filter generic Ian Rogers
2023-05-02 10:51 ` Ravi Bangoria
2023-05-02 20:09 ` Ian Rogers
2023-05-02 20:16 ` Ian Rogers
2023-04-29 5:34 ` [PATCH v3 33/46] perf test: Add cputype testing to perf stat Ian Rogers
2023-04-29 5:34 ` [PATCH v3 34/46] perf test: Fix parse-events tests for >1 core PMU Ian Rogers
2023-04-29 5:34 ` [PATCH v3 35/46] perf parse-events: Support hardware events as terms Ian Rogers
2023-05-02 10:55 ` Ravi Bangoria
2023-05-02 17:57 ` Ian Rogers
2023-04-29 5:34 ` [PATCH v3 36/46] perf parse-events: Avoid error when assigning a term Ian Rogers
2023-04-29 5:34 ` [PATCH v3 37/46] perf parse-events: Avoid error when assigning a legacy cache term Ian Rogers
2023-04-29 5:34 ` [PATCH v3 38/46] perf parse-events: Don't auto merge hybrid wildcard events Ian Rogers
2023-04-29 5:34 ` [PATCH v3 39/46] perf parse-events: Don't reorder atom cpu events Ian Rogers
2023-04-29 5:35 ` [PATCH v3 40/46] perf metrics: Be PMU specific for referenced metrics Ian Rogers
2023-04-29 5:35 ` [PATCH v3 41/46] perf stat: Command line PMU metric filtering Ian Rogers
2023-04-29 5:35 ` [PATCH v3 42/46] perf vendor events intel: Correct alderlake metrics Ian Rogers
2023-04-29 5:35 ` [PATCH v3 43/46] perf jevents: Don't rewrite metrics across PMUs Ian Rogers
2023-04-29 5:35 ` [PATCH v3 44/46] perf metrics: Be PMU specific in event match Ian Rogers
2023-04-29 5:35 ` [PATCH v3 45/46] perf stat: Don't disable TopdownL1 metric on hybrid Ian Rogers
2023-04-29 5:35 ` [PATCH v3 46/46] perf parse-events: Reduce scope of is_event_supported Ian Rogers
2023-05-01 20:34 ` [PATCH v3 00/46] Fix perf on Intel hybrid CPUs Liang, Kan
2023-05-01 20:51 ` Ian Rogers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d6784858-2a5f-7920-f1ac-d7ec9ed89605@linux.intel.com \
--to=kan.liang@linux.intel.com \
--cc=acme@kernel.org \
--cc=adrian.hunter@intel.com \
--cc=ahmad.yasin@intel.com \
--cc=ak@linux.intel.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=atrajeev@linux.vnet.ibm.com \
--cc=caleb.biggers@intel.com \
--cc=edward.baker@intel.com \
--cc=eranian@google.com \
--cc=florian.fischer@muhq.space \
--cc=irogers@google.com \
--cc=james.clark@arm.com \
--cc=john.g.garry@oracle.com \
--cc=jolsa@kernel.org \
--cc=kjain@linux.ibm.com \
--cc=leo.yan@linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=perry.taylor@intel.com \
--cc=peterz@infradead.org \
--cc=ravi.bangoria@amd.com \
--cc=robh@kernel.org \
--cc=samantha.alt@intel.com \
--cc=sumanthk@linux.ibm.com \
--cc=suzuki.poulose@arm.com \
--cc=tegongkang@gmail.com \
--cc=tmricht@linux.ibm.com \
--cc=weilin.wang@intel.com \
--cc=yangjihong1@huawei.com \
--cc=yangtiezhu@loongson.cn \
--cc=zhengjun.xing@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).