* [PATCH] perf topdown: Correct leader selection with sample_read enabled
@ 2024-06-14 21:39 Dapeng Mi
2024-06-27 15:11 ` Liang, Kan
0 siblings, 1 reply; 6+ messages in thread
From: Dapeng Mi @ 2024-06-14 21:39 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Kan Liang
Cc: linux-perf-users, linux-kernel, Dapeng Mi, Dapeng Mi
Addresses an issue where, in the absence of a topdown metrics event
within a sampling group, the slots event was incorrectly bypassed as
the sampling leader when sample_read was enabled.
perf record -e '{slots,branches}:S' -c 10000 -vv sleep 1
In this case, the slots event should be sampled as the leader, but as
the verbose output below shows, the branches event is sampled instead.
perf_event_attr:
type 4 (cpu)
size 168
config 0x400 (slots)
sample_type IP|TID|TIME|READ|CPU|IDENTIFIER
read_format ID|GROUP|LOST
disabled 1
sample_id_all 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 5
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
size 168
config 0x4 (PERF_COUNT_HW_BRANCH_INSTRUCTIONS)
{ sample_period, sample_freq } 10000
sample_type IP|TID|TIME|READ|CPU|IDENTIFIER
read_format ID|GROUP|LOST
sample_id_all 1
exclude_guest 1
The sample period of the slots event, rather than that of the branches
event, is reset to 0.
This fix ensures the slots event remains the leader under these
conditions.
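A quick way to sanity-check the fixed behaviour is to re-run the
reproducer with -vv and look at which attr dump carries the sample
period; the grep pattern below is only an illustration based on the
verbose output quoted above, not part of the patch:

perf record -e '{slots,branches}:S' -c 10000 -vv -o /dev/null sleep 1 2>&1 |
        grep -E 'config 0x4|sample_period'

With this fix applied, the sample_period line is expected to appear
under the slots event (config 0x400) rather than under the branches
event.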
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
tools/perf/arch/x86/util/topdown.c | 42 ++++++++++++++++++++++++++++--
1 file changed, 40 insertions(+), 2 deletions(-)
diff --git a/tools/perf/arch/x86/util/topdown.c b/tools/perf/arch/x86/util/topdown.c
index 3f9a267d4501..aea6896bbb57 100644
--- a/tools/perf/arch/x86/util/topdown.c
+++ b/tools/perf/arch/x86/util/topdown.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
#include "api/fs/fs.h"
#include "util/evsel.h"
+#include "util/evlist.h"
#include "util/pmu.h"
#include "util/pmus.h"
#include "util/topdown.h"
@@ -31,6 +32,32 @@ bool topdown_sys_has_perf_metrics(void)
return has_perf_metrics;
}
+static int perf_pmus__topdown_event(void *vstate, struct pmu_event_info *info)
+{
+ if (!strcmp(info->name, (char *)vstate))
+ return 1;
+
+ return 0;
+}
+
+static bool is_topdown_metric_event(struct evsel *event)
+{
+ struct perf_pmu *pmu;
+
+ if (!topdown_sys_has_perf_metrics())
+ return false;
+
+ if (event->core.attr.type != PERF_TYPE_RAW)
+ return false;
+
+ pmu = perf_pmus__find_by_type(PERF_TYPE_RAW);
+ if (pmu && perf_pmu__for_each_event(pmu, false, event->name,
+ perf_pmus__topdown_event))
+ return true;
+
+ return false;
+}
+
#define TOPDOWN_SLOTS 0x0400
/*
@@ -41,11 +68,22 @@ bool topdown_sys_has_perf_metrics(void)
*/
bool arch_topdown_sample_read(struct evsel *leader)
{
+ struct evsel *event;
+
if (!evsel__sys_has_perf_metrics(leader))
return false;
- if (leader->core.attr.config == TOPDOWN_SLOTS)
- return true;
+ if (leader->core.attr.config != TOPDOWN_SLOTS)
+ return false;
+
+ /*
+ * If slots event as leader event but no topdown metric events in group,
+ * slots event should still sample as leader.
+ */
+ evlist__for_each_entry(leader->evlist, event) {
+ if (event != leader && is_topdown_metric_event(event))
+ return true;
+ }
return false;
}
base-commit: 92e5605a199efbaee59fb19e15d6cc2103a04ec2
--
2.40.1
* Re: [PATCH] perf topdown: Correct leader selection with sample_read enabled
2024-06-14 21:39 [PATCH] perf topdown: Correct leader selection with sample_read enabled Dapeng Mi
@ 2024-06-27 15:11 ` Liang, Kan
2024-06-28 6:17 ` Mi, Dapeng
0 siblings, 1 reply; 6+ messages in thread
From: Liang, Kan @ 2024-06-27 15:11 UTC (permalink / raw)
To: Dapeng Mi, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin
Cc: linux-perf-users, linux-kernel, Dapeng Mi
Hi Dapeng,
On 2024-06-14 5:39 p.m., Dapeng Mi wrote:
> Addresses an issue where, in the absence of a topdown metrics event
> within a sampling group, the slots event was incorrectly bypassed as
> the sampling leader when sample_read was enabled.
>
> perf record -e '{slots,branches}:S' -c 10000 -vv sleep 1
>
> In this case, the slots event should be sampled as the leader, but as
> the verbose output below shows, the branches event is sampled instead.
>
> perf_event_attr:
> type 4 (cpu)
> size 168
> config 0x400 (slots)
> sample_type IP|TID|TIME|READ|CPU|IDENTIFIER
> read_format ID|GROUP|LOST
> disabled 1
> sample_id_all 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 5
> ------------------------------------------------------------
> perf_event_attr:
> type 0 (PERF_TYPE_HARDWARE)
> size 168
> config 0x4 (PERF_COUNT_HW_BRANCH_INSTRUCTIONS)
> { sample_period, sample_freq } 10000
> sample_type IP|TID|TIME|READ|CPU|IDENTIFIER
> read_format ID|GROUP|LOST
> sample_id_all 1
> exclude_guest 1
>
> The sample period of the slots event, rather than that of the branches
> event, is reset to 0.
>
> This fix ensures the slots event remains the leader under these
> conditions.
This should be just one of the issues with the slots/topdown related
sampling read.
If adding one more topdown event, the sampling read may still be broken.
perf record -e "{slots,instructions,topdown-retiring}:S" -C0 sleep 1
WARNING: events were regrouped to match PMUs
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument)
for event (topdown-retiring).
That may require Yanfei's patch.
https://lore.kernel.org/lkml/20240411144852.2507143-1-yanfei.xu@intel.com/
Please give it a try and summarize all the required patches for the
topdown sampling read feature.
Besides, we need a test for the sampling read as well.
Ian has provided a very good base. Please add a topdown sampling read
case on top of it as well.
https://lore.kernel.org/lkml/CAP-5=fUkg-cAXTb+3wbFOQCfdXgpQeZw40XHjfrNFbnBD=NMXg@mail.gmail.com/
Thanks,
Kan
>
> Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
> tools/perf/arch/x86/util/topdown.c | 42 ++++++++++++++++++++++++++++--
> 1 file changed, 40 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/arch/x86/util/topdown.c b/tools/perf/arch/x86/util/topdown.c
> index 3f9a267d4501..aea6896bbb57 100644
> --- a/tools/perf/arch/x86/util/topdown.c
> +++ b/tools/perf/arch/x86/util/topdown.c
> @@ -1,6 +1,7 @@
> // SPDX-License-Identifier: GPL-2.0
> #include "api/fs/fs.h"
> #include "util/evsel.h"
> +#include "util/evlist.h"
> #include "util/pmu.h"
> #include "util/pmus.h"
> #include "util/topdown.h"
> @@ -31,6 +32,32 @@ bool topdown_sys_has_perf_metrics(void)
> return has_perf_metrics;
> }
>
> +static int perf_pmus__topdown_event(void *vstate, struct pmu_event_info *info)
> +{
> + if (!strcmp(info->name, (char *)vstate))
> + return 1;
> +
> + return 0;
> +}
> +
> +static bool is_topdown_metric_event(struct evsel *event)
> +{
> + struct perf_pmu *pmu;
> +
> + if (!topdown_sys_has_perf_metrics())
> + return false;
> +
> + if (event->core.attr.type != PERF_TYPE_RAW)
> + return false;
> +
> + pmu = perf_pmus__find_by_type(PERF_TYPE_RAW);
> + if (pmu && perf_pmu__for_each_event(pmu, false, event->name,
> + perf_pmus__topdown_event))
> + return true;
> +
> + return false;
> +}
> +
> #define TOPDOWN_SLOTS 0x0400
>
> /*
> @@ -41,11 +68,22 @@ bool topdown_sys_has_perf_metrics(void)
> */
> bool arch_topdown_sample_read(struct evsel *leader)
> {
> + struct evsel *event;
> +
> if (!evsel__sys_has_perf_metrics(leader))
> return false;
>
> - if (leader->core.attr.config == TOPDOWN_SLOTS)
> - return true;
> + if (leader->core.attr.config != TOPDOWN_SLOTS)
> + return false;
> +
> + /*
> + * If slots event as leader event but no topdown metric events in group,
> + * slots event should still sample as leader.
> + */
> + evlist__for_each_entry(leader->evlist, event) {
> + if (event != leader && is_topdown_metric_event(event))
> + return true;
> + }
>
> return false;
> }
>
> base-commit: 92e5605a199efbaee59fb19e15d6cc2103a04ec2
* Re: [PATCH] perf topdown: Correct leader selection with sample_read enabled
2024-06-27 15:11 ` Liang, Kan
@ 2024-06-28 6:17 ` Mi, Dapeng
2024-06-28 18:28 ` Ian Rogers
0 siblings, 1 reply; 6+ messages in thread
From: Mi, Dapeng @ 2024-06-28 6:17 UTC (permalink / raw)
To: Liang, Kan, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin
Cc: linux-perf-users, linux-kernel, Dapeng Mi
On 6/27/2024 11:11 PM, Liang, Kan wrote:
> Hi Dapeng,
>
> On 2024-06-14 5:39 p.m., Dapeng Mi wrote:
>> Addresses an issue where, in the absence of a topdown metrics event
>> within a sampling group, the slots event was incorrectly bypassed as
>> the sampling leader when sample_read was enabled.
>>
>> perf record -e '{slots,branches}:S' -c 10000 -vv sleep 1
>>
>> In this case, the slots event should be sampled as the leader, but as
>> the verbose output below shows, the branches event is sampled instead.
>>
>> perf_event_attr:
>> type 4 (cpu)
>> size 168
>> config 0x400 (slots)
>> sample_type IP|TID|TIME|READ|CPU|IDENTIFIER
>> read_format ID|GROUP|LOST
>> disabled 1
>> sample_id_all 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 5
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 0 (PERF_TYPE_HARDWARE)
>> size 168
>> config 0x4 (PERF_COUNT_HW_BRANCH_INSTRUCTIONS)
>> { sample_period, sample_freq } 10000
>> sample_type IP|TID|TIME|READ|CPU|IDENTIFIER
>> read_format ID|GROUP|LOST
>> sample_id_all 1
>> exclude_guest 1
>>
>> The sample period of the slots event, rather than that of the branches
>> event, is reset to 0.
>>
>> This fix ensures the slots event remains the leader under these
>> conditions.
> This should be just one of the issues with the slots/topdown related
> sampling read.
>
> If adding one more topdown event, the sampling read may still be broken.
> perf record -e "{slots,instructions,topdown-retiring}:S" -C0 sleep 1
> WARNING: events were regrouped to match PMUs
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument)
> for event (topdown-retiring).
>
> That may require Yanfei's patch.
> https://lore.kernel.org/lkml/20240411144852.2507143-1-yanfei.xu@intel.com/
Yes, we need this patch. It would fix the error you see.
>
> Please give it a try and summarize all the required patches for the
> topdown sampling read feature.
I will talk with Yanfei and collect all the required patches into a single
patchset. This would make the patch review easier.
>
> Besides, we need a test for the sampling read as well.
> Ian has provided a very good base. Please add a topdown sampling read
> case on top of it as well.
> https://lore.kernel.org/lkml/CAP-5=fUkg-cAXTb+3wbFOQCfdXgpQeZw40XHjfrNFbnBD=NMXg@mail.gmail.com/
Sure. I would look at it and add a test case.
>
>
> Thanks,
> Kan
>
>> Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>> ---
>> tools/perf/arch/x86/util/topdown.c | 42 ++++++++++++++++++++++++++++--
>> 1 file changed, 40 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/arch/x86/util/topdown.c b/tools/perf/arch/x86/util/topdown.c
>> index 3f9a267d4501..aea6896bbb57 100644
>> --- a/tools/perf/arch/x86/util/topdown.c
>> +++ b/tools/perf/arch/x86/util/topdown.c
>> @@ -1,6 +1,7 @@
>> // SPDX-License-Identifier: GPL-2.0
>> #include "api/fs/fs.h"
>> #include "util/evsel.h"
>> +#include "util/evlist.h"
>> #include "util/pmu.h"
>> #include "util/pmus.h"
>> #include "util/topdown.h"
>> @@ -31,6 +32,32 @@ bool topdown_sys_has_perf_metrics(void)
>> return has_perf_metrics;
>> }
>>
>> +static int perf_pmus__topdown_event(void *vstate, struct pmu_event_info *info)
>> +{
>> + if (!strcmp(info->name, (char *)vstate))
>> + return 1;
>> +
>> + return 0;
>> +}
>> +
>> +static bool is_topdown_metric_event(struct evsel *event)
>> +{
>> + struct perf_pmu *pmu;
>> +
>> + if (!topdown_sys_has_perf_metrics())
>> + return false;
>> +
>> + if (event->core.attr.type != PERF_TYPE_RAW)
>> + return false;
>> +
>> + pmu = perf_pmus__find_by_type(PERF_TYPE_RAW);
>> + if (pmu && perf_pmu__for_each_event(pmu, false, event->name,
>> + perf_pmus__topdown_event))
>> + return true;
>> +
>> + return false;
>> +}
>> +
>> #define TOPDOWN_SLOTS 0x0400
>>
>> /*
>> @@ -41,11 +68,22 @@ bool topdown_sys_has_perf_metrics(void)
>> */
>> bool arch_topdown_sample_read(struct evsel *leader)
>> {
>> + struct evsel *event;
>> +
>> if (!evsel__sys_has_perf_metrics(leader))
>> return false;
>>
>> - if (leader->core.attr.config == TOPDOWN_SLOTS)
>> - return true;
>> + if (leader->core.attr.config != TOPDOWN_SLOTS)
>> + return false;
>> +
>> + /*
>> + * If slots event as leader event but no topdown metric events in group,
>> + * slots event should still sample as leader.
>> + */
>> + evlist__for_each_entry(leader->evlist, event) {
>> + if (event != leader && is_topdown_metric_event(event))
>> + return true;
>> + }
>>
>> return false;
>> }
>>
>> base-commit: 92e5605a199efbaee59fb19e15d6cc2103a04ec2
* Re: [PATCH] perf topdown: Correct leader selection with sample_read enabled
2024-06-28 6:17 ` Mi, Dapeng
@ 2024-06-28 18:28 ` Ian Rogers
2024-06-28 20:27 ` Liang, Kan
0 siblings, 1 reply; 6+ messages in thread
From: Ian Rogers @ 2024-06-28 18:28 UTC (permalink / raw)
To: Mi, Dapeng
Cc: Liang, Kan, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Adrian Hunter, Alexander Shishkin, linux-perf-users,
linux-kernel, Dapeng Mi
On Thu, Jun 27, 2024 at 11:17 PM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
> On 6/27/2024 11:11 PM, Liang, Kan wrote:
> > On 2024-06-14 5:39 p.m., Dapeng Mi wrote:
> >
> > Besides, we need a test for the sampling read as well.
> > Ian has provided a very good base. Please add a topdown sampling read
> > case on top of it as well.
> > https://lore.kernel.org/lkml/CAP-5=fUkg-cAXTb+3wbFOQCfdXgpQeZw40XHjfrNFbnBD=NMXg@mail.gmail.com/
>
> Sure. I would look at it and add a test case.
Thanks Dapeng and thanks Kan too! I wonder if we can do a regular
counter and a leader sample counter and then check that the counts are
reasonably consistent. Something like this:
```
$ perf stat -e instructions perf test -w noploop
Performance counter stats for '/tmp/perf/perf test -w noploop':
25,779,785,496 instructions
1.008047696 seconds time elapsed
1.003754000 seconds user
0.003999000 seconds sys
```
```
cat << "_end_of_file_" > a.py
last_count = None
def process_event(param_dict):
if ("ev_name" in param_dict and "sample" in param_dict and
param_dict["ev_name"] == "instructions"):
sample = param_dict["sample"]
if "values" in sample:
global last_count
last_count = sample["values"][1][1]
def trace_end():
global last_count
print(last_count)
_end_of_file_
$ sudo perf record -o - -e "{cycles,instructions}:S" perf test -w
noploop|perf script -i - -s ./a.py
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.459 MB - ]
22195356100
```
I didn't see a simpler way to get count and I don't think it is right.
There's some similar perf script checking of data in
tools/perf/tests/shell/test_intel_pt.sh.
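For what it's worth, here is a rough sketch of the comparison step
described above, assuming the a.py helper from the previous block sits
in the current directory; the 10% tolerance is an arbitrary choice for
illustration, not an existing test:

```
# Count from a plain counter (perf stat CSV output goes to stderr).
stat_count=$(perf stat -x, -e instructions perf test -w noploop 2>&1 |
             awk -F, '/instructions/ { print $1 }')
# Count from the leader-sampling group, extracted by the a.py script.
samp_count=$(perf record -o - -e "{cycles,instructions}:S" perf test -w noploop 2>/dev/null |
             perf script -i - -s ./a.py)
awk -v a="$stat_count" -v b="$samp_count" 'BEGIN {
    d = (a > b) ? a - b : b - a    # absolute difference of the two counts
    exit (d > 0.1 * a) ? 1 : 0     # non-zero exit if they diverge by more than 10%
}'
```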
Thanks,
Ian
* Re: [PATCH] perf topdown: Correct leader selection with sample_read enabled
2024-06-28 18:28 ` Ian Rogers
@ 2024-06-28 20:27 ` Liang, Kan
2024-07-01 9:51 ` Mi, Dapeng
0 siblings, 1 reply; 6+ messages in thread
From: Liang, Kan @ 2024-06-28 20:27 UTC (permalink / raw)
To: Ian Rogers, Mi, Dapeng
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Adrian Hunter, Alexander Shishkin, linux-perf-users,
linux-kernel, Dapeng Mi
On 2024-06-28 2:28 p.m., Ian Rogers wrote:
> On Thu, Jun 27, 2024 at 11:17 PM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>> On 6/27/2024 11:11 PM, Liang, Kan wrote:
>>> On 2024-06-14 5:39 p.m., Dapeng Mi wrote:
>>>
>>> Besides, we need a test for the sampling read as well.
>>> Ian has provided a very good base. Please add a topdown sampling read
>>> case on top of it as well.
>>> https://lore.kernel.org/lkml/CAP-5=fUkg-cAXTb+3wbFOQCfdXgpQeZw40XHjfrNFbnBD=NMXg@mail.gmail.com/
>>
>> Sure. I would look at it and add a test case.
>
> Thanks Dapeng and thanks Kan too! I wonder if we can do a regular
> counter and a leader sample counter and then check that the counts are
> reasonably consistent. Something like this:
>
> ```
> $ perf stat -e instructions perf test -w noploop
>
> Performance counter stats for '/tmp/perf/perf test -w noploop':
>
> 25,779,785,496 instructions
>
> 1.008047696 seconds time elapsed
>
> 1.003754000 seconds user
> 0.003999000 seconds sys
> ```
>
> ```
> cat << "_end_of_file_" > a.py
> last_count = None
>
> def process_event(param_dict):
>     if ("ev_name" in param_dict and "sample" in param_dict and
>             param_dict["ev_name"] == "instructions"):
>         sample = param_dict["sample"]
>         if "values" in sample:
>             global last_count
>             last_count = sample["values"][1][1]
>
> def trace_end():
>     global last_count
>     print(last_count)
> _end_of_file_
> $ sudo perf record -o - -e "{cycles,instructions}:S" perf test -w noploop|perf script -i - -s ./a.py
> [ perf record: Woken up 2 times to write data ]
> [ perf record: Captured and wrote 0.459 MB - ]
> 22195356100
> ```
>
> I didn't see a simpler way to get count and I don't think it is right.
The perf stat can cover the whole life cycle of a workload. But I think
the result of perf record can only give the sum from the beginning to
the last sample.
There are some differences.
> There's some similar perf script checking of data in
> tools/perf/tests/shell/test_intel_pt.sh.
>
I think the case should be to test the output of the perf script, rather
than verify the accuracy of an event.
If so, we may run two same events. They should show the exact same
results in a sample.
For example,
perf record -e "{branches,branches}:Su" -c 1000000 ./perf test -w brstack
perf script
perf 752598 349300.123884: 1000002 branches: 7f18676a875a do_lookup_x+0x2fa (/usr/lib64/l>
perf 752598 349300.123884: 1000002 branches: 7f18676a875a do_lookup_x+0x2fa (/usr/lib64/l>
perf 752598 349300.124854: 1000005 branches: 7f18676a90b6 _dl_lookup_symbol_x+0x56 (/usr/>
perf 752598 349300.124854: 1000005 branches: 7f18676a90b6 _dl_lookup_symbol_x+0x56 (/usr/>
perf 752598 349300.125914: 999998 branches: 7f18676a8556 do_lookup_x+0xf6 (/usr/lib64/ld>
perf 752598 349300.125914: 999998 branches: 7f18676a8556 do_lookup_x+0xf6 (/usr/lib64/ld>
perf 752598 349300.127401: 1000009 branches: 4c1adf brstack_bench+0x15 (/home/kan/o>
perf 752598 349300.127401: 1000009 branches: 4c1adf brstack_bench+0x15 (/home/kan/o>
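A minimal sketch of how that pairwise check could be scripted on top of
the output above; the awk field index ($4 for the period) assumes the
default perf script layout shown here, and sr.data is just an
illustrative file name, not part of any existing test:

perf record -o sr.data -e "{branches,branches}:Su" -c 1000000 perf test -w brstack
perf script -i sr.data | awk '
    { period[NR % 2] = $4 }                 # remember the period of each line in a pair
    NR % 2 == 0 && period[0] != period[1] {
        print "period mismatch around line " NR; bad = 1
    }
    END { exit bad + 0 }                    # non-zero exit if any pair disagreed
'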
Thanks,
Kan
> Thanks,
> Ian
>
* Re: [PATCH] perf topdown: Correct leader selection with sample_read enabled
2024-06-28 20:27 ` Liang, Kan
@ 2024-07-01 9:51 ` Mi, Dapeng
0 siblings, 0 replies; 6+ messages in thread
From: Mi, Dapeng @ 2024-07-01 9:51 UTC (permalink / raw)
To: Liang, Kan, Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Adrian Hunter, Alexander Shishkin, linux-perf-users,
linux-kernel, Dapeng Mi
On 6/29/2024 4:27 AM, Liang, Kan wrote:
>
> On 2024-06-28 2:28 p.m., Ian Rogers wrote:
>> On Thu, Jun 27, 2024 at 11:17 PM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>>> On 6/27/2024 11:11 PM, Liang, Kan wrote:
>>>> On 2024-06-14 5:39 p.m., Dapeng Mi wrote:
>>>>
>>>> Besides, we need a test for the sampling read as well.
>>>> Ian has provided a very good base. Please add a topdown sampling read
>>>> case on top of it as well.
>>>> https://lore.kernel.org/lkml/CAP-5=fUkg-cAXTb+3wbFOQCfdXgpQeZw40XHjfrNFbnBD=NMXg@mail.gmail.com/
>>> Sure. I would look at it and add a test case.
>> Thanks Dapeng and thanks Kan too! I wonder if we can do a regular
>> counter and a leader sample counter and then check that the counts are
>> reasonably consistent. Something like this:
>>
>> ```
>> $ perf stat -e instructions perf test -w noploop
>>
>> Performance counter stats for '/tmp/perf/perf test -w noploop':
>>
>> 25,779,785,496 instructions
>>
>> 1.008047696 seconds time elapsed
>>
>> 1.003754000 seconds user
>> 0.003999000 seconds sys
>> ```
>>
>> ```
>> cat << "_end_of_file_" > a.py
>> last_count = None
>>
>> def process_event(param_dict):
>>     if ("ev_name" in param_dict and "sample" in param_dict and
>>             param_dict["ev_name"] == "instructions"):
>>         sample = param_dict["sample"]
>>         if "values" in sample:
>>             global last_count
>>             last_count = sample["values"][1][1]
>>
>> def trace_end():
>>     global last_count
>>     print(last_count)
>> _end_of_file_
>> $ sudo perf record -o - -e "{cycles,instructions}:S" perf test -w noploop|perf script -i - -s ./a.py
>> [ perf record: Woken up 2 times to write data ]
>> [ perf record: Captured and wrote 0.459 MB - ]
>> 22195356100
>> ```
>>
>> I didn't see a simpler way to get count and I don't think it is right.
> The perf stat can cover the whole life cycle of a workload. But I think
> the result of perf record can only give the sum from the beginning to
> the last sample.
> There are some differences.
>
>> There's some similar perf script checking of data in
>> tools/perf/tests/shell/test_intel_pt.sh.
>>
> I think the case should be to test the output of the perf script, rather
> than verify the accuracy of an event.
>
> If so, we may run two same events. They should show the exact same
> results in a sample.
>
> For example,
>
> perf record -e "{branches,branches}:Su" -c 1000000 ./perf test -w brstack
> perf script
> perf 752598 349300.123884: 1000002 branches: 7f18676a875a do_lookup_x+0x2fa (/usr/lib64/l>
> perf 752598 349300.123884: 1000002 branches: 7f18676a875a do_lookup_x+0x2fa (/usr/lib64/l>
> perf 752598 349300.124854: 1000005 branches: 7f18676a90b6 _dl_lookup_symbol_x+0x56 (/usr/>
> perf 752598 349300.124854: 1000005 branches: 7f18676a90b6 _dl_lookup_symbol_x+0x56 (/usr/>
> perf 752598 349300.125914: 999998 branches: 7f18676a8556 do_lookup_x+0xf6 (/usr/lib64/ld>
> perf 752598 349300.125914: 999998 branches: 7f18676a8556 do_lookup_x+0xf6 (/usr/lib64/ld>
> perf 752598 349300.127401: 1000009 branches: 4c1adf brstack_bench+0x15 (/home/kan/o>
> perf 752598 349300.127401: 1000009 branches: 4c1adf brstack_bench+0x15 (/home/kan/o>
This looks like a more accurate validation. I will add this test.
>
> Thanks,
> Kan
>
>> Thanks,
>> Ian
>>