linux-perf-users.vger.kernel.org archive mirror
* Lack of aarch64 checks for perf events schedulability
@ 2016-05-13 15:33 William Cohen
  2016-05-13 17:58 ` [Perfapi-devel] " Vince Weaver
  0 siblings, 1 reply; 5+ messages in thread
From: William Cohen @ 2016-05-13 15:33 UTC (permalink / raw)
  To: linux-perf-users, PAPI Developers, Michael Petlan

Hi,

When running the PAPI testsuite on RHEL for aarch64, Michael Petlan
found that the overflow_allcounters test was failing.  I investigated
and it looks like the RHEL for aarch64 Linux kernel perf support
suffers from a problem similar to MIPS kernels, where perf_event_open
doesn't properly check that events can be scheduled together; a
later read of the counters then fails.  This has been observed on the
RHEL for aarch64 4.5.0-based kernel.  I have not tried this on the
latest kernel, so I don't know whether newer kernels still have the
problem.

As a workaround, PAPI's check for MIPS can be extended to aarch64, as
in the attached patch.  However, it would be better if the
perf_event_open scheduling checks worked properly on aarch64.

-Will Cohen


[-- Attachment #2: papi_sched_aarch64.patch --]
[-- Type: text/x-patch, Size: 743 bytes --]

diff -up papi-5.2.0/src/components/perf_event/perf_event.c.sched_aarch64 papi-5.2.0/src/components/perf_event/perf_event.c
--- papi-5.2.0/src/components/perf_event/perf_event.c.sched_aarch64	2016-05-13 11:07:29.509101953 -0400
+++ papi-5.2.0/src/components/perf_event/perf_event.c	2016-05-13 11:08:16.520550544 -0400
@@ -159,8 +159,8 @@ bug_check_scheduability(void) {
 
 #if defined(__powerpc__)
   /* PowerPC not affected by this bug */
-#elif defined(__mips__)
-  /* MIPS as of kernel 3.1 does not properly detect schedulability */
+#elif defined(__mips__) || defined(__aarch64__)
+  /* MIPS and aarch64 kernels do not properly detect schedulability */
   return 1;
 #else
   if (_papi_os_info.os_version < LINUX_VERSION(2,6,33)) return 1;
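
For context, here is a minimal sketch of the kind of run-time probe a
library needs on kernels where perf_event_open accepts a group it cannot
schedule: enable the group leader, disable it again, and see whether
read() returns any data.  It assumes the group leader was opened with
attr.pinned = 1; per perf_event_open(2), a pinned event that cannot be
placed on the PMU enters an error state in which read() returns 0 bytes.
The name probe_group_schedulable is hypothetical and this is not PAPI's
actual code.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/perf_event.h>

/* Hypothetical helper for illustration only.
 * Returns  1 if the read produced data (group was scheduled),
 *          0 if the read returned 0 bytes (treated as "not schedulable"),
 *         -1 if ioctl() or read() reported a system error. */
int probe_group_schedulable(int group_leader_fd)
{
    uint64_t buf[64];   /* large enough for a small PERF_FORMAT_GROUP read */
    ssize_t n;

    /* Start and stop the group so the kernel has to try scheduling it. */
    if (ioctl(group_leader_fd, PERF_EVENT_IOC_ENABLE, 0) == -1)
        return -1;
    if (ioctl(group_leader_fd, PERF_EVENT_IOC_DISABLE, 0) == -1)
        return -1;

    /* A pinned leader in the error state yields end-of-file here. */
    n = read(group_leader_fd, buf, sizeof(buf));
    if (n == -1)
        return -1;
    return n == 0 ? 0 : 1;
}
```

A caller would treat a return of 0 as an event conflict and refuse to
add the event, rather than discovering the problem at the first real
read.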


* Re: [Perfapi-devel] Lack of aarch64 checks for perf events schedulability
  2016-05-13 15:33 Lack of aarch64 checks for perf events schedulability William Cohen
@ 2016-05-13 17:58 ` Vince Weaver
  2016-05-13 18:28   ` William Cohen
  0 siblings, 1 reply; 5+ messages in thread
From: Vince Weaver @ 2016-05-13 17:58 UTC (permalink / raw)
  To: William Cohen; +Cc: linux-perf-users, PAPI Developers, Michael Petlan

On Fri, 13 May 2016, William Cohen wrote:

> When running the PAPI testsuite on RHEL for aarch64 Michael Petlan
> found that the test overflow_allcounters was failing.  I investigated
> and it looks like the RHEL for aarch64 linux kernel perf support
> suffers from a problem similar to MIPS kernels where perf_event_open
> doesn't properly check that events can be scheduled together; then a
> later read of the counters will fail.  This has been observed on the
> RHEL for aarch64 4.5.0 based kernel. I have not tried this on the
> latest kernel, so I don't know if this is still a problem with newer
> kernels.

Let me see, yes I can reproduce this on my arm64 dragonboard running Linux 
3.16.

Your fix does fix things in that the test runs but it fails for other 
reasons at the validation stage.

Only 6 out of 7 counters are used, but it's picking a weird set of 
counters to use for the test.  I forget what overflow_allcounters actually
does.

I doubt it's worth the trouble of trying to get this fixed at the Linux 
level but it would be interesting to see why the 7th counter can't be 
scheduled.

Vince
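
One way to look at the seventh counter outside of PAPI would be a small
reproducer that opens N hardware events as a single group and then
checks whether the group can actually be read.  The sketch below only
covers the opening step.  The function names are hypothetical, the
choice of PERF_COUNT_HW_INSTRUCTIONS for every event is an assumption
made to keep the example short, and on the affected kernels a successful
open by itself says nothing about schedulability.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/perf_event.h>

/* glibc has no perf_event_open() wrapper, so call the syscall directly. */
long sys_perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
                         int group_fd, unsigned long flags)
{
    return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Hypothetical helper: open n instruction-count events on the calling
 * thread as one group.  fds[0] becomes the group leader.  Returns the
 * number of events opened, which is less than n if an open fails.
 * The caller is responsible for closing fds[0..return-1]. */
int open_instruction_group(int n, int *fds)
{
    int i;

    for (i = 0; i < n; i++) {
        struct perf_event_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.type = PERF_TYPE_HARDWARE;
        attr.size = sizeof(attr);
        attr.config = PERF_COUNT_HW_INSTRUCTIONS;
        attr.read_format = PERF_FORMAT_GROUP;
        attr.exclude_kernel = 1;    /* friendlier to unprivileged users */
        attr.exclude_hv = 1;
        attr.disabled = (i == 0);   /* siblings follow the leader's state */
        attr.pinned = (i == 0);     /* pinned is valid on leaders only */

        fds[i] = (int)sys_perf_event_open(&attr, 0, -1,
                                          i == 0 ? -1 : fds[0], 0);
        if (fds[i] == -1)
            break;
    }
    return i;
}
```

Running this for N = 1 up to the advertised counter count, and then
enabling, disabling and reading the leader each time, should show the
largest group the PMU really accepts.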


* Re: [Perfapi-devel] Lack of aarch64 checks for perf events schedulability
  2016-05-13 17:58 ` [Perfapi-devel] " Vince Weaver
@ 2016-05-13 18:28   ` William Cohen
  2016-05-13 19:02     ` Vince Weaver
  0 siblings, 1 reply; 5+ messages in thread
From: William Cohen @ 2016-05-13 18:28 UTC (permalink / raw)
  To: Vince Weaver; +Cc: linux-perf-users, PAPI Developers, Michael Petlan

On 05/13/2016 01:58 PM, Vince Weaver wrote:
> On Fri, 13 May 2016, William Cohen wrote:
> 
>> When running the PAPI testsuite on RHEL for aarch64 Michael Petlan
>> found that the test overflow_allcounters was failing.  I investigated
>> and it looks like the RHEL for aarch64 linux kernel perf support
>> suffers from a problem similar to MIPS kernels where perf_event_open
>> doesn't properly check that events can be scheduled together; then a
>> later read of the counters will fail.  This has been observed on the
>> RHEL for aarch64 4.5.0 based kernel. I have not tried this on the
>> latest kernel, so I don't know if this is still a problem with newer
>> kernels.
> 
> Let me see, yes I can reproduce this on my arm64 dragonboard running Linux 
> 3.16.

Hi Vince,

Thanks for trying it out on the dragonboard.  I have a dragonboard 410c at home also and will give that a try too.

> 
> Your fix does fix things in that the test runs but it fails for other 
> reasons at the validation stage.

I saw the test fail validation at times because the number of samples wasn't exactly the expected count.

> 
> Only 6 out of 7 counters are used, but it's picking a weird set of 
> counters to use for the test.  I forget what overflow_allcounters actually
> does.
> 
> I doubt it's worth the trouble of trying to get this fixed at the Linux 
> level but it would be interesting to see why the 7th counter can't be 
> scheduled.
> 
> Vince
> 

It still is worth it to check whether that silent failure behavior of perf_event_open was expected.

According to the Cortex-A53 manual, there are six general-purpose counters and one cycle counter.  Depending on how the event selection is done, it might not be able to use that cycle counter.  It might also be taking a counter for a watchdog timer.  The X-Gene machine I am using reports in /var/log/messages:

hw perfevents: enabled with armv8_pmuv3 PMU driver, 5 counters available

How many counters does the dragonboard kernel claim the machine has?

-Will
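
To compare machines without eyeballing the logs, the counter count can
be pulled out of that "hw perfevents" line.  This is a minimal sketch;
the function name is hypothetical, and the assumed message format is
simply the one quoted above, which may differ on other kernel versions.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract N from a kernel log line of the form
 *   "... PMU driver, N counters available"
 * Returns N, or -1 if the line does not match that format. */
int pmu_counters_from_log_line(const char *line)
{
    const char *p;
    int count = 0;
    int consumed = 0;

    if (line == NULL)
        return -1;

    p = strstr(line, "PMU driver, ");
    if (p == NULL)
        return -1;

    /* %n is only stored if the whole pattern matched, so consumed stays
     * 0 when the text after the number is not " counters available". */
    if (sscanf(p, "PMU driver, %d counters available%n",
               &count, &consumed) != 1 || consumed == 0 || count < 0)
        return -1;

    return count;
}
```

Fed the X-Gene line above, this returns 5.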


* Re: [Perfapi-devel] Lack of aarch64 checks for perf events schedulability
  2016-05-13 18:28   ` William Cohen
@ 2016-05-13 19:02     ` Vince Weaver
  2016-05-14  0:22       ` William Cohen
  0 siblings, 1 reply; 5+ messages in thread
From: Vince Weaver @ 2016-05-13 19:02 UTC (permalink / raw)
  To: William Cohen
  Cc: Vince Weaver, linux-perf-users, PAPI Developers, Michael Petlan

On Fri, 13 May 2016, William Cohen wrote:

> It still is worth it to check whether that silent failure behavior of 
> perf_event_open was expected.
> 
> According to the cortex a53 manual there are six general purpose 
> counters and one cycle counter.  Depending on how the event selection is 
> done it might not be able to use that cycle counter.  It might also be 
> taking a counter for a watchdog timer.  The Xgene machine I am using 
> states in /var/log/messages:
> 
> hw perfevents: enabled with armv8_pmuv3 PMU driver, 5 counters available
> 
> How many counters does the dragonboard kernel claim the machine has?

My dragonboard reports:

hw perfevents: enabled with arm/armv8-pmuv3 PMU driver, 7 counters available

and my jetson-tx1 reports:

hw perfevents: enabled with arm/armv8-pmuv3 PMU driver, 7 counters available


Neither seems to have the NMI watchdog enabled, or at least it's not set 
in /proc/sys/kernel/nmi_watchdog

Vince
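
For anyone repeating the watchdog check from a program, a minimal
sketch follows.  The helper name is hypothetical.  It assumes the file
holds a single integer, as /proc/sys/kernel/nmi_watchdog does where it
exists; the file may be absent on some kernel configurations, and that
case is reported as an error rather than as "disabled".

```c
#include <stdio.h>

/* Hypothetical helper: read one integer from a procfs/sysctl file.
 * Returns 0 and stores the number in *value on success, or -1 if the
 * file cannot be opened or does not start with an integer. */
int read_sysctl_int(const char *path, int *value)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (f == NULL)
        return -1;

    ok = (fscanf(f, "%d", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

Calling read_sysctl_int("/proc/sys/kernel/nmi_watchdog", &v) and
getting 0 with v == 1 would mean the NMI watchdog is enabled and may be
holding a hardware counter.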


* Re: [Perfapi-devel] Lack of aarch64 checks for perf events schedulability
  2016-05-13 19:02     ` Vince Weaver
@ 2016-05-14  0:22       ` William Cohen
  0 siblings, 0 replies; 5+ messages in thread
From: William Cohen @ 2016-05-14  0:22 UTC (permalink / raw)
  To: Vince Weaver; +Cc: linux-perf-users, PAPI Developers, Michael Petlan

On 05/13/2016 03:02 PM, Vince Weaver wrote:
> On Fri, 13 May 2016, William Cohen wrote:
> 
>> It still is worth it to check whether that silent failure behavior of 
>> perf_event_open was expected.
>>
>> According to the cortex a53 manual there are six general purpose 
>> counters and one cycle counter.  Depending on how the event selection is 
>> done it might not be able to use that cycle counter.  It might also be 
>> taking a counter for a watchdog timer.  The Xgene machine I am using 
>> states in /var/log/messages:
>>
>> hw perfevents: enabled with armv8_pmuv3 PMU driver, 5 counters available
>>
>> How many counters does the dragonboard kernel claim the machine has?
> 
> My dragonboard reports:
> 
> hw perfevents: enabled with arm/armv8-pmuv3 PMU driver, 7 counters available
> 
> and my jetson-tx1 reports:
> 
> hw perfevents: enabled with arm/armv8-pmuv3 PMU driver, 7 counters available
> 
> 
> Neither seems to have the NMI watchdog enabled, or at least it's not set 
> in /proc/sys/kernel/nmi_watchdog
> 
> Vince


Hi Vince,

I tried running on the current version of software that I have on my
dragonboard 410c and I see the same test failure for
overflow_allcounters.

-Will

from /var/log/messages:

 hw perfevents: enabled with armv8_pmuv3 PMU driver, 7 counters available

$ uname -a                      
Linux linaro-alip 4.4.0-linaro-lt-qcom #1 SMP PREEMPT Sat Feb 27 04:55:16 UTC 2 
$ ./ctests/overflow_allcounters >& out.log
$ more out.log
PAPI Error: Error! short read.
System error in PAPI_stop: Invalid argument
Trying to fill 6 hardware counters...
0x4000000f (perf::PERF_COUNT_HW_CPU_CYCLES:period=0) can't be added to the EventSet.
0x40000010 (perf::PERF_COUNT_HW_CPU_CYCLES:freq=0) can't be added to the EventSet.
event_code[0] = 0x40000011 (perf::PERF_COUNT_HW_CPU_CYCLES:precise=0)
event_code[1] = 0x40000012 (perf::PERF_COUNT_HW_CPU_CYCLES:excl=0)
0x40000013 (perf::PERF_COUNT_HW_CPU_CYCLES:mg=0) can't be added to the EventSet.
0x40000014 (perf::PERF_COUNT_HW_CPU_CYCLES:mh=0) can't be added to the EventSet.
0x40000015 (perf::PERF_COUNT_HW_CPU_CYCLES:cpu=0) can't be added to the EventSet.
0x40000016 (perf::PERF_COUNT_HW_CPU_CYCLES:pinned=0) can't be added to the EventSet.
0x40000018 (perf::CYCLES:period=0) can't be added to the EventSet.
0x40000019 (perf::CYCLES:freq=0) can't be added to the EventSet.
0x4000001a (perf::CYCLES:precise=0) can't be added to the EventSet.
0x4000001b (perf::CYCLES:excl=0) can't be added to the EventSet.
0x4000001c (perf::CYCLES:mg=0) can't be added to the EventSet.
0x4000001d (perf::CYCLES:mh=0) can't be added to the EventSet.
0x4000001e (perf::CYCLES:cpu=0) can't be added to the EventSet.
0x4000001f (perf::CYCLES:pinned=0) can't be added to the EventSet.
0x40000021 (perf::CPU-CYCLES:period=0) can't be added to the EventSet.
0x40000022 (perf::CPU-CYCLES:freq=0) can't be added to the EventSet.
0x40000023 (perf::CPU-CYCLES:precise=0) can't be added to the EventSet.
0x40000024 (perf::CPU-CYCLES:excl=0) can't be added to the EventSet.
0x40000025 (perf::CPU-CYCLES:mg=0) can't be added to the EventSet.
0x40000026 (perf::CPU-CYCLES:mh=0) can't be added to the EventSet.
0x40000027 (perf::CPU-CYCLES:cpu=0) can't be added to the EventSet.
0x40000028 (perf::CPU-CYCLES:pinned=0) can't be added to the EventSet.
0x4000002a (perf::PERF_COUNT_HW_INSTRUCTIONS:period=0) can't be added to the EventSet.
0x4000002b (perf::PERF_COUNT_HW_INSTRUCTIONS:freq=0) can't be added to the EventSet.
event_code[2] = 0x4000002c (perf::PERF_COUNT_HW_INSTRUCTIONS:precise=0)
event_code[3] = 0x4000002d (perf::PERF_COUNT_HW_INSTRUCTIONS:excl=0)
event_code[4] = 0x4000002e (perf::PERF_COUNT_HW_INSTRUCTIONS:mg=0)
event_code[5] = 0x4000002f (perf::PERF_COUNT_HW_INSTRUCTIONS:mh=0)
Tried to fill 6 counters with events, found 6
Trying 6 events
0: perf::PERF_COUNT_HW_CPU_CYCLES:precise=0
1: perf::PERF_COUNT_HW_CPU_CYCLES:excl=0
2: perf::PERF_COUNT_HW_INSTRUCTIONS:precise=0
3: perf::PERF_COUNT_HW_INSTRUCTIONS:excl=0
4: perf::PERF_COUNT_HW_INSTRUCTIONS:mg=0
5: perf::PERF_COUNT_HW_INSTRUCTIONS:mh=0
Using a threshold of 24180000 (20,000 * MHz)
Testing that the events all work with no overflow
overflow_allcounters.c       FAILED
Line # 137
