* Experiments using perf support in arm kvm guest
From: William Cohen @ 2013-09-23 15:53 UTC
  To: PAPI list, linux-perf-users

Hi All,

I was curious to see how well (or poorly) perf events work in a virtualized environment.  As a little experiment I tried building papi from the git repo in a Fedora rawhide guest vm running on an Intel Ivy Bridge machine.  I also ran things on the f19 host to compare the results of "make fulltest" between the raw and virtualized hardware.  Despite my trying to copy the host processor information when setting up the guest, the guest vm thinks it is a Sandy Bridge rather than an Ivy Bridge, but it looks like the same events are used in papi_events.csv for both.  The papi "make fulltest" results look similar on the x86 host and guest.

There has been some work on the arm cortex a15 to support hardware virtualization (http://osdir.com/ml/fedora-arm/2013-09/msg00011.html).  I have kvm hardware accelerated virtualization running on my Samsung ARM chromebook.  Both host and guest are running Fedora 19.  The host is running a 3.11 kernel with a patch so that the Samsung exynos 5250 boots up.  The guest is running a stock Fedora 19 3.10.1-200 kernel.  On arm the guest papi "make fulltest" results are not so good.  It appears that access to the perf counters does not work properly in the arm guest.  On the arm guest it looks like only the cycle count event reports a value:

 Performance counter stats for 'ls':

          4.043500 task-clock                #    0.799 CPUs utilized          
                 0 context-switches          #    0.000 K/sec                  
                 0 cpu-migrations            #    0.000 K/sec                  
               237 page-faults               #    0.059 M/sec                  
     2,147,483,647 cycles                    #  531.095 GHz                    
   <not supported> stalled-cycles-frontend 
   <not supported> stalled-cycles-backend  
     <not counted> instructions            
     <not counted> branches                
     <not counted> branch-misses           

       0.005059000 seconds time elapsed


On the arm host I see:

 Performance counter stats for 'ls':

         19.259873 task-clock                #    0.777 CPUs utilized          
                 2 context-switches          #    0.104 K/sec                  
                 0 cpu-migrations            #    0.000 K/sec                  
               242 page-faults               #    0.013 M/sec                  
         6,242,062 cycles                    #    0.324 GHz                    
   <not supported> stalled-cycles-frontend 
   <not supported> stalled-cycles-backend  
         3,479,441 instructions              #    0.56  insns per cycle        
           644,120 branches                  #   33.444 M/sec                  
            37,372 branch-misses             #    5.80% of all branches        

       0.024776800 seconds time elapsed
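
A quick arithmetic check on the two outputs above (a sketch, not part of the original measurements). It recomputes the GHz figures from the cycle counts and task-clock times that perf printed, and tests whether the guest cycle count equals 2^31 - 1. The helper name `ghz` is made up for illustration. Reading a match as a saturated or placeholder value rather than a real count is an assumption, not something perf reports.

```python
# Values copied from the guest and host perf stat outputs above.
INT32_MAX = 2**31 - 1  # 2,147,483,647


def ghz(cycles, task_clock_ms):
    """Cycles per nanosecond of task-clock time, which is the rate in GHz."""
    return cycles / (task_clock_ms * 1e6)


guest_cycles, guest_ms = 2_147_483_647, 4.043500
host_cycles, host_ms = 6_242_062, 19.259873
host_instructions = 3_479_441

guest_ghz = ghz(guest_cycles, guest_ms)
host_ghz = ghz(host_cycles, host_ms)
host_ipc = host_instructions / host_cycles
guest_saturated = guest_cycles == INT32_MAX

print(f"guest: {guest_ghz:.3f} GHz, equals 2^31-1: {guest_saturated}")
print(f"host:  {host_ghz:.3f} GHz, {host_ipc:.2f} insns per cycle")
```

The recomputed rates agree with the figures perf printed, so the implausible guest clock rate follows directly from the cycle count itself and not from a formatting problem in perf.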

Are there reasons that the arm hardware cannot virtualize the performance counters like the x86 machines? Or is this something that just hasn't been implemented yet in the kernel? Or is this supposed to work and there is a bug?


-Will
