Re: [PATCH] Always save/restore performance counters when HVM guest switching VCPU

From: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
To: xen-devel@lists.xen.org
Cc: George Dunlap <george.dunlap@eu.citrix.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	"suravee.suthikulpanit@amd.com" <suravee.suthikulpanit@amd.com>,
	"JBeulich@suse.com" <JBeulich@suse.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: Re: [PATCH] Always save/restore performance counters when HVM guest switching VCPU
Date: Tue, 12 Mar 2013 09:18:57 +0100
Message-ID: <1754744.DiyRSK42SD@amur>
In-Reply-To: <20130311145349.GA26394@phenom.dumpdata.com>

On Monday, 11 March 2013, 10:53:49, Konrad Rzeszutek Wilk wrote:
> On Mon, Mar 11, 2013 at 11:11:02AM +0000, George Dunlap wrote:
> > On 08/03/13 15:11, Boris Ostrovsky wrote:
> > >----- george.dunlap@eu.citrix.com wrote:
> > >
> > >>On 08/03/13 14:50, Boris Ostrovsky wrote:
> > >>>----- JBeulich@suse.com wrote:
> > >>>
> > >>>>>>>On 04.03.13 at 13:42, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
> > >>>>>On Fri, Mar 1, 2013 at 8:49 PM,  <suravee.suthikulpanit@amd.com> wrote:
> > >>>>>>From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> > >>>>>>
> > >>>>>>Currently, the performance counter registers are saved/restored
> > >>>>>>when the HVM guest switches VCPUs only if they are running.
> > >>>>>>However, PERF has one check where it writes the MSR and reads back
> > >>>>>>the value to check if the MSR is working.  This has been shown to
> > >>>>>>fail the check if the VCPU is moved in between rdmsr and wrmsr,
> > >>>>>>resulting in different values.
> > >>>>>Many moons ago (circa 2005) when I used performance counters, I found
> > >>>>>that adding them to the save/restore path added a non-negligible
> > >>>>>overhead -- something like 5% slow-down.  Do you have any reason to
> > >>>>>believe this is no longer the case?  Have you done any benchmarks
> > >>>>>before and after?
> > >>>I was doing some VPMU tracing a couple of weeks ago and by looking at
> > >>>trace timestamps I think I saw about 4000 cycles on VPMU save and
> > >>>~9000 cycles on restore. Don't remember what it was percentage-wise of
> > >>>a whole context switch.
> > >>>
> > >>>This was on Intel.
> > >>That's a really hefty expense to make all users pay on every context
> > >>switch, on behalf of a random check in a piece of software that only a
> > >>handful of people are going to be actually using.
> > >I believe Linux uses perf infrastructure to implement the watchdog.
> 
> And by default it won't work as for Intel you need these flags:
> 
> cpuid=['0xa:eax=0x07300403,ebx=0x00000004,ecx=0x00000000,edx=0x00000603' ]

This cpuid config variable should not be needed if your CPU is supported in
vmx_vpmu_initialise(), where you added a lot of processors with your patch.
If it is not supported, you should see a message in the Xen logs.
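
For reference, the values in the cpuid override quoted above can be decoded
by hand. The sketch below assumes the architectural performance monitoring
leaf layout (CPUID leaf 0xA) as described in the Intel SDM; the function name
decode_leaf_0xa is made up for illustration and is not part of Xen or Linux.

```python
def decode_leaf_0xa(eax, edx):
    """Split CPUID leaf 0xA EAX/EDX into the architectural PMU fields."""
    return {
        "version": eax & 0xFF,                # EAX[7:0]   PMU version ID
        "gp_counters": (eax >> 8) & 0xFF,     # EAX[15:8]  general-purpose counters
        "gp_width": (eax >> 16) & 0xFF,       # EAX[23:16] counter bit width
        "ebx_vector_len": (eax >> 24) & 0xFF, # EAX[31:24] length of EBX event vector
        "fixed_counters": edx & 0x1F,         # EDX[4:0]   fixed-function counters
        "fixed_width": (edx >> 5) & 0xFF,     # EDX[12:5]  fixed counter bit width
    }


# Values taken from the cpuid= line quoted above.
fields = decode_leaf_0xa(0x07300403, 0x00000603)
for name, value in fields.items():
    print(f"{name:15s} {value}")
```

Under that layout the override advertises a version 3 PMU with four 48-bit
general-purpose counters and three 48-bit fixed counters.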

> 
> What we get right now when booting PVHVM under Intel is:
> 
> [    0.160989] Performance Events: unsupported p6 CPU model 45 no PMU driver, software events only.
> [    0.168098] NMI watchdog disabled (cpu0): hardware events not enabled

Did you add vpmu to the xen boot parameter list?

I installed openSUSE 12.2 as an HVM guest on xen-unstable, and the kernel
log says:

Mar  7 15:06:18 linux kernel: [    0.183217] CPU0: Intel(R) Core(TM)2 Duo CPU     P8800  @ 2.66GHz stepping 0a
Mar  7 15:06:18 linux kernel: [    0.183980] Performance Events: 4-deep LBR, Core2 events, Intel PMU driver.
Mar  7 15:06:18 linux kernel: [    0.189994] ... version:                2
Mar  7 15:06:18 linux kernel: [    0.189997] ... bit width:              40
Mar  7 15:06:18 linux kernel: [    0.190000] ... generic registers:      2
Mar  7 15:06:18 linux kernel: [    0.190002] ... value mask:             000000ffffffffff
Mar  7 15:06:18 linux kernel: [    0.190005] ... max period:             000000007fffffff
Mar  7 15:06:18 linux kernel: [    0.190008] ... fixed-purpose events:   3
Mar  7 15:06:18 linux kernel: [    0.190011] ... event mask:             0000000700000003
Mar  7 15:06:18 linux kernel: [    0.198203] NMI watchdog: enabled, takes one hw-pmu counter.
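
The mask lines in this log follow directly from the counter counts and bit
width printed next to them. A minimal sketch of that arithmetic, assuming the
fixed-purpose counters occupy bits 32 and up of the event mask (as in the
Linux Intel PMU driver); the helper names are hypothetical:

```python
def counter_value_mask(bit_width):
    """All-ones mask covering a counter of the given bit width."""
    return (1 << bit_width) - 1


def global_event_mask(num_generic, num_fixed):
    """Generic counters in the low bits, fixed counters starting at bit 32."""
    generic_bits = (1 << num_generic) - 1
    fixed_bits = ((1 << num_fixed) - 1) << 32  # assumed fixed-counter base index
    return generic_bits | fixed_bits


value_mask = counter_value_mask(40)    # "bit width: 40"
event_mask = global_event_mask(2, 3)   # 2 generic registers, 3 fixed-purpose events
max_period = (1 << 31) - 1             # matches the logged "max period"

print(f"value mask: {value_mask:016x}")
print(f"event mask: {event_mask:016x}")
print(f"max period: {max_period:016x}")
```

The three printed values match the "value mask", "event mask" and
"max period" lines of the guest log.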

When I call perf:

# perf stat ls
acpid             cups      kdm.log        mail.err        news              wtmp            zypper.log
alternatives.log  faillog   krb5           mail.info       ntp               Xorg.0.log
boot.log          firewall  lastlog        mail.warn       pm-powersave.log  Xorg.0.log.old
btmp              hp        localmessages  messages        samba             YaST2
ConsoleKit        journal   mail           NetworkManager  warn              zypp

 Performance counter stats for 'ls':

          7.840869 task-clock                #    0.590 CPUs utilized          
                59 context-switches          #    0.008 M/sec                  
                 0 CPU-migrations            #    0.000 K/sec                  
               304 page-faults               #    0.039 M/sec                  
         6,583,834 cycles                    #    0.840 GHz                     [40.38%]
   <not supported> stalled-cycles-frontend 
   <not supported> stalled-cycles-backend  
         2,168,931 instructions              #    0.33  insns per cycle         [73.20%]
           525,628 branches                  #   67.037 M/sec                   [79.06%]
            27,138 branch-misses             #    5.16% of all branches         [83.55%]

       0.013283672 seconds time elapsed

As you can see, the hardware performance counters are working for cycles,
instructions, branches and branch-misses.
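
The derived figures in the right-hand column can be reproduced from the raw
counts, which is a quick sanity check that the counters return plausible
values. This is only a sketch of the arithmetic using the numbers from the
guest run above; the variable names are mine, not perf's.

```python
# Raw values copied from the "perf stat ls" output in the HVM guest.
task_clock_ms = 7.840869
elapsed_s = 0.013283672
cycles = 6_583_834
instructions = 2_168_931
branches = 525_628
branch_misses = 27_138

task_clock_s = task_clock_ms / 1000.0

insns_per_cycle = instructions / cycles
clock_ghz = cycles / task_clock_s / 1e9
branch_miss_pct = 100.0 * branch_misses / branches
cpus_utilized = task_clock_s / elapsed_s

print(f"insns per cycle: {insns_per_cycle:.2f}")
print(f"GHz:             {clock_ghz:.3f}")
print(f"branch misses:   {branch_miss_pct:.2f}%")
print(f"CPUs utilized:   {cpus_utilized:.3f}")
```

The bracketed percentages in the perf output are not part of this
calculation; they show how much of the run each counter was actually
scheduled for.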

When I run the same command in dom0, the result is different:

# perf stat ls
acpid             journal        messages           wpa_supplicant.log
alternatives.log  kdm.log        NetworkManager     wtmp
boot.log          krb5           news               xen
btmp              lastlog        ntp                Xorg.0.log
ConsoleKit        localmessages  pk_backend_zypp    Xorg.0.log.old
cups              mail           pk_backend_zypp-1  YaST2
faillog           mail.err       pm-powersave.log   zypp
firewall          mail.info      samba              zypper.log
hp                mail.warn      warn               zypper.log-20130307.xz

 Performance counter stats for 'ls':

          6.959326 task-clock                #    0.714 CPUs utilized          
                11 context-switches          #    0.002 M/sec                  
                 0 CPU-migrations            #    0.000 K/sec                  
               304 page-faults               #    0.044 M/sec                  
   <not supported> cycles                  
   <not supported> stalled-cycles-frontend 
   <not supported> stalled-cycles-backend  
   <not supported> instructions            
   <not supported> branches                
   <not supported> branch-misses           

       0.009746152 seconds time elapsed

This is because the hardware events are not supported in PV.

Dietmar.


> Unless the above CPUID flag is provided.
> > 
> > Hmm -- well if it is the case that adding performance counters to
> > the vcpu context switch path will add a measurable overhead, then we
> > probably don't want them enabled for typical guests anyway.  If
> > people are actually using the performance counters to measure
> > performance, that makes sense; but for watchdogs it seems like Xen
> > should be able to provide something that is useful for a watchdog
> > without the extra overhead of saving and restoring performance
> > counters.
> > 
> > Konrad, any thoughts?
> 
> The other thing is that there is a Xen watchdog, the one that Jan Beulich
> wrote, which should also work under PVHVM:
> 
> drivers/watchdog/xen_wdt.c
> 
> 
> > 
> >  -George

-- 
Company details: http://ts.fujitsu.com/imprint.html
