public inbox for kvm@vger.kernel.org
From: Markus Armbruster <armbru-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
Subject: Performance monitoring units and KVM
Date: Wed, 30 Jan 2008 18:06:20 +0100
Message-ID: <87wsprxmyb.fsf@pike.pond.sub.org>

Before I talk about performance monitoring units (PMUs) and KVM, let
me sketch PMUs and the software we have to put them to use.  You may
wish to skip to the next occurrence of "KVM".


Modern processors sport PMUs in various forms and shapes.  The
simplest form is a couple of performance counters, each of which can
be programmed to count a given number of occurrences of a chosen
event, then interrupt.  Supported events depend on the PMU, and
typically include
cycles spent executing instructions, cache misses, stall cycles and
such.  Precision varies; there's undercounting, and the interrupt can
occur pretty far from the instruction that triggered the event.
Smarter hardware exists that can record samples in a buffer, and only
interrupts when that buffer fills up.
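
To make the simple counter model concrete, here is a minimal sketch in
C of one sampling counter: it is armed with a period, counts events,
and "interrupts" each time the period is used up.  This is a software
model only.  The names sim_counter, sim_counter_init and
sim_counter_add are made up for illustration, and real PMUs differ in
how counters are armed and how precisely the overflow is delivered.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of one sampling performance counter. */
struct sim_counter {
    uint64_t period;     /* events between two overflow interrupts */
    uint64_t remaining;  /* events left until the next overflow */
    uint64_t overflows;  /* overflow "interrupts" raised so far */
};

/* Arm the counter to overflow after every `period` events. */
static void sim_counter_init(struct sim_counter *c, uint64_t period)
{
    assert(period > 0);
    c->period = period;
    c->remaining = period;
    c->overflows = 0;
}

/*
 * Account for n events.  Returns how many overflows these events
 * caused; a profiler would take one sample per overflow.
 */
static uint64_t sim_counter_add(struct sim_counter *c, uint64_t n)
{
    uint64_t fired = 0;

    while (n >= c->remaining) {
        n -= c->remaining;
        c->remaining = c->period;  /* re-arm for the next period */
        fired++;
    }
    c->remaining -= n;
    c->overflows += fired;
    return fired;
}
```

With a period of 1000, feeding 2500 events into the model yields two
overflows and leaves 500 events to go until the third.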

Our existing tool for putting the PMUs to use is OProfile.  OProfile
is a low-overhead, transparent (no instrumentation), system-wide
profiler.

OProfile consists of a kernel part that messes with the hardware and
delivers samples to user space, a daemon that records these samples,
and utilities for control and analysis.

OProfile is a very useful tool, but it has its limitations.  It uses
PMUs only in their simple form.  Users also want to monitor single
threads instead of the whole system, write applications that monitor
selected parts of themselves and more.  Perfmon2 attempts to provide a
generic interface to all the various PMUs that can support all that.
But it's a big, complex hairball out of tree, and merging it will take
time and hard work.


So, what does all this have to do with virtualization in general and
KVM in particular?

As I explained above, use of the PMU beyond what OProfile can do is
quite a hairball.  Adding virtualization to it can only make it
hairier.  I feel that hairball needs to be untangled elsewhere before
we touch it.  That leaves system-wide profiling.

System-wide profiling comes with two competing definitions of system:
virtual and real.  Both are useful.  And both need work.


System-wide profiling of the *virtual* machine is related to profiling
just a process.  That's hard.  I guess building on Perfmon2 would make
sense there, but as long as it's out of tree...  Can we wait for it?
If not, what then?


System-wide profiling of the *real* machine we already have: OProfile.
The fact that we're running guests doesn't disturb it.
However, the presence of guests makes it harder to interpret samples:
we need to map each code address to its program/DSO.  The information
necessary for that lives in the guest kernel.
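
The mapping step can be sketched as a lookup of the sampled address in
a table of address ranges.  The sketch below is in C and assumes a
flat, hypothetical table (struct mapping, resolve_image); it is not
how OProfile actually stores this information.  The point is only that
whoever resolves the sample needs the table, and for guest code only
the guest kernel has it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one mapped image covering [start, end). */
struct mapping {
    uint64_t start;
    uint64_t end;
    const char *image;  /* program or DSO name */
};

/*
 * Return the image containing pc, or NULL if no mapping covers it.
 * A host that lacks the guest's table gets NULL for every guest
 * sample, which is exactly the interpretation problem.
 */
static const char *resolve_image(const struct mapping *maps, size_t n,
                                 uint64_t pc)
{
    assert(maps != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (pc >= maps[i].start && pc < maps[i].end)
            return maps[i].image;
    }
    return NULL;
}
```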

An obvious way to join the sample with the information is delegating
the recording of samples to the guest.  Note, however, that you then
need to set up the recording of samples in each guest in addition to
the host, which is inconvenient.


Such a real-system-wide profiler already exists for Xen: Xenoprof,
which is a patch to OProfile.  Here's how it works.

Xenoprof splits OProfile's kernel part: the hardware part moves into
the hypervisor, while the deliver-to-user-space part remains in the
kernel.  Kernel and hypervisor talk through hypercalls, shared memory
and virtual interrupts.  Instead of the driver for the real PMU, the
kernel uses a special Xenoprof driver that talks to the hypervisor.
The hypervisor accepts commands controlling the PMU only from the
privileged guest (dom0).

Xen guests (domains in Xen parlance) running Xenoprof are called
active: they receive their samples from the hypervisor and record
them.  Domains not running Xenoprof are called passive, and the
hypervisor routes their samples to dom0 for recording.  Dom0 can make
sense of a passive domain's Linux kernel space samples, if given
suitable kernel symbols, but can't make sense of passive user space.
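
The routing rule for active and passive domains fits in a few lines.
The following C sketch is a simplification with invented names
(dom_mode, sample_destination, DOM0_ID); it does not reproduce
Xenoprof's actual data structures, and it assumes every domain is
either active or passive.

```c
#include <assert.h>

enum dom_mode { DOM_PASSIVE, DOM_ACTIVE };

#define DOM0_ID 0  /* the privileged domain records passive samples */

/*
 * Return the id of the domain whose buffer receives a sample taken
 * while sampled_dom was running: the domain itself if it is active,
 * dom0 if it is passive.
 */
static int sample_destination(int sampled_dom, const enum dom_mode *modes,
                              int ndoms)
{
    assert(sampled_dom >= 0 && sampled_dom < ndoms);
    return modes[sampled_dom] == DOM_ACTIVE ? sampled_dom : DOM0_ID;
}
```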

Active Xenoprof is useful because it gets you the most data.  It's
also quite a rain dance to use: starting and stopping profiling takes
several steps spread over all active domains, and if you misstep,
things tend to fail in the most confusing way imaginable.  More robust
error handling and better automation should help there.

Passive Xenoprof is useful because it works even when a domain can't
cooperate (no Xenoprof), or you can't be bothered to start OProfile
there.


The same ideas should work for KVM.  The whole hypervisor headache
just evaporates, of course.  What remains is the host kernel routing
samples to active guests (over virtio, I guess), and guest kernels
receiving samples from there instead of the hardware PMU.  In other
words, the sample channel from the host becomes our virtual PMU for
the guest.  Which needs a driver for it.  It's a weird PMU, because
you can't program its performance counters.  That's left to the host.

How much of Xenoprof's kernel code we could use I don't know.  A
common user space should be quite feasible.


So, what do you think?  Is this worthwhile?  Other ideas?
