[Qemu-devel] [RFC] host and guest kernel trace merging

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Luiz Capitulino <lcapitulino@redhat.com>
To: kvm@vger.kernel.org
Cc: yoshihiro.yunomae.ez@hitachi.com, mtosatti@redhat.com,
	qemu-devel@nongnu.org, rostedt@goodmis.org,
	linux-trace-users@vger.kernel.org, peterx@redhat.com,
	stefanha@redhat.com, pbonzini@redhat.com
Subject: [Qemu-devel] [RFC] host and guest kernel trace merging
Date: Thu, 3 Mar 2016 14:35:01 -0500	[thread overview]
Message-ID: <20160303143501.0edf21a2@redhat.com> (raw)

Very recently, trace-cmd got a few new features that allow you
to merge the host and guest kernel traces using the host TSC.

Those features originated in the tracing we're doing to debug spikes
in real-time KVM. However, as real-time KVM uses a very specific
setup and as we have so far debugged a very simple application,
I'm wondering: is this feature useful for the general, non-realtime,
use-cases?

If the answer is yes, then I've got several ideas on how to
make host and guest trace merging extremely simple to use.

I'll first describe how we do tracing for real-time KVM. Then
I'll give some suggestions on how to use the same procedure
for unpinned use-cases. Lastly, I'll talk about how we could
make it easy to use.

Real-time KVM host and guest tracing
====================================

In real-time KVM, each guest's vCPU is pinned to a different host
core. The real-time application running in the guest is also pinned.
When we get a spike, we know in which guest CPU it ocurred, and
so we know in which host core this CPU was running. All we have to
do is to get a trace of that guest CPU/host core pair.

1. Setup
--------

You'll need the following:

 1. A stable TSC
 2. The TSC offset of the guest you want to debug
    (see below)
 3. Have your guest transfer a file to your
    host someway (I use networking)
 4. Latest trace-cmd.git in both host and guest
    (HEAD c21aae2c or later)

To get the TSC offset of the guest, you can use the kvm_write_tsc_offset
tracepoint in the host. I use this script to do it:

 http://people.redhat.com/~lcapitul/real-time/start-debug-guest

Yes, it sucks. I have an idea on how to improve this (keep reading).

2. Tracing
----------

In summary, what you have to do is:

 1. run trace-cmd start -C x86-tsc in the host
 2. run trace-cmd record -C x86-tsc in the guest
 3. run trace-cmd stop in the host
 4. run trace-cmd extract in the host
 4. copy the guest's trace.dat file to a known
    location in the host

This guest script does all that:

 http://people.redhat.com/~lcapitul/real-time/trace-host-and-guest

I run it like this:

 # trace-host-and-guest cyclictest -q -n -b10 --notrace

3. Merging
----------

Merging is simple:

 $ trace-cmd report -i host-trace.dat --ts-offset $((GUEST-TSC-offset)) \
                    -i guest-trace.dat

For real-time KVM, we also want to see the difference in nanoseconds
of each line in the trace so we use additional options:

 $ trace-cmd report --ts-diff --ts2secs HOST-Hz -t \
                    -i host-trace.dat --ts-offset $((GUEST-TSC-offset)) \
					-i guest-trace.dat

Here's a real example:

 $ trace-cmd report --ts-diff --ts2secs 26000000000 -t \
                    -i host-trace.dat --ts-offset $((18443676333795429734)) \
					-i guest-trace.dat

And here's a little extract of a merged trace where the host injects
a timer interrupt, the guest handles it and reprograms the next
hrtimer timer. The value in "()" is how many nanoseconds it took
between the previous and the following line:

 host-trace.dat:         qemu-kvm-3699  [004] [004]  6196.749398857: (+88)    function:             kvm_inject_pending_timer_irqs <-- kvm_arch_vcpu_ioctl_run
 host-trace.dat:         qemu-kvm-3699  [004] [004]  6196.749398990: (+133)   kvm_entry:            vcpu 0
guest-trace.dat:           <idle>-0     [000] [000]  6196.749399096: (+106)   function:             hrtimer_interrupt <-- local_apic_timer_interrupt
guest-trace.dat:           <idle>-0     [000] [000]  6196.749399123: (+27)    function:             hrtimer_wakeup <-- __run_hrtimer
guest-trace.dat:           <idle>-0     [000] [000]  6196.749399183: (+60)    function:             tick_program_event <-- hrtimer_interrupt
 host-trace.dat:         qemu-kvm-3699  [004] [004]  6196.749399219: (+36)    kvm_exit:             reason MSR_WRITE rip 0xffffffff8104bf58 info 0 0
 host-trace.dat:         qemu-kvm-3699  [004] [004]  6196.749399260: (+41)    function:             kvm_set_lapic_tscdeadline_msr <-- kvm_set_msr_common
 host-trace.dat:         qemu-kvm-3699  [004] [004]  6196.749399283: (+23)    function:             hrtimer_start <-- start_apic_timer
 host-trace.dat:         qemu-kvm-3699  [004] [004]  6196.749399336: (+53)    kvm_entry:            vcpu 0

Unpinned use-cases
==================

If you can't pin the guest vCPU threads and the guest application like
we do in real-time KVM, you could try the following:

 * If your guest has a single CPU, or you want to trace a
   specific guest vCPU then try to pass -P vCPU-TID when
   running "trace-cmd record start" in the host

 * If you want to trace multiple vCPUs, I think you could
   try to trace all cores where the vCPUs could run with -M.
   Then you could try to merge this with the guest trace and
   see if you get a single timeline of all cores and guests CPUs

trace-cmd-server
================

Everything I described could look like this:

  # trace-cmd server [ in the host ]
  # trace-cmd record [ in the guest ]
  # trace-cmd report [ in the host, to merge the traces ]

To achieve this, we need two things:

 1. Add an interface to obtain the guest TSC offset from the
    host user-space.

    Maybe we could have a new sysfs directory, with a file
    per vCPU thread and the offset as contents? Or maybe
    just add a new entry to /proc/, like: /proc/TID/tsc-offset?

 2. Build a trace-cmd-server

    Which does the following:

      1. Receive trace-cmd record commands from a guest,
         to be performed in the host

      2. Read the TSC offset for vCPUs being traced

      3. Read the CPU frequency for --ts2secs

      4. Receive the guest.dat from the guest and store it together
         with the host's

    I'd suggest the default communication between guest and
    host be via networking. Then we add vsock support when it's
    available.

Credits
=======

Real-time KVM tracing was originally developped by Marcelo Tosatti
using scripts. Steven Rostedt added the features we needed in trace-cmd
at my request.

All the ideas and scripts showed here are mine. I also take the
the responsibility for any mistakes or nonsensical stuff.

next             reply	other threads:[~2016-03-03 19:35 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-03 19:35 Luiz Capitulino [this message]
2016-03-04 11:19 ` [Qemu-devel] [RFC] host and guest kernel trace merging Stefan Hajnoczi
2016-03-04 13:23   ` Steven Rostedt
2016-03-07 15:17     ` Stefan Hajnoczi
2016-03-07 15:49       ` Steven Rostedt
2016-03-07 16:10         ` Eric Blake
2016-03-07 16:26           ` Steven Rostedt
2016-03-07 17:13             ` Paolo Bonzini
2016-03-24  5:16     ` Peter Xu
2016-03-24 13:02       ` Luiz Capitulino
2016-03-25  1:53         ` Peter Xu
2016-03-24  8:42   ` Peter Xu
2016-03-24 10:13     ` Stefan Hajnoczi
2016-03-25  2:22       ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160303143501.0edf21a2@redhat.com \
    --to=lcapitulino@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-trace-users@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rostedt@goodmis.org \
    --cc=stefanha@redhat.com \
    --cc=yoshihiro.yunomae.ez@hitachi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).