From: Luiz Capitulino <lcapitulino@redhat.com>
To: kvm@vger.kernel.org
Cc: yoshihiro.yunomae.ez@hitachi.com, mtosatti@redhat.com,
qemu-devel@nongnu.org, rostedt@goodmis.org,
linux-trace-users@vger.kernel.org, peterx@redhat.com,
stefanha@redhat.com, pbonzini@redhat.com
Subject: [Qemu-devel] [RFC] host and guest kernel trace merging
Date: Thu, 3 Mar 2016 14:35:01 -0500 [thread overview]
Message-ID: <20160303143501.0edf21a2@redhat.com> (raw)
Very recently, trace-cmd got a few new features that allow you
to merge the host and guest kernel traces using the host TSC.
Those features originated in the tracing we're doing to debug spikes
in real-time KVM. However, as real-time KVM uses a very specific
setup and as we have so far debugged a very simple application,
I'm wondering: is this feature useful for the general, non-realtime,
use-cases?
If the answer is yes, then I've got several ideas on how to
make host and guest trace merging extremely simple to use.
I'll first describe how we do tracing for real-time KVM. Then
I'll give some suggestions on how to use the same procedure
for unpinned use-cases. Lastly, I'll talk about how we could
make it easy to use.
Real-time KVM host and guest tracing
====================================
In real-time KVM, each guest's vCPU is pinned to a different host
core. The real-time application running in the guest is also pinned.
When we get a spike, we know in which guest CPU it ocurred, and
so we know in which host core this CPU was running. All we have to
do is to get a trace of that guest CPU/host core pair.
1. Setup
--------
You'll need the following:
1. A stable TSC
2. The TSC offset of the guest you want to debug
(see below)
3. Have your guest transfer a file to your
host someway (I use networking)
4. Latest trace-cmd.git in both host and guest
(HEAD c21aae2c or later)
To get the TSC offset of the guest, you can use the kvm_write_tsc_offset
tracepoint in the host. I use this script to do it:
http://people.redhat.com/~lcapitul/real-time/start-debug-guest
Yes, it sucks. I have an idea on how to improve this (keep reading).
2. Tracing
----------
In summary, what you have to do is:
1. run trace-cmd start -C x86-tsc in the host
2. run trace-cmd record -C x86-tsc in the guest
3. run trace-cmd stop in the host
4. run trace-cmd extract in the host
4. copy the guest's trace.dat file to a known
location in the host
This guest script does all that:
http://people.redhat.com/~lcapitul/real-time/trace-host-and-guest
I run it like this:
# trace-host-and-guest cyclictest -q -n -b10 --notrace
3. Merging
----------
Merging is simple:
$ trace-cmd report -i host-trace.dat --ts-offset $((GUEST-TSC-offset)) \
-i guest-trace.dat
For real-time KVM, we also want to see the difference in nanoseconds
of each line in the trace so we use additional options:
$ trace-cmd report --ts-diff --ts2secs HOST-Hz -t \
-i host-trace.dat --ts-offset $((GUEST-TSC-offset)) \
-i guest-trace.dat
Here's a real example:
$ trace-cmd report --ts-diff --ts2secs 26000000000 -t \
-i host-trace.dat --ts-offset $((18443676333795429734)) \
-i guest-trace.dat
And here's a little extract of a merged trace where the host injects
a timer interrupt, the guest handles it and reprograms the next
hrtimer timer. The value in "()" is how many nanoseconds it took
between the previous and the following line:
host-trace.dat: qemu-kvm-3699 [004] [004] 6196.749398857: (+88) function: kvm_inject_pending_timer_irqs <-- kvm_arch_vcpu_ioctl_run
host-trace.dat: qemu-kvm-3699 [004] [004] 6196.749398990: (+133) kvm_entry: vcpu 0
guest-trace.dat: <idle>-0 [000] [000] 6196.749399096: (+106) function: hrtimer_interrupt <-- local_apic_timer_interrupt
guest-trace.dat: <idle>-0 [000] [000] 6196.749399123: (+27) function: hrtimer_wakeup <-- __run_hrtimer
guest-trace.dat: <idle>-0 [000] [000] 6196.749399183: (+60) function: tick_program_event <-- hrtimer_interrupt
host-trace.dat: qemu-kvm-3699 [004] [004] 6196.749399219: (+36) kvm_exit: reason MSR_WRITE rip 0xffffffff8104bf58 info 0 0
host-trace.dat: qemu-kvm-3699 [004] [004] 6196.749399260: (+41) function: kvm_set_lapic_tscdeadline_msr <-- kvm_set_msr_common
host-trace.dat: qemu-kvm-3699 [004] [004] 6196.749399283: (+23) function: hrtimer_start <-- start_apic_timer
host-trace.dat: qemu-kvm-3699 [004] [004] 6196.749399336: (+53) kvm_entry: vcpu 0
Unpinned use-cases
==================
If you can't pin the guest vCPU threads and the guest application like
we do in real-time KVM, you could try the following:
* If your guest has a single CPU, or you want to trace a
specific guest vCPU then try to pass -P vCPU-TID when
running "trace-cmd record start" in the host
* If you want to trace multiple vCPUs, I think you could
try to trace all cores where the vCPUs could run with -M.
Then you could try to merge this with the guest trace and
see if you get a single timeline of all cores and guests CPUs
trace-cmd-server
================
Everything I described could look like this:
# trace-cmd server [ in the host ]
# trace-cmd record [ in the guest ]
# trace-cmd report [ in the host, to merge the traces ]
To achieve this, we need two things:
1. Add an interface to obtain the guest TSC offset from the
host user-space.
Maybe we could have a new sysfs directory, with a file
per vCPU thread and the offset as contents? Or maybe
just add a new entry to /proc/, like: /proc/TID/tsc-offset?
2. Build a trace-cmd-server
Which does the following:
1. Receive trace-cmd record commands from a guest,
to be performed in the host
2. Read the TSC offset for vCPUs being traced
3. Read the CPU frequency for --ts2secs
4. Receive the guest.dat from the guest and store it together
with the host's
I'd suggest the default communication between guest and
host be via networking. Then we add vsock support when it's
available.
Credits
=======
Real-time KVM tracing was originally developped by Marcelo Tosatti
using scripts. Steven Rostedt added the features we needed in trace-cmd
at my request.
All the ideas and scripts showed here are mine. I also take the
the responsibility for any mistakes or nonsensical stuff.
next reply other threads:[~2016-03-03 19:35 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-03-03 19:35 Luiz Capitulino [this message]
2016-03-04 11:19 ` [Qemu-devel] [RFC] host and guest kernel trace merging Stefan Hajnoczi
2016-03-04 13:23 ` Steven Rostedt
2016-03-07 15:17 ` Stefan Hajnoczi
2016-03-07 15:49 ` Steven Rostedt
2016-03-07 16:10 ` Eric Blake
2016-03-07 16:26 ` Steven Rostedt
2016-03-07 17:13 ` Paolo Bonzini
2016-03-24 5:16 ` Peter Xu
2016-03-24 13:02 ` Luiz Capitulino
2016-03-25 1:53 ` Peter Xu
2016-03-24 8:42 ` Peter Xu
2016-03-24 10:13 ` Stefan Hajnoczi
2016-03-25 2:22 ` Peter Xu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160303143501.0edf21a2@redhat.com \
--to=lcapitulino@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=linux-trace-users@vger.kernel.org \
--cc=mtosatti@redhat.com \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=rostedt@goodmis.org \
--cc=stefanha@redhat.com \
--cc=yoshihiro.yunomae.ez@hitachi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).