From: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
To: Gleb Natapov <gleb@redhat.com>, Marcelo Tosatti <mtosatti@redhat.com>
Cc: linux-kernel@vger.kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
David Sharp <dhsharp@google.com>,
Steven Rostedt <rostedt@goodmis.org>,
Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>,
Ingo Molnar <mingo@redhat.com>,
yrl.pp-manager.tt@hitachi.com,
Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: Re: [PATCH V2 1/1] kvm/vmx: Add a tracepoint write_tsc_offset
Date: Fri, 07 Jun 2013 14:22:22 +0900
Message-ID: <51B16E0E.5020208@hitachi.com>
In-Reply-To: <20130606113305.GB4725@redhat.com>
(2013/06/06 20:33), Gleb Natapov wrote:
> On Wed, Jun 05, 2013 at 09:23:22PM -0300, Marcelo Tosatti wrote:
>> On Tue, Jun 04, 2013 at 05:36:19PM +0900, Yoshihiro YUNOMAE wrote:
>>> Add a tracepoint write_tsc_offset for tracing TSC offset change.
>>> We want to merge ftrace's trace data of guest OSs and the host OS using
>>> TSC for timestamp in chronological order. We need "TSC offset" values for
>>> each guest when merge those because the TSC value on a guest is always the
>>> host TSC plus guest's TSC offset. If we get the TSC offset values, we can
>>> calculate the host TSC value for each guest events from the TSC offset and
>>> the event TSC value. The host TSC values of the guest events are used when we
>>> want to merge trace data of guests and the host in chronological order.
>>> (Note: the trace_clock of both the host and the guest must be set x86-tsc in
>>> this case)
>>>
>>> TSC offset is stored in the VMCS by vmx_write_tsc_offset() or
>>> vmx_adjust_tsc_offset(). KVM executes the former function when a guest boots.
>>> The latter function is executed when kvm clock is updated. Only host can read
>>> TSC offset value from VMCS, so a host needs to output TSC offset value
>>> when TSC offset is changed.
>>>
>>> Since the TSC offset is not often changed, it could be overwritten by other
>>> frequent events while tracing. To avoid that, I recommend to use a special
>>> instance for getting this event:
>>>
>>> 1. set a instance before booting a guest
>>> # cd /sys/kernel/debug/tracing/instances
>>> # mkdir tsc_offset
>>> # cd tsc_offset
>>> # echo x86-tsc > trace_clock
>>> # echo 1 > events/kvm/kvm_write_tsc_offset/enable
>>>
>>> 2. boot a guest
>>>
>>> Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
>>> Cc: Marcelo Tosatti <mtosatti@redhat.com>
>>> Cc: Gleb Natapov <gleb@redhat.com>
>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>> Cc: Ingo Molnar <mingo@redhat.com>
>>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>>> ---
>>> arch/x86/kvm/trace.h | 18 ++++++++++++++++++
>>> arch/x86/kvm/vmx.c | 3 +++
>>> arch/x86/kvm/x86.c | 1 +
>>> 3 files changed, 22 insertions(+)
>>>
>>> diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
>>> index fe5e00e..9c22e39 100644
>>> --- a/arch/x86/kvm/trace.h
>>> +++ b/arch/x86/kvm/trace.h
>>> @@ -815,6 +815,24 @@ TRACE_EVENT(kvm_track_tsc,
>>> __print_symbolic(__entry->host_clock, host_clocks))
>>> );
>>>
>>> +TRACE_EVENT(kvm_write_tsc_offset,
>>> + TP_PROTO(__u64 previous_tsc_offset, __u64 next_tsc_offset),
>>> + TP_ARGS(previous_tsc_offset, next_tsc_offset),
>>> +
>>> + TP_STRUCT__entry(
>>> + __field( __u64, previous_tsc_offset )
>>> + __field( __u64, next_tsc_offset )
>>> + ),
>>> +
>>> + TP_fast_assign(
>>> + __entry->previous_tsc_offset = previous_tsc_offset;
>>> + __entry->next_tsc_offset = next_tsc_offset;
>>> + ),
>>> +
>>> + TP_printk("previous=%llu next=%llu",
>>> + __entry->previous_tsc_offset, __entry->next_tsc_offset)
>>> +);
>>> +
>>
>> Yoshihiro YUNOMAE,
>>
>> 1) Why is previous_tsc_offset necessary?
I was considering the cases where the kvm_write_tsc_offset event was not
enabled before booting a guest, or where multiple buffers are not used.
For those cases we will need a new I/F to get the current TSC offset of
a given VCPU. For example, if kvm_write_tsc_offset is not included in
the host's trace data, we get the current TSC offset from the new I/F
and apply it to all guest events. On the other hand, if the
kvm_write_tsc_offset event appears more than once, we apply the
previous offset to the guest events recorded before the first TSC offset
change. Since we currently support only the multiple-buffer setup, we do
not need to record the previous TSC offset at this time. However, I am
aware that we will have to change the format of the kvm_write_tsc_offset
event when we support those cases.
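To make the intended use of previous_tsc_offset concrete, here is a rough
user-space sketch of how a merge tool could choose the offset for one
guest event. It is only an illustration, not the code of the example
tool: the struct layout, the function name guest_to_host_tsc() and the
current_offset argument (standing in for the new I/F, which does not
exist yet) are all hypothetical. It relies only on the relation
guest TSC = host TSC + TSC offset, computed modulo 2^64, and it assumes
that at most one interval matches a given guest timestamp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One kvm_write_tsc_offset record from the host trace (hypothetical layout). */
struct tsc_offset_change {
	uint64_t host_tsc;	/* host timestamp of the event (x86-tsc clock) */
	uint64_t previous;	/* previous_tsc_offset field */
	uint64_t next;		/* next_tsc_offset field */
};

/*
 * Convert a guest event timestamp to a host TSC value.
 * chg[] must be sorted by host_tsc. current_offset is used only when no
 * change was traced (n == 0); it stands in for the not-yet-existing I/F.
 * Returns 0 on success, -1 if the timestamp fits no interval.
 */
int guest_to_host_tsc(const struct tsc_offset_change *chg, size_t n,
		      uint64_t current_offset, uint64_t guest_tsc,
		      uint64_t *host_tsc)
{
	uint64_t cand;
	size_t i;

	assert(host_tsc != NULL);

	/* No offset change in the trace: apply the current offset to all events. */
	if (n == 0) {
		*host_tsc = guest_tsc - current_offset;
		return 0;
	}

	/* Events before the first change use the previous offset. */
	cand = guest_tsc - chg[0].previous;
	if (cand < chg[0].host_tsc) {
		*host_tsc = cand;
		return 0;
	}

	/* Otherwise find the change whose new offset was in effect. */
	for (i = 0; i < n; i++) {
		int after_start, before_end;

		cand = guest_tsc - chg[i].next;
		after_start = cand >= chg[i].host_tsc;
		before_end = (i + 1 == n) || cand < chg[i + 1].host_tsc;
		if (after_start && before_end) {
			*host_tsc = cand;
			return 0;
		}
	}
	return -1;
}
```

The unsigned subtraction is intentional: the offset is usually a negative
value stored in a u64, and the wrap-around gives the right host TSC. If
the guest TSC ever jumps backwards, more than one interval can match, and
a real tool would need an extra rule to choose between them.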
>> 2) The TSC offset traces should include vcpu number, so that its
>> possible to correlate traces of SMP guests (the tool should use
>> the individual vcpu tsc offsets when converting guests trace).
>>
> Why PID is not enough? No other trace, except kvm_entry, outputs vcpu id.
As Gleb mentioned, a tool can work out the TSC offset for each vcpu from
the PID and the vcpu number in kvm_entry. IMO, that is an indirect way,
so I think it would be better to include the vcpu number in the event.
>> 3) Please add traces for svm.c.
Sure, I'll add the tracepoint for SVM.
Thanks,
Yoshihiro YUNOMAE
--
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae.ez@hitachi.com
Thread overview: 17+ messages
2013-06-04 8:36 [PATCH V2 0/1] kvm/vmx: Output TSC offset Yoshihiro YUNOMAE
2013-06-04 8:36 ` [PATCH V2 1/1] kvm/vmx: Add a tracepoint write_tsc_offset Yoshihiro YUNOMAE
2013-06-06 0:23 ` Marcelo Tosatti
2013-06-06 11:33 ` Gleb Natapov
2013-06-06 21:43 ` Marcelo Tosatti
2013-06-07 5:22 ` Yoshihiro YUNOMAE [this message]
2013-06-07 21:55 ` Marcelo Tosatti
2013-06-09 11:14 ` Gleb Natapov
2013-06-10 9:30 ` Yoshihiro YUNOMAE
2013-06-10 10:05 ` Gleb Natapov
2013-06-10 11:37 ` Yoshihiro YUNOMAE
2013-06-10 14:04 ` Marcelo Tosatti
2013-06-10 16:38 ` Gleb Natapov
2013-06-10 20:28 ` Marcelo Tosatti
2013-06-11 6:50 ` Gleb Natapov
2013-06-11 9:26 ` Yoshihiro YUNOMAE
2013-06-04 8:38 ` [EXAMPLE] tools: a tool for merging trace data of a guest and a host Yoshihiro YUNOMAE