From: Anthony Liguori <anthony@codemonkey.ws>
To: Zachary Amsden <zach@vmware.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>,
Alok Kataria <akataria@vmware.com>,
"avi@redhat.com" <avi@redhat.com>,
Rusty Russell <rusty@rustcorp.com.au>,
Gerd Hoffmann <kraxel@redhat.com>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@elte.hu>,
the arch/x86 maintainers <x86@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
"Nakajima, Jun" <jun.nakajima@intel.com>,
Daniel Hecht <dhecht@vmware.com>,
"virtualization@lists.linux-foundation.org"
<virtualization@lists.linux-foundation.org>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Subject: Re: [RFC] CPUID usage for interaction between Hypervisors and Linux.
Date: Wed, 01 Oct 2008 19:41:18 -0500
Message-ID: <48E418AE.1090306@codemonkey.ws>
In-Reply-To: <1222904824.7330.83.camel@bodhitayantram.eng.vmware.com>
Zachary Amsden wrote:
> On Wed, 2008-10-01 at 14:34 -0700, Anthony Liguori wrote:
>
>> Jeremy Fitzhardinge wrote:
>>
>>> Alok Kataria wrote:
>>>
>>> I guess, but the bulk of the uses of this stuff are going to be
>>> hypervisor-specific. You're hard-pressed to come up with any other
>>> generic uses beyond tsc.
>>>
>> And arguably, storing TSC frequency in CPUID is a terrible interface
>> because the TSC frequency can change any time a guest is entered. It
>> really should be a shared memory area so that a guest doesn't have to
>> vmexit to read it (like it is with the Xen/KVM paravirt clock).
>>
>
> It's not terrible, it's actually brilliant.

But of course! Okay, not really :-)

> TSC is part of the
> processor architecture, the processor should have a way to tell us what speed
> it is.
>
It does. 1 tick == 1 tick. The processor doesn't have a concept of
wall clock time so wall clock units don't make much sense. If it did,
I'd say, screw the TSC, just give me a ns granular time stamp and let's
all forget that the TSC even exists.

> And now we're trying to fiddle around with software wizardry what should
> be done in hardware in the first place. Once again, para-virtualization
> is basically useless. We can't agree on a solution without
> over-designing some complex system with interface signatures and
> multi-vendor cooperation and nonsense. Solve the non-virtualized
> problem and the virtualized problem goes away.
>
> Jun, you work at Intel. Can you ask for a new architecturally defined
> MSR that returns the TSC frequency? Not a virtualization specific MSR.
> A real MSR that would exist on physical processors. The TSC started as
> an MSR anyway. There should be another MSR that tells the frequency.
> If it's hard to do in hardware, it can be a write-once MSR that gets
> initialized by the BIOS.

rdtscp sort of gives you this. But still, just give me my rdnsc and
I'll be happy.

> I realize it's the wrong thing for us now, but long term, it's the only
> architecturally 'correct' approach. You can even extend it to have
> visible TSC frequency changes clocked via performance counter events
> (and then get interrupts on those events if you so wish), solving the
> dynamic problem too.
>
So a solution is needed that works for now. Anything that requires a
vmexit is bad because the TSC frequency can change quite often. Even if
you ignore the troubles with frequency scaling on older processors and
VCPU migration across NUMA nodes, there will be a very visible change in
TSC frequency after a live migration.

So there are two possible solutions. Have a shared memory area that the
guest can consult that has the latest TSC frequency (this is what KVM
and Xen do), or have some sort of interrupt mechanism that notifies the
guest when the TSC frequency changes, after which software can do
something that vmexits to get the TSC frequency.
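A minimal sketch of the shared-memory option, assuming a hypothetical
record layout loosely modelled on the paravirt clock idea. The names
shared_time_info, freq_to_scale, host_update and guest_read_ns are
assumptions made for this example, not the actual KVM or Xen ABI. The
host bumps a version counter around each update, and the guest retries
its read if the version was odd or changed, so it can pick up a new TSC
frequency without a vmexit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical shared record; NOT the real KVM/Xen layout. */
struct shared_time_info {
    _Atomic uint32_t version;   /* odd while the host is mid-update */
    uint64_t tsc_timestamp;     /* TSC value when system_time_ns was taken */
    uint64_t system_time_ns;    /* nanoseconds at tsc_timestamp */
    uint32_t tsc_to_ns_mul;     /* 32.32 fixed-point ns-per-tick multiplier */
    int8_t   tsc_shift;         /* pre-shift applied to the TSC delta */
};

/* Turn a TSC frequency in Hz into a (multiplier, shift) pair. */
void freq_to_scale(uint64_t tsc_hz, uint32_t *mul, int8_t *shift)
{
    int8_t s = 0;

    assert(tsc_hz != 0);
    /* Normalise into (1e9, 2e9] so the multiplier fits in 32 bits. */
    while (tsc_hz > 2 * NSEC_PER_SEC) { tsc_hz >>= 1; s--; }
    while (tsc_hz <= NSEC_PER_SEC)    { tsc_hz <<= 1; s++; }
    *mul = (uint32_t)((NSEC_PER_SEC << 32) / tsc_hz);
    *shift = s;
}

/* Compute (shifted delta * mul) >> 32 without a 128-bit type. */
static uint64_t scale_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;
    return (delta >> 32) * mul + (((delta & 0xffffffffULL) * mul) >> 32);
}

/* Host side: publish a new time base, e.g. after the frequency changed. */
void host_update(struct shared_time_info *info, uint64_t tsc_now,
                 uint64_t ns_now, uint32_t mul, int8_t shift)
{
    atomic_fetch_add_explicit(&info->version, 1, memory_order_relaxed); /* odd */
    atomic_thread_fence(memory_order_release);
    info->tsc_timestamp  = tsc_now;
    info->system_time_ns = ns_now;
    info->tsc_to_ns_mul  = mul;
    info->tsc_shift      = shift;
    atomic_fetch_add_explicit(&info->version, 1, memory_order_release); /* even */
}

/* Guest side: plain memory reads only; retry if an update raced with us. */
uint64_t guest_read_ns(struct shared_time_info *info, uint64_t tsc_now)
{
    uint32_t v1, v2;
    uint64_t ns;

    assert(info != NULL);
    do {
        v1 = atomic_load_explicit(&info->version, memory_order_acquire);
        ns = info->system_time_ns +
             scale_delta(tsc_now - info->tsc_timestamp,
                         info->tsc_to_ns_mul, info->tsc_shift);
        atomic_thread_fence(memory_order_acquire);
        v2 = atomic_load_explicit(&info->version, memory_order_relaxed);
    } while ((v1 & 1) || v1 != v2);
    return ns;
}
```

Once host_update() publishes a new multiplier (say, after a live
migration to a host with a different TSC rate), the next
guest_read_ns() call uses it automatically, with no notification
interrupt involved. A real implementation would follow the
hypervisor's documented layout and barrier rules rather than this
simplified one.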

The proposed solution doesn't include a TSC frequency change
notification mechanism.

This is part of the problem with this sort of approach to
standardization. It's hard to come up with the best interface at
first. You have to try a couple ways, and then everyone can eventually
standardize on the best one if one ever emerges.

Regards,
Anthony Liguori
> Paravirtualization is a symptom of an architectural problem. We should
> always be trying to fix the architecture first.
>
> Zach
>
>