From: Alok Kataria <akataria@vmware.com>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Ingo Molnar <mingo@elte.hu>, "H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	"avi@redhat.com" <avi@redhat.com>,
	Rusty Russell <rusty@rustcorp.com.au>,
	Zach Amsden <zach@vmware.com>, Daniel Hecht <dhecht@vmware.com>,
	"Jun.Nakajima@Intel.Com" <Jun.Nakajima@Intel.Com>,
	Tim Deegan <Tim.Deegan@citrix.com>
Subject: Re: Use CPUID to communicate with the hypervisor.
Date: Fri, 26 Sep 2008 20:11:19 -0700
Message-ID: <1222485079.23825.13.camel@alok-dev1>
In-Reply-To: <48DD860C.50809@goop.org>

Hi Jeremy,

Please see my comments below.

On Fri, 2008-09-26 at 18:02 -0700, Jeremy Fitzhardinge wrote:
> Alok Kataria wrote:
> > From: Alok N Kataria <akataria@vmware.com>
> >
> > This patch proposes to use a cpuid interface to detect if we are running on a
> > hypervisor.
> > The presence of a hypervisor is indicated by bit 31 of CPUID#1_ECX, which is
> > defined to be the "hypervisor present bit". For a VM the bit is 1, otherwise it
> > is 0. This bit is not officially documented by Intel or AMD yet, but they plan
> > to do so soon; in the meantime they have promised to keep it reserved for
> > virtualization.
> >
> > Also, Intel & AMD have reserved the cpuid levels 0x40000000 - 0x400000FF for
> > software use. Hypervisors can use these levels to provide an interface to pass
> > information from the hypervisor to the guest. This is similar to how we extract
> > information about a physical cpu by using cpuid.
> > XEN/KVM are already using the info leaf to get the hypervisor signature.
> >
> > VMware hardware version 7 defines some of these cpuid levels, below is a brief
> > description about those. These levels can be implemented by other hypervisors
> > too so that Linux has a standard way of communicating to any hypervisor.
> >
> > Leaf 0x40000000, Hypervisor CPUID information
> > # EAX: The maximum input value for hypervisor CPUID info (0x40000010).
> > # EBX, ECX, EDX: Hypervisor vendor ID signature. E.g. "VMwareVMware"
> >
> > Leaf 0x40000010,  Timing information.
> > # EAX: (Virtual) TSC frequency in kHz.
> > # EBX: (Virtual) Bus (local apic timer) frequency in kHz.
> > # ECX, EDX: RESERVED
> >
> 
> I'm sympathetic to the idea, but it seems a bit under-defined.
> 
> Are you leaving a gap between 0x40000000 and -10 for what?  Future
> extension?  Avoiding existing hypervisor-specific leaves?

Avoiding existing leaves.
Microsoft's hypervisor is using levels 0x40000000 - 0x40000005.
The first two are standard levels and the rest are specific to the
Microsoft hypervisor. So we started with 0x40000010.
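
(Illustrative sketch, not part of the patch: one way a guest might decode
the values described above, namely bit 31 of CPUID#1_ECX and the 12-byte
vendor signature returned in EBX, ECX, EDX of leaf 0x40000000. The names
hv_present_bit() and hv_decode_signature() are made up for this example,
and the register values are passed in as arguments so the sketch does not
assume any particular CPUID wrapper.)

```c
#include <stdint.h>

/* Hypothetical helper: bit 31 of CPUID leaf 1 ECX is the "hypervisor present" bit. */
static int hv_present_bit(uint32_t cpuid1_ecx)
{
	return (cpuid1_ecx >> 31) & 1;
}

/*
 * Hypothetical helper: rebuild the 12-character vendor signature from
 * leaf 0x40000000. Each register holds four characters, lowest byte
 * first, in the order EBX, ECX, EDX. 'out' must have room for 13 bytes.
 */
static void hv_decode_signature(uint32_t ebx, uint32_t ecx, uint32_t edx,
				char out[13])
{
	const uint32_t regs[3] = { ebx, ecx, edx };
	int i, j;

	for (i = 0; i < 3; i++)
		for (j = 0; j < 4; j++)
			out[i * 4 + j] = (char)((regs[i] >> (8 * j)) & 0xff);
	out[12] = '\0';
}
```

With EBX=0x61774d56, ECX=0x4d566572, EDX=0x65726177 the decoded string is
"VMwareVMware".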

> 
> I think there's a move towards doing a scan for a signature, such as
> checking every 16 leaves after 0x40000000 for "a while" looking for
> interesting signatures, so that a hypervisor can support multiple ABIs
> at once.  Given this, it would be better to define a "Generic Hypervisor
> ABI" signature, and put all the related leaves together.

Hmm, interesting. Do you have any pointers to this?
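
(Illustrative sketch only, to make the scanning idea concrete: probe every
16th leaf starting at 0x40000000 and compare the 12 signature bytes against
a wanted ABI signature. Everything named here is an assumption for the
example: the cpuid_fn callback, hv_find_abi(), the demo_cpuid() stand-in,
and the "GenericHVABI" string, which is not a defined signature. Limiting
the scan to the reserved range 0x40000000 - 0x400000FF is also a choice
made for the sketch, since "a while" is not specified.)

```c
#include <stdint.h>
#include <string.h>

struct cpuid_regs { uint32_t eax, ebx, ecx, edx; };

/* Caller-supplied accessor; in a real guest it would wrap the CPUID instruction. */
typedef struct cpuid_regs (*cpuid_fn)(uint32_t leaf);

/* Pack four characters into a register value, lowest byte first. */
static uint32_t pack4(const char *s)
{
	return (uint32_t)(unsigned char)s[0] |
	       (uint32_t)(unsigned char)s[1] << 8 |
	       (uint32_t)(unsigned char)s[2] << 16 |
	       (uint32_t)(unsigned char)s[3] << 24;
}

/*
 * Scan leaves 0x40000000, 0x40000010, ..., 0x400000F0 for a 12-byte
 * signature in EBX, ECX, EDX. Returns the matching base leaf, or 0.
 */
static uint32_t hv_find_abi(cpuid_fn cpuid, const char *sig)
{
	uint32_t base;

	for (base = 0x40000000u; base <= 0x400000F0u; base += 0x10) {
		struct cpuid_regs r = cpuid(base);

		if (r.ebx == pack4(sig) && r.ecx == pack4(sig + 4) &&
		    r.edx == pack4(sig + 8))
			return base;
	}
	return 0;
}

/* Stand-in for a hypervisor exposing two ABIs; used only to exercise the scan. */
static struct cpuid_regs demo_cpuid(uint32_t leaf)
{
	struct cpuid_regs r = { 0, 0, 0, 0 };
	const char *sig = NULL;

	if (leaf == 0x40000000u)
		sig = "VMwareVMware";
	else if (leaf == 0x40000010u)
		sig = "GenericHVABI";	/* hypothetical signature */
	if (sig) {
		r.ebx = pack4(sig);
		r.ecx = pack4(sig + 4);
		r.edx = pack4(sig + 8);
	}
	return r;
}
```

Against demo_cpuid(), hv_find_abi() returns 0x40000010 for the hypothetical
generic signature and 0 for a signature that no leaf reports.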
> 
> And then, rather than having a simple "maximum leaf", it would be better
> to have cap bits for each specific feature.  For example, how would the
> "RESERVED" registers in "Timing information" ever get used?  How would
> you know that they were no longer reserved, but now meaningful?

The unused (reserved) registers are set to zero right now; whenever the
need arises we can define a meaningful value for them and start using it.

> 
> That said, I'm a bit worried about the whole idea of having these kinds
> of timing parameters.  It does assume that they're constant for the
> whole life of the VM.  What if they change due to power management or
> migration?

For power management, the trend, even on native hardware, is toward a
constant rate TSC. So I don't see this as a big concern; after all, a
virtual cpu should be able to virtualize the TSC as constant rate even
when the underlying TSC is not (by trapping out).  And since a
non-constant TSC is only found on older processors, this seems
acceptable.  In other words, my feeling is we should think of the
cpu-scaling issues as a legacy issue and not optimize the interface for
it.

As for live migration, for full-virt we think that it should happen
invisibly to the guest.  So even if we move to a host with a different
TSC frequency, it should be the job of the hypervisor to keep emulating
the old frequency.
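
(Illustrative sketch, not from the patch: if the frequency stays fixed for
the life of the VM as argued above, a guest could read the kHz value from
leaf 0x40000010 EAX once and use it to convert TSC cycle counts to time.
The helper name tsc_cycles_to_ns() is made up for this example.)

```c
#include <stdint.h>

/*
 * Convert a TSC cycle count to nanoseconds, given the TSC frequency in
 * kHz. One cycle lasts 1000000 / tsc_khz nanoseconds. The count is split
 * into quotient and remainder so the intermediate products stay small
 * for realistic frequencies. Returns 0 if no frequency was reported.
 */
static uint64_t tsc_cycles_to_ns(uint64_t cycles, uint32_t tsc_khz)
{
	if (tsc_khz == 0)
		return 0;
	return (cycles / tsc_khz) * 1000000ull +
	       (cycles % tsc_khz) * 1000000ull / tsc_khz;
}
```

For example, 2400000 cycles at 2400000 kHz (2.4 GHz) comes out as
1000000 ns, i.e. one millisecond.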

Thanks,
Alok

> 
>     J

