public inbox for linux-kernel@vger.kernel.org
From: Zachary Amsden <zach@vmware.com>
To: Chris Wright <chrisw@sous-sol.org>
Cc: Linus Torvalds <torvalds@osdl.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Virtualization Mailing List <virtualization@lists.osdl.org>,
	Xen-devel <xen-devel@lists.xensource.com>,
	Andrew Morton <akpm@osdl.org>, Dan Hecht <dhecht@vmware.com>,
	Dan Arai <arai@vmware.com>, Anne Holler <anne@vmware.com>,
	Pratap Subrahmanyam <pratap@vmware.com>,
	Christopher Li <chrisl@vmware.com>,
	Joshua LeVasseur <jtl@ira.uka.de>, Rik Van Riel <riel@redhat.com>,
	Jyothy Reddy <jreddy@vmware.com>, Jack Lo <jlo@vmware.com>,
	Kip Macy <kmacy@fsmware.com>, Jan Beulich <jbeulich@novell.com>,
	Ky Srinivasan <ksrinivasan@novell.com>,
	Wim Coekaerts <wim.coekaerts@oracle.com>,
	Leendert van Doorn <leendert@watson.ibm.com>
Subject: Re: [RFC, PATCH 3/24] i386 Vmi interface definition
Date: Mon, 13 Mar 2006 19:16:38 -0800
Message-ID: <44163596.50207@vmware.com>
In-Reply-To: <20060314003956.GE12807@sorel.sous-sol.org>

Chris Wright wrote:
> * Zachary Amsden (zach@vmware.com) wrote:
>   
>> Master definition of VMI interface, including calls, constants, and
>> interface version.
>>     
>
>   
>> +/* VROM call table definitions */
>> +#define VROM_CALL_LEN             32
>> +
>> +typedef struct VROMCallEntry {
>> +   char f[VROM_CALL_LEN];
>> +} VROMCallEntry;
>>     
>
> And the call entry is meant to be handled in whatever mechanism hypervisor
> prefers for its entry points (ABI constraints notwithstanding) as in,
> arbitrary software interrupt, or call gate, etc?  I guess for transparent
> it has to, since those would be local calls.   Quite similar to the
> hypercall entry point that Xen places on the hypercall_page, so easily
> compatible.

See below.

> The document is slightly more descriptive.  The above reserved slots
> are shown as:
>
> 	char		reserved[32];
> 	char		elfHeader[64];
>
> But that's only 3 (0-2).  I think I'm missing some small bit of magic.
>
>   
>> +typedef struct VROMCallTable {
>> +   VROMCallEntry    vromCall[128];           // @ 0x80: ROM calls 4-127
>> +} VROMCallTable;
>>     
>
> That comment eludes me.  Are 0-3 special somehow (IOW, I thought it was
> just 0-2 as per above), and is it suggesting int 0x80?
>   

Yeah, most of this is rather crufty - it is in transition.  We had a
full-blown ROM image with 32-byte aligned stub points at one point for
all of the VMI calls.  In fact, it still is in that form, and it was
required to overlap exactly with a native ROM that was built into
Linux.  See patch 6, VMI magic fixes, for the details.

The machinery is already in place to do this, and it is a very nice 
thing that Xen has decided to adopt a similar approach (to the ROM) of 
publishing hypervisor code.  I think they even use 32-byte alignment as 
well.  The power of an indirection at this layer is just too attractive, 
and once you decide on a single binary image, I think it is inevitable 
that everyone will converge on the same idea.  The same concept surfaces 
over and over - vsyscall being another example.  You're basically 
dynamically linking in code at runtime, which is a pretty common thing 
to do, and gives you a very powerful redirection interface.
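
To make the slot layout concrete, here is a minimal sketch of how a
32-byte-aligned call table can be addressed.  The two structs are the
ones from the quoted patch; the helpers vrom_call_offset() and
vrom_call_addr() are hypothetical names used only for illustration and
are not part of the VMI patches.

```c
#include <stddef.h>

#define VROM_CALL_LEN 32

/* One fixed-size code slot per call, as in the quoted patch. */
typedef struct VROMCallEntry {
    char f[VROM_CALL_LEN];
} VROMCallEntry;

typedef struct VROMCallTable {
    VROMCallEntry vromCall[128];
} VROMCallTable;

/* Hypothetical helper: byte offset of call slot n from the table base. */
static size_t vrom_call_offset(unsigned n)
{
    return offsetof(VROMCallTable, vromCall)
         + (size_t)n * sizeof(VROMCallEntry);
}

/* Hypothetical helper: address of the code published in slot n. */
static const void *vrom_call_addr(const VROMCallTable *rom, unsigned n)
{
    return rom->vromCall[n].f;
}
```

With 32-byte slots, slot 4 starts at byte offset 4 * 32 = 0x80, which
is consistent with reading the "@ 0x80" in the patch comment as a byte
offset into the table rather than as an interrupt vector.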

But we discovered that we couldn't achieve native performance, even 
using directly linked calls into the ROM.  Calling out for sti / cli / 
popf / pushf on fast paths is just too expensive.  We had to inline the 
native code to match native performance.  And this leaves an opportunity 
for flexibility and performance in the hypervisor implementation.

You don't want direct calls to 32-byte stubs; in fact, as Joshua found, 
you can't get optimal performance in a hypervisor that way.  What you 
need is the ability to preferentially inline certain hot calls into the 
VMI layer by using NOP padding.  The non-hot calls can call out to the 
hypercall page.
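
As a rough illustration of the NOP-padding idea, the sketch below
patches a fixed-size call site in a byte buffer: if the replacement
code fits, it is copied inline and the rest of the site is padded with
NOPs; otherwise a near call to an out-of-line stub is emitted.  The
8-byte site size, the patch_site() name, and the addresses are
assumptions made for the example, not the mechanism in the VMI patches.

```c
#include <stdint.h>
#include <string.h>

#define SITE_LEN       8     /* hypothetical padded call-site size */
#define X86_NOP        0x90  /* one-byte NOP */
#define X86_CALL_REL32 0xE8  /* near call, 32-bit relative displacement */

/*
 * Patch one call site.  Returns 1 if the code was inlined, 0 if a call
 * to stub_addr was emitted instead.  site_addr is the address the site
 * will run at; it is only used to compute the call displacement.
 */
static int patch_site(uint8_t site[SITE_LEN], uint32_t site_addr,
                      const uint8_t *code, size_t code_len,
                      uint32_t stub_addr)
{
    /* Start from all NOPs so any unused tail is harmless padding. */
    memset(site, X86_NOP, SITE_LEN);

    if (code != NULL && code_len > 0 && code_len <= SITE_LEN) {
        memcpy(site, code, code_len);   /* hot path: inline the code */
        return 1;
    }

    /* Cold path: displacement is relative to the end of the 5-byte call. */
    uint32_t rel = stub_addr - (site_addr + 5);
    site[0] = X86_CALL_REL32;
    site[1] = (uint8_t)(rel & 0xff);    /* little-endian immediate */
    site[2] = (uint8_t)((rel >> 8) & 0xff);
    site[3] = (uint8_t)((rel >> 16) & 0xff);
    site[4] = (uint8_t)((rel >> 24) & 0xff);
    return 0;
}
```

For a one-byte instruction such as cli (opcode 0xFA), the site ends up
as the instruction followed by seven NOPs, so the fast path pays no
call overhead; a call that does not fit keeps the indirection.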

Publishing code is the first step - inlining is the second, and it gets 
you back the hit you took when indirecting your fast paths.

Zach

