Hypervisor architecture?

* Hypervisor architecture?
@ 2010-05-04 19:25 Etienne Martineau
  2010-05-04 20:33 ` Jeremy Fitzhardinge
  2010-05-04 23:20 ` TSFH
  0 siblings, 2 replies; 6+ messages in thread
From: Etienne Martineau @ 2010-05-04 19:25 UTC (permalink / raw)
  To: xen-devel

Are there any documents around that describe the 'internal' architecture
of the Hypervisor?

So far my understanding is that the Hypervisor is a 'stripped-down'
Linux 2.6.(12/13) + lots of customization specific to Xen.

Because there is so much documentation around for the Linux kernel I
think it would be _nice_ to have a 'conceptual mapping' between the
Hypervisor and the Linux kernel....

For example:
-Under Linux, processes are described by 'struct task_struct'
-Under Hypervisor, VMs are described by 'struct domain'

What is common?
What is different?

For those who are already familiar with the Linux kernel there would be
a net advantage when trying to 'pick up' on how things are done within
the hypervisor.

-Etienne

* Re: Hypervisor architecture?
  2010-05-04 19:25 Hypervisor architecture? Etienne Martineau
@ 2010-05-04 20:33 ` Jeremy Fitzhardinge
  2010-05-04 23:36   ` Etienne Martineau
  2010-05-04 23:20 ` TSFH
  1 sibling, 1 reply; 6+ messages in thread
From: Jeremy Fitzhardinge @ 2010-05-04 20:33 UTC (permalink / raw)
  To: Etienne Martineau; +Cc: xen-devel

On 05/04/2010 12:25 PM, Etienne Martineau wrote:
> Are there any documents around that describe the 'internal' architecture
> of the Hypervisor?
>
> So far my understanding is that the Hypervisor is a 'stripped-down'
> Linux 2.6.(12/13) + lots of customization specific to Xen.
>   

Xen and Linux diverged a long time ago, early in the Linux 2.6 series,
with occasional feature-specific transfusions from Linux into Xen
since.  There are some similarities in how they deal with platform
issues (APICs, etc), but they're mostly different now.

> What is common?
> What is different?
>   

A vcpu is akin to a task; a domain is like a process (where multi-vcpu =
multi-threaded).  Events are a bit like signals (pending events are
checked for on return to guest mode and the return is redirected to the
event handler).  Hypercalls are very similar to syscalls.  Memory
management is largely different.

One of the big differences is that Xen doesn't have a per-vcpu
hypervisor (kernel) stack, and vcpus don't have a hypervisor context. 
While they're actually running in the hypervisor they use a pcpu stack,
but if it blocks/deschedules then it must always return to guest
context, saving away enough info to continue what it was doing when it
re-enters Xen.  Once you get your head around that, a lot of things
become much clearer.
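
A rough illustration of the event analogy, as a toy C sketch rather
than Xen code. Every name in it (toy_vcpu, toy_send_event,
toy_return_to_guest) is hypothetical, and masking events while the
handler runs is an assumption of the model, not something stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names are Xen's real ones. */
struct toy_vcpu {
    uint64_t pending_events;    /* one bit per event port */
    uint64_t guest_ip;          /* where the guest would resume */
    uint64_t event_handler_ip;  /* callback registered by the guest */
    uint64_t interrupted_ip;    /* saved so the handler can return */
    bool     events_masked;     /* assumption: masked while handler runs */
};

/* Mark an event pending. Nothing is delivered yet. */
void toy_send_event(struct toy_vcpu *v, unsigned port)
{
    assert(port < 64);
    v->pending_events |= (uint64_t)1 << port;
}

/*
 * Called on the way back to guest mode. If an event is pending and not
 * masked, the return is redirected to the handler, much as a pending
 * signal redirects a return to user mode. Returns the ip to resume at.
 */
uint64_t toy_return_to_guest(struct toy_vcpu *v)
{
    if (v->pending_events != 0 && !v->events_masked) {
        v->interrupted_ip = v->guest_ip;
        v->events_masked = true;
        v->guest_ip = v->event_handler_ip;
    }
    return v->guest_ip;
}
```

The point of the sketch is only that sending an event sets a bit, and
the redirection happens later, at the return-to-guest boundary.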

    J

* Re: Hypervisor architecture?
  2010-05-04 20:33 ` Jeremy Fitzhardinge
@ 2010-05-04 23:36   ` Etienne Martineau
  2010-05-04 23:58     ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 6+ messages in thread
From: Etienne Martineau @ 2010-05-04 23:36 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel

On Tue, 2010-05-04 at 13:33 -0700, Jeremy Fitzhardinge wrote:
> On 05/04/2010 12:25 PM, Etienne Martineau wrote:
> > Are there any documents around that describe the 'internal' architecture
> > of the Hypervisor?
> >
> > So far my understanding is that the Hypervisor is a 'stripped-down'
> > Linux 2.6.(12/13) + lots of customization specific to Xen.
> >   
> 
> Xen and Linux diverged a long time ago, early in the Linux 2.6 series,
> with occasional feature-specific transfusions from Linux into Xen
> since.  There are some similarities in how they deal with platform
> issues (APICs, etc), but they're mostly different now.
> 
> > What is common?
> > What is different?
> >   
> 
> A vcpu is akin to a task; a domain is like a process (where multi-vcpu =
> multi-threaded).
Does the scheduler schedule 'vcpu' or 'domain'?

>   Events are a bit like signals (pending events are
> checked for on return to guest mode and the return is redirected to the
> event handler).  Hypercalls are very similar to syscalls.
Fair enough;

>   Memory
> management is largely different.
I see two modes of operation?
(A) HVM; it looks like the Hypervisor is doing 'paging' and maintains
sPTEs.
(B) PV; the guest has access to %cr3 and it seems to me that the
Hypervisor is not involved in the fast path?

I still haven't figured out the 'page_fault' path all the way up to the
guest...

> 
> One of the big differences is that Xen doesn't have a per-vcpu
> hypervisor (kernel) stack, and vcpus don't have a hypervisor context. 
> While they're actually running in the hypervisor they use a pcpu stack,
> but if it blocks/deschedules then it must always return to guest
> context, saving away enough info to continue what it was doing when it
> re-enters Xen.
What is the rationale behind this?

Is it a matter of optimization? Since the Hypervisor doesn't 'execute'
code on behalf of the Guest, there is no requirement for a per-vcpu
hypervisor stack? {How about system calls then?}

Or maybe a matter of features? It would be inconvenient to have a
Hypervisor context associated with a VM when trying to migrate to
another environment?

>   Once you get your head around that, a lot of things
> become much clearer.
> 
>     J

I see this as _valuable_ information for Xen newbies (like me). I think
it would be good to have a 'xen/Documentation' folder to capture
Hypervisor specific information?

Correct me if I'm wrong but it looks like the existing 'docs/*' is
geared toward 'operation' rather than internal implementation.

I volunteer to help if needed.


Thanks for your reply Jeremy.

-Etienne

* Re: Hypervisor architecture?
  2010-05-04 23:36   ` Etienne Martineau
@ 2010-05-04 23:58     ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 6+ messages in thread
From: Jeremy Fitzhardinge @ 2010-05-04 23:58 UTC (permalink / raw)
  To: Etienne Martineau; +Cc: xen-devel

On 05/04/2010 04:36 PM, Etienne Martineau wrote:
>> A vcpu is akin to a task; a domain is like a process (where multi-vcpu =
>> multi-threaded).
>>     
> Does the scheduler schedule 'vcpu' or 'domain'?
>   

vcpus; domains aren't schedulable entities.  They're akin to a Linux mm.
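
To make that concrete, here is a toy C sketch (not Xen code; demo_domain,
demo_vcpu and demo_pick_next are hypothetical names, and plain
round-robin is just the simplest policy to show). The run queue links
vcpus only; a domain is reachable through a back-pointer, in the way a
Linux task points at its mm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container: shared state only, never put on a run queue. */
struct demo_domain {
    int domid;
};

/* Hypothetical schedulable entity. */
struct demo_vcpu {
    struct demo_domain *domain;  /* back-pointer, like task->mm */
    int  vcpu_id;
    bool runnable;
    struct demo_vcpu *next;      /* circular run-queue link */
};

/*
 * Round-robin pick over a circular list: start with the vcpu after
 * 'current', consider 'current' itself last, and return the first
 * runnable vcpu, or NULL if none is runnable.
 */
struct demo_vcpu *demo_pick_next(struct demo_vcpu *current)
{
    assert(current != NULL);

    struct demo_vcpu *v = current;
    do {
        v = v->next;
        if (v->runnable)
            return v;
    } while (v != current);

    return NULL;
}
```

Note that the picker never looks at demo_domain at all; two vcpus of the
same domain and two vcpus of different domains are treated alike.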

>>   Memory
>> management is largely different.
>>     
> I see two modes of operation?
> (A) HVM; it looks like the Hypervisor is doing 'paging' and maintains
> sPTEs.
>   

Yes, though with hap (hardware-assisted paging) it doesn't need to
maintain a shadow pagetable.

> (B) PV; the guest has access to %cr3 and it seems to me that the
> Hypervisor is not involved in the fast path?
>   

Yes and no.  The guest-visible cr3 is the "real" cr3, but the hypervisor
controls what it's being loaded with, and mediates every pte update. 
But the hypervisor doesn't need to do anything to satisfy a tlb miss.

With hvm+shadow, pte updates could be simple memory writes, but the
hypervisor needs to use tlb flushes to sync the guest and shadow
pagetables, and possibly rebuild the shadow pagetable on "tlb miss".

hvm+hap needs much less hypervisor intervention overall.  The main
downside is that tlb misses can be much more expensive in the processor.
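
A toy C sketch of what "mediates every pte update" can mean in the PV
case. This is not Xen code: toy_pte_update, the frames[] table and the
two checks shown (frame ownership, no writable mapping of a pagetable
frame) are assumptions chosen for illustration, and the real rules are
more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT   0x1u
#define PTE_WRITABLE  0x2u
#define NR_FRAMES     8      /* tiny machine, for the example only */

/* Hypothetical per-frame bookkeeping kept by the hypervisor. */
struct frame_info {
    int  owner;          /* domain id that owns the frame */
    bool is_pagetable;   /* frame is currently in use as a pagetable */
};

struct frame_info frames[NR_FRAMES];

/*
 * The guest asks for a pte to be changed; the hypervisor checks the
 * request and performs the write itself. Returns 0 on success and -1
 * if the update is refused (the pte is then left untouched).
 */
int toy_pte_update(int domid, uint64_t *pte, uint64_t new_val)
{
    assert(pte != NULL);

    if (new_val & PTE_PRESENT) {
        uint64_t mfn = new_val >> 12;   /* frame number in the pte */

        if (mfn >= NR_FRAMES)
            return -1;                  /* no such frame */
        if (frames[mfn].owner != domid)
            return -1;                  /* not the caller's memory */
        if (frames[mfn].is_pagetable && (new_val & PTE_WRITABLE))
            return -1;                  /* would bypass mediation */
    }

    *pte = new_val;
    return 0;
}
```

Once an entry has passed the checks, the hardware walks the real
pagetable directly, which is why nothing is needed on a tlb miss.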

>> One of the big differences is that Xen doesn't have a per-vcpu
>> hypervisor (kernel) stack, and vcpus don't have a hypervisor context. 
>> While they're actually running in the hypervisor they use a pcpu stack,
>> but if it blocks/deschedules then it must always return to guest
>> context, saving away enough info to continue what it was doing when it
>> re-enters Xen.
>>     
> What is the rationale behind this?
>
> Is it a matter of optimization? Since the Hypervisor doesn't 'execute'
> code on behalf of the Guest, there is no requirement for a per-vcpu
> hypervisor stack? {How about system calls then?}
>
> Or maybe a matter of features? It would be inconvenient to have a
> Hypervisor context associated with a VM when trying to migrate to
> another environment?
>   

I think the idea is that hypercalls should always have a deterministic
execution time; they always return to the guest's control within a
certain time period, though they may not have completed their job yet.
It is another way of dealing with "slow syscalls" and with syscall
interrupt/restart.

I don't know how deterministic hypercall timing is these days, but they
definitely can't hang around in the hypervisor indefinitely without
tying up a pcpu.
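
A toy C sketch of that pattern (not Xen code; toy_op, toy_hypercall,
TOY_BUDGET and the return codes are all hypothetical). The call does a
bounded amount of work per entry, keeps its progress in a caller-visible
structure instead of on a hypervisor stack, and asks to be re-issued if
it has not finished.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BUDGET 4   /* max items handled per entry; arbitrary value */
#define TOY_DONE   0   /* operation finished */
#define TOY_AGAIN  1   /* not finished: caller should re-issue the call */

/* Hypothetical operation state, saved across entries. */
struct toy_op {
    size_t        next;   /* progress cursor */
    size_t        total;  /* number of items to process */
    unsigned long sum;    /* result accumulated so far */
};

/*
 * Process at most TOY_BUDGET items, then return to the caller whether
 * or not the job is complete. Re-entering with the same toy_op resumes
 * from op->next.
 */
int toy_hypercall(struct toy_op *op, const unsigned *items)
{
    assert(op != NULL && items != NULL);

    size_t budget = TOY_BUDGET;
    while (op->next < op->total) {
        if (budget == 0)
            return TOY_AGAIN;
        budget--;
        op->sum += items[op->next++];
    }
    return TOY_DONE;
}
```

A caller would simply loop while the result is TOY_AGAIN. Because
nothing lives on a stack between entries, the vcpu can be descheduled at
any of those boundaries.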

> I see this as _valuable_ information for Xen newbies (like me). I think
> it would be good to have a 'xen/Documentation' folder to capture
> Hypervisor specific information?
>
> Correct me if I'm wrong but it looks like the existing 'docs/*' is
> geared toward 'operation' rather than internal implementation.
>
> I volunteer to help if needed.

I'm sure patches will be accepted.

    J

* Re: Hypervisor architecture?
  2010-05-04 19:25 Hypervisor architecture? Etienne Martineau
  2010-05-04 20:33 ` Jeremy Fitzhardinge
@ 2010-05-04 23:20 ` TSFH
  2010-05-07 15:05   ` George Dunlap
  1 sibling, 1 reply; 6+ messages in thread
From: TSFH @ 2010-05-04 23:20 UTC (permalink / raw)
  To: Etienne Martineau; +Cc: xen-devel

On Tue, 2010-05-04 at 12:25 -0700, Etienne Martineau wrote:
>
> What is common?
> What is different?
> 

There is a book by David Chisnall that you may find valuable:

The Definitive Guide to the Xen Hypervisor
ISBN-10: 013234971X
ISBN-13: 978-0132349710

The book covers in great detail the inner workings of Xen 3.0. Not sure
how much of it is still valid in today's Xen. I know David was working
on errata for a second printing, covering Xen 3.1, but he has postponed
work on the book for the past year as the sales (unsurprisingly)
have barely reached 3k copies.

It would also be useful if someone on this list who has read the book
could enlighten the community WRT possible caveats about the book or
parts that no longer resemble the current implementation, as it appears
Xen has changed considerably since the book was released.

-- 
TSFH <taskstructfromhell@googlemail.com>
Context Hell, Inc.

* Re: Hypervisor architecture?
  2010-05-04 23:20 ` TSFH
@ 2010-05-07 15:05   ` George Dunlap
  0 siblings, 0 replies; 6+ messages in thread
From: George Dunlap @ 2010-05-07 15:05 UTC (permalink / raw)
  To: taskstructfromhell; +Cc: xen-devel, Etienne Martineau

On Tue, May 4, 2010 at 6:20 PM, TSFH <taskstructfromhell@googlemail.com> wrote:
> It would also be useful if someone on this list who has read the book
> could enlighten the community WRT possible caveats about the book or
> parts that no longer resemble the current implementation, as it appears
> Xen has changed considerably since the book was released.

I haven't read the book, but we have had several people come to the
list to try to get help getting the "pluggable scheduler" example from
the book working.  The two caveats are:
* The example code, as it's written in the book, doesn't actually work
* The scheduler is actually a lot harder than it looks.  It's at one
of the very lowest layers.  If there's a bug, the whole system hangs
(as opposed to just returning an error message).  Debugging it takes a
lot of skill.  So it's not a good place to start playing around with
Xen. :-)

 -George
