From: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
To: carsteno-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org
Cc: "kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org"
	<kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org>,
	Christian Borntraeger
	<cborntra-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>,
	mschwid2-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org
Subject: Re: [PATCH/PFC 0/2] s390 host support
Date: Sun, 29 Apr 2007 13:48:14 +0300
Message-ID: <463477EE.3000406@qumranet.com>
In-Reply-To: <4634726F.10705-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>

Carsten Otte wrote:
>
> Avi Kivity wrote:
>> We'll want to keep a vcpu fd.  If the vcpu is idle we'll be asleep in 
>> poll() or the like, and we need some kind of wakeup mechanism.
> Our userspace does idle/wakeup differently:
> One cpu exits sys_s390host_sie, and the intercept code indicates a 
> halt with interrupts enabled (cpu idle loop). Now userland parks our 
> vcpu thread in pthread_cond_wait. Once we want to wake up this thread,
> either by interprocessor signal (need_resched and such) or due to an
> IO interrupt, we do a pthread_cond_signal to wake the thread up again.
> The thread will now enter sys_s390host_sie, and after entering the 
> vcpu context will execute the interrupt handler first.
> The advantage I see in waiting in userland is that userspace can dump
> interrupts to idle CPUs without kernel intervention. On the other
> hand, my brain hurts when thinking about userland passing vcpu fds to 
> other threads/processes and when thinking about sys_fork().

In both cases you wait in the kernel: with an fd you sleep in poll() or
the like, and with pthread_cond_wait you sleep in futex(FUTEX_WAIT) or a
close relative.

Can one do the equivalent of a futex wakeup from the kernel easily?
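
To make the comparison concrete, here is a minimal userspace sketch of the
two waiting styles. It is only an illustration: the pipe stands in for a
vcpu fd, and the names wait_on_fd, wait_on_condition, and demo are
hypothetical, not part of any kvm or s390host interface. In both variants
the waiting thread ends up blocked inside the kernel.

```python
import os
import select
import threading


def wait_on_fd(read_fd, timeout_ms=2000):
    """Sleep in poll() until read_fd is readable; return the wakeup byte."""
    poller = select.poll()
    poller.register(read_fd, select.POLLIN)
    if not poller.poll(timeout_ms):
        return None  # timed out, nobody woke us
    return os.read(read_fd, 1)


def wait_on_condition(cond, state, timeout_s=2.0):
    """Park on a condition variable until state['pending'] is set."""
    with cond:
        return cond.wait_for(lambda: state["pending"], timeout=timeout_s)


def demo():
    results = {}

    # Style 1: fd-based wakeup. The pipe is an assumed stand-in for a vcpu fd.
    read_fd, write_fd = os.pipe()
    t = threading.Thread(
        target=lambda: results.__setitem__("fd", wait_on_fd(read_fd)))
    t.start()
    os.write(write_fd, b"I")  # "interrupt": makes the fd readable
    t.join()
    os.close(read_fd)
    os.close(write_fd)

    # Style 2: condition-variable wakeup (the pthread_cond_wait analogue).
    cond = threading.Condition()
    state = {"pending": False}
    t = threading.Thread(
        target=lambda: results.__setitem__(
            "cond", wait_on_condition(cond, state)))
    t.start()
    with cond:
        state["pending"] = True
        cond.notify()
    t.join()
    return results
```

The fd variant can be multiplexed with other fds in a single poll() call,
while the condition variable variant needs no file descriptor at all; which
of the two matters more depends on how userspace is structured.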

> In the end, you make the decision and we'll follow your lead.
>

My primary concern is not to lock userspace into one way of working.  
This is really another sad side effect of the kernel providing a 
bazillion sleep/wakeup methods.

>> I guess some of the difference stems from the fact that on x86, the 
>> Linux pagetables are actually the hardware pagetables.  VT and SVM 
>> use a separate page table for the guest which cannot be shared with 
>> the host. This means that
>>
>> - we need to teach the Linux mm to look at shadow page tables when 
>> transferring dirty bits
>> - when Linux wants to write protect a page, it has to modify the 
>> shadow page tables too (and flush the guest tlbs, which is again a 
>> bit different)
>> - this means rmap has to be extended to include kvm
>>
>> I think that non-x86 architectures have purely software page tables;
>> maybe this makes things easier.
> We do use hardware page tables too. Our hardware does know about
> multiple levels of page translation, and does its part of maintaining
> different sets of dirty/reference bits for guest and host while 
> running in the virtual machine context. This process is transparent 
> for both virtual machine and host.

Nested page tables/extended page tables also provide this facility, with 
some caveats:

- on 32-bit hosts (or 64-bit hosts with 32-bit userspace), host 
userspace virtual address space is not enough to contain the guest 
physical address space.
- there is no way to protect the host userspace from the guest
- some annoying linker scripts need to be used to compile the host 
userspace to move it out of the guest userspace area, making it more 
difficult to write kvm userspace

I think there's a way to work around these issues on 64-bit npt 
hardware: allocate a pgd entry (at a non-zero offset) to hold guest 
physical memory, and copy this pgd entry into a guest-only pgd at offset 
zero.
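
The address arithmetic behind this idea can be sketched as follows. This is
a toy model, not kernel code: it assumes x86-64 4-level paging (512 pgd
entries, each covering 2^39 bytes), represents a pgd as a plain list, and
all function names are hypothetical.

```python
PGDIR_SHIFT = 39                 # x86-64, 4-level paging (assumption)
PTRS_PER_PGD = 512
PGD_SPAN = 1 << PGDIR_SHIFT      # bytes covered by one pgd entry (512 GiB)
USER_PGD_SLOTS = PTRS_PER_PGD // 2   # lower half is the userspace range


def pgd_index(addr):
    """Return the pgd slot that translates addr."""
    return (addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1)


def host_va_for_gpa(slot, gpa):
    """Host virtual address at which guest physical address gpa is mapped."""
    if not 0 < slot < USER_PGD_SLOTS:
        raise ValueError("slot must be a non-zero userspace pgd slot")
    if not 0 <= gpa < PGD_SPAN:
        raise ValueError("guest physical address exceeds one pgd entry")
    return (slot << PGDIR_SHIFT) | gpa


def make_guest_pgd(host_pgd, slot):
    """Build a guest-only pgd whose entry 0 aliases host_pgd[slot]."""
    guest_pgd = [None] * PTRS_PER_PGD
    guest_pgd[0] = host_pgd[slot]    # same lower-level tables, offset zero
    return guest_pgd
```

With slot 1, guest physical address 0x1000 appears in the host at virtual
address 0x8000001000, and both addresses walk the same lower-level tables
because the two pgd entries are identical. Since the guest-only pgd has
nothing else populated, the rest of host userspace is not reachable through
it, and the host does not need to be relocated out of the low addresses.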

Of course, there are many millions of non-npt/ept processors out there, 
and we can't leave them out in the cold, so we'll have to work something 
out for classical shadow page tables.

-- 
error compiling committee.c: too many arguments to function



