From: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
To: carsteno-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org
Cc: "kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org"
	<kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org>,
	Christian Borntraeger
	<cborntra-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>,
	mschwid2-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org
Subject: Re: [PATCH/PFC 0/2] s390 host support
Date: Sun, 29 Apr 2007 13:48:14 +0300
Message-ID: <463477EE.3000406@qumranet.com>
In-Reply-To: <4634726F.10705-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>

Carsten Otte wrote:
>
> Avi Kivity wrote:
>> We'll want to keep a vcpu fd.  If the vcpu is idle we'll be asleep in 
>> poll() or the like, and we need some kind of wakeup mechanism.
> Our userspace does idle/wakeup differently:
> One cpu exits sys_s390host_sie, and the intercept code indicates a 
> halt with interrupts enabled (cpu idle loop). Now userland parks our 
> vcpu thread in pthread_cond_wait. Once we want to wake up this thread,
> either by interprocessor signal (need_resched and such) or due to an
> IO interrupt, we do a pthread_cond_signal to wake up the thread again.
> The thread will now enter sys_s390host_sie, and after entering the 
> vcpu context will execute the interrupt handler first.
> The advantage I see in waiting in userland is that userspace can dump
> interrupts to idle CPUs without kernel intervention. On the other
> hand, my brain hurts when thinking about userland passing vcpu fds to 
> other threads/processes and when thinking about sys_fork().

In both cases you wait in the kernel; with an fd you wait in the kernel 
and with pthread_cond_wait you wait in futex(FUTEX_WAIT) or a close 
relative.

Can one do the equivalent of a futex wakeup from the kernel easily?
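
To make the fd-based wait concrete, here is a minimal sketch. It assumes a
plain pipe as the wakeup channel, which is only a stand-in and not a kvm
interface; idle_until_woken() and wake_demo() are made-up names. A child
process plays the idle vcpu here just to keep the sketch free of a
threading library; a vcpu thread would block in poll() the same way.

```c
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical idle loop: block in poll() until the wakeup fd becomes
 * readable. Returns 0 if a wakeup byte was consumed, 1 otherwise.
 */
static int idle_until_woken(int wake_rfd)
{
    struct pollfd pfd = { .fd = wake_rfd, .events = POLLIN };
    char token;
    int rc;

    do {
        rc = poll(&pfd, 1, -1);      /* the sleep itself happens in the kernel */
    } while (rc < 0 && errno == EINTR);

    if (rc != 1 || !(pfd.revents & POLLIN))
        return 1;
    return read(wake_rfd, &token, 1) == 1 ? 0 : 1;
}

/* Returns 0 when the sleeper was woken by the write, non-zero otherwise. */
int wake_demo(void)
{
    int fds[2], status;
    pid_t pid;

    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                  /* child: the "idle vcpu" */
        close(fds[1]);
        _exit(idle_until_woken(fds[0]));
    }

    close(fds[0]);
    if (write(fds[1], "I", 1) != 1)  /* the "interrupt": one byte wakes the sleeper */
        return -1;
    close(fds[1]);

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

The waker needs nothing beyond a write() on the other end, so the same
channel can sit in a poll() set next to any other fds the process watches.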

> In the end, you make the decision and we'll follow your lead.
>

My primary concern is not to lock userspace into one way of working.  
This is really another sad side effect of the kernel providing a 
bazillion sleep/wakeup methods.

>> I guess some of the difference stems from the fact that on x86, the 
>> Linux pagetables are actually the hardware pagetables.  VT and SVM 
>> use a separate page table for the guest which cannot be shared with 
>> the host. This means that
>>
>> - we need to teach the Linux mm to look at shadow page tables when 
>> transferring dirty bits
>> - when Linux wants to write protect a page, it has to modify the 
>> shadow page tables too (and flush the guest tlbs, which is again a 
>> bit different)
>> - this means rmap has to be extended to include kvm
>>
>> I think that non-x86 have purely software page tables, maybe this
>> makes things easier.
> We do use hardware page tables too. Our hardware does know about
> multiple levels of page translation, and does its part of maintaining
> different sets of dirty/reference bits for guest and host while 
> running in the virtual machine context. This process is transparent 
> for both virtual machine and host.

Nested page tables/extended page tables also provide this facility, with 
some caveats:

- on 32-bit hosts (or 64-bit hosts with 32-bit userspace), host 
userspace virtual address space is not enough to contain the guest 
physical address space.
- there is no way to protect the host userspace from the guest
- some annoying linker scripts need to be used to compile the host 
userspace to move it out of the guest userspace area, making it more 
difficult to write kvm userspace

I think there's a way to work around these issues on 64-bit npt 
hardware: allocate a pgd entry (at a non-zero offset) to hold guest 
physical memory, and copy this pgd entry into a guest-only pgd at offset 
zero.
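
The pgd trick can be modelled in userspace roughly as follows. This is a
toy sketch, assuming x86-64 4-level paging (512 pgd entries, each covering
512 GiB); pgd_entry_t, build_guest_pgd() and host_va_for_gpa() are
hypothetical names for illustration, not kernel interfaces, and the
reserved slot number is arbitrary.

```c
#include <stdint.h>
#include <string.h>

#define PTRS_PER_PGD 512   /* entries in a top-level table (assumed 4-level paging) */
#define PGDIR_SHIFT  39    /* each pgd entry covers 2^39 bytes = 512 GiB */

typedef uint64_t pgd_entry_t;   /* toy stand-in for a real pgd entry */

/* Index of the pgd entry that translates a given virtual address. */
unsigned pgd_index(uint64_t vaddr)
{
    return (unsigned)((vaddr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

/*
 * Build a guest-only pgd: everything is unmapped except entry zero,
 * which is a copy of the host entry that holds guest physical memory.
 */
void build_guest_pgd(const pgd_entry_t *host_pgd, unsigned guest_slot,
                     pgd_entry_t *guest_pgd)
{
    memset(guest_pgd, 0, PTRS_PER_PGD * sizeof(*guest_pgd));
    guest_pgd[0] = host_pgd[guest_slot];
}

/* Host virtual address at which a guest physical address is visible. */
uint64_t host_va_for_gpa(unsigned guest_slot, uint64_t gpa)
{
    return ((uint64_t)guest_slot << PGDIR_SHIFT) + gpa;
}
```

With slot 5, for example, guest physical address 0x1000 shows up in host
userspace at (5 << 39) + 0x1000, while the guest reaches the same lower
level tables through entry zero of its own pgd. In this model one reserved
entry caps guest physical memory at 512 GiB.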

Of course, there are many millions of non-npt/ept processors out there, 
and we can't leave them out in the cold, so we'll have to work something 
out for classical shadow page tables.

-- 
error compiling committee.c: too many arguments to function


