From: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
To: carsteno-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org
Cc: "kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org"
<kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org>,
Christian Borntraeger
<cborntra-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>,
mschwid2-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org
Subject: Re: [PATCH/PFC 0/2] s390 host support
Date: Sun, 29 Apr 2007 13:48:14 +0300
Message-ID: <463477EE.3000406@qumranet.com>
In-Reply-To: <4634726F.10705-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
Carsten Otte wrote:
>
> Avi Kivity wrote:
>> We'll want to keep a vcpu fd. If the vcpu is idle we'll be asleep in
>> poll() or the like, and we need some kind of wakeup mechanism.
> Our userspace does idle/wakeup differently:
> One cpu exits sys_s390host_sie, and the intercept code indicates a
> halt with interrupts enabled (cpu idle loop). Now userland parks our
> vcpu thread in pthread_cond_wait. Once we want to wake up this thread,
> either by interprocessor signal (need_resched and such) or due to an
> IO interrupt, we do a pthread_cond_signal to wake the thread up again.
> The thread will now enter sys_s390host_sie, and after entering the
> vcpu context will execute the interrupt handler first.
> The advantage I see in waiting in userland is that userspace can dump
> interrupts to idle CPUs without kernel intervention. On the other
> hand, my brain hurts when thinking about userland passing vcpu fds to
> other threads/processes and when thinking about sys_fork().

In both cases you wait in the kernel: with an fd you sleep in poll() or
the like, and with pthread_cond_wait you sleep in futex(FUTEX_WAIT) or a
close relative.

Can one do the equivalent of a futex wakeup from the kernel easily?
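
To make the comparison concrete, here is a minimal sketch of the userspace
parking scheme described in the quoted text, assuming one mutex/condition
pair per vcpu. The struct and function names (vcpu_park, vcpu_park_wait,
vcpu_park_kick) are hypothetical and do not come from the s390 patches; the
pending flag is an assumption added to guard against spurious wakeups.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-vcpu parking state; not taken from the patch set. */
struct vcpu_park {
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    bool            irq_pending;  /* set by whoever wants the vcpu to run */
};

static void vcpu_park_init(struct vcpu_park *p)
{
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->wake, NULL);
    p->irq_pending = false;
}

/* vcpu thread: called after the intercept reports an idle wait. */
static void vcpu_park_wait(struct vcpu_park *p)
{
    pthread_mutex_lock(&p->lock);
    while (!p->irq_pending)           /* loop: wakeups may be spurious */
        pthread_cond_wait(&p->wake, &p->lock);
    p->irq_pending = false;           /* consume; caller re-enters the guest */
    pthread_mutex_unlock(&p->lock);
}

/* Any other thread: deliver an IPI or I/O interrupt to a parked vcpu. */
static void vcpu_park_kick(struct vcpu_park *p)
{
    pthread_mutex_lock(&p->lock);
    p->irq_pending = true;
    pthread_cond_signal(&p->wake);
    pthread_mutex_unlock(&p->lock);
}
```

The waiter ends up blocked in the kernel either way; the only party that can
call vcpu_park_kick() here is another userspace thread, which is why the
question of a kernel-side wakeup matters.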
> In the end, you make the decision and we'll follow your lead.
>
My primary concern is not to lock userspace into one way of working.
This is really another sad side effect of the kernel providing a
bazillion sleep/wakeup methods.
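
For comparison with the condition-variable scheme quoted earlier, an
fd-based wait could look roughly like the sketch below. It uses a plain
pipe as a stand-in for a vcpu fd, which is purely an assumption for
illustration; the names vcpu_wake, vcpu_wake_kick and vcpu_wake_wait are
hypothetical.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical stand-in for a vcpu fd: the read end of a pipe becomes
 * readable when someone wants the vcpu to run. */
struct vcpu_wake {
    int rfd;
    int wfd;
};

static int vcpu_wake_init(struct vcpu_wake *w)
{
    int fds[2];

    if (pipe(fds) < 0)
        return -1;
    w->rfd = fds[0];
    w->wfd = fds[1];
    return 0;
}

/* Wake the idle vcpu thread; returns 0 on success, -1 on error. */
static int vcpu_wake_kick(struct vcpu_wake *w)
{
    char c = 1;

    return write(w->wfd, &c, 1) == 1 ? 0 : -1;
}

/* Sleep until kicked or until timeout_ms expires (-1 waits forever).
 * Returns 1 if kicked, 0 on timeout, -1 on error. */
static int vcpu_wake_wait(struct vcpu_wake *w, int timeout_ms)
{
    struct pollfd pfd = { .fd = w->rfd, .events = POLLIN };
    char c;
    int n = poll(&pfd, 1, timeout_ms);

    if (n <= 0)
        return n;
    return read(w->rfd, &c, 1) == 1 ? 1 : -1;  /* consume one kick */
}
```

The point of the fd variant is that the same poll() call can also watch
other descriptors, so one thread can wait for a vcpu wakeup and for device
I/O at once.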
>> I guess some of the difference stems from the fact that on x86, the
>> Linux pagetables are actually the hardware pagetables. VT and SVM
>> use a separate page table for the guest which cannot be shared with
>> the host. This means that
>>
>> - we need to teach the Linux mm to look at shadow page tables when
>> transferring dirty bits
>> - when Linux wants to write protect a page, it has to modify the
>> shadow page tables too (and flush the guest tlbs, which is again a
>> bit different)
>> - this means rmap has to be extended to include kvm
>>
>> I think that non-x86 architectures have purely software page tables;
>> maybe this makes things easier.
> We do use hardware page tables too. Our hardware does know about
> multiple levels of page translation, and does its part of maintaining
> different sets of dirty/reference bits for guest and host while
> running in the virtual machine context. This process is transparent
> for both virtual machine and host.

Nested page tables/extended page tables also provide this facility, with
some caveats:

- on 32-bit hosts (or 64-bit hosts with 32-bit userspace), host
  userspace virtual address space is not enough to contain the guest
  physical address space.
- there is no way to protect the host userspace from the guest
- some annoying linker scripts need to be used to compile the host
  userspace to move it out of the guest userspace area, making it more
  difficult to write kvm userspace

I think there's a way to work around these issues on 64-bit npt
hardware: allocate a pgd entry (at a non-zero offset) to hold guest
physical memory, and copy this pgd entry into a guest-only pgd at offset
zero.
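
The address arithmetic behind this idea can be sketched as follows,
assuming x86-64 with 4-level paging, where each of the 512 pgd entries
covers 2^39 bytes (512 GiB). The helper names and the restriction to
slots 1..255 (the lower, userspace half of the address space) are
assumptions for illustration only, not an existing kvm interface.

```c
#include <stdint.h>

#define PGDIR_SHIFT 39                             /* x86-64, 4-level paging */
#define PGDIR_SIZE  (UINT64_C(1) << PGDIR_SHIFT)   /* 512 GiB per pgd entry */
#define PTRS_PER_PGD 512

/* Index of the pgd entry that translates virtual address va. */
static unsigned pgd_index_of(uint64_t va)
{
    return (unsigned)((va >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

/*
 * Host virtual address that aliases guest physical address gpa when the
 * host keeps guest memory under pgd slot `slot` and the same entry is
 * copied to slot 0 of a guest-only pgd.
 * Returns 0 if the slot or gpa is out of range for this scheme.
 */
static uint64_t host_va_for_gpa(unsigned slot, uint64_t gpa)
{
    if (slot == 0 || slot >= PTRS_PER_PGD / 2 || gpa >= PGDIR_SIZE)
        return 0;
    return ((uint64_t)slot << PGDIR_SHIFT) + gpa;
}
```

With a single pgd entry, guest physical memory is bounded by 512 GiB, and
the guest-only pgd exposes nothing else from the host address space.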
Of course, there are many millions of non-npt/ept processors out there,
and we can't leave them out in the cold, so we'll have to work something
out for classical shadow page tables.
--
error compiling committee.c: too many arguments to function