From mboxrd@z Thu Jan 1 00:00:00 1970
From: Carsten Otte
Subject: Re: [PATCH/PFC 0/2] s390 host support
Date: Sun, 29 Apr 2007 12:24:47 +0200
Message-ID: <4634726F.10705@de.ibm.com>
In-Reply-To: <463461B1.7060406-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
References: <1177681224.5770.20.camel@cotte.boeblingen.de.ibm.com> <4632E94C.20904@qumranet.com> <4633099D.3020709@de.ibm.com> <463461B1.7060406@qumranet.com>
Reply-To: carsteno-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org
To: Avi Kivity
Cc: carsteno-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org, "kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org", Christian Borntraeger, mschwid2-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org
List-Id: kvm.vger.kernel.org

Avi Kivity wrote:
> We'll want to keep a vcpu fd. If the vcpu is idle we'll be asleep in
> poll() or the like, and we need some kind of wakeup mechanism.

Our userspace does idle/wakeup differently: one cpu exits sys_s390host_sie, and the intercept code indicates a halt with interrupts enabled (cpu idle loop). Userland then parks our vcpu thread in pthread_cond_wait. Once we want to wake up this thread, either because of an interprocessor signal (need_resched and such) or because of an I/O interrupt, we do a pthread_cond_signal to wake the thread again. The thread then re-enters sys_s390host_sie, and after entering vcpu context it executes the interrupt handler first.

The advantage I see in waiting in userland is that userspace can deliver interrupts to idle CPUs without kernel intervention. On the other hand, my brain hurts when I think about userland passing vcpu fds to other threads/processes, and about sys_fork().
In the end, you make the decision and we'll follow your lead.

> I guess some of the difference stems from the fact that on x86, the
> Linux pagetables are actually the hardware pagetables. VT and SVM use a
> separate page table for the guest which cannot be shared with the host.
> This means that
>
> - we need to teach the Linux mm to look at shadow page tables when
>   transferring dirty bits
> - when Linux wants to write protect a page, it has to modify the shadow
>   page tables too (and flush the guest tlbs, which is again a bit different)
> - this means rmap has to be extended to include kvm
>
> I think that non-x86 have purely software page tables, maybe this makes
> things easier.

We do use hardware page tables too. Our hardware knows about multiple levels of page translation, and does its part of maintaining separate sets of dirty/reference bits for guest and host while running in virtual machine context. This process is transparent to both the virtual machine and the host.

For the x86 part, I will spend some time reading the kvm code a little more.

so long,
Carsten