From mboxrd@z Thu Jan 1 00:00:00 1970
From: Carsten Otte
Subject: Re: [PATCH/PFC 0/2] s390 host support
Date: Sun, 29 Apr 2007 12:24:47 +0200
Message-ID: <4634726F.10705@de.ibm.com>
In-Reply-To: <463461B1.7060406-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
References: <1177681224.5770.20.camel@cotte.boeblingen.de.ibm.com> <4632E94C.20904@qumranet.com> <4633099D.3020709@de.ibm.com> <463461B1.7060406@qumranet.com>
Reply-To: carsteno-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org
To: Avi Kivity
Cc: carsteno-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org, "kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org", Christian Borntraeger, mschwid2-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org
List-Id: kvm.vger.kernel.org

Avi Kivity wrote:
> We'll want to keep a vcpu fd. If the vcpu is idle we'll be asleep in
> poll() or the like, and we need some kind of wakeup mechanism.

Our userspace does idle/wakeup differently: one cpu exits sys_s390host_sie, and the intercept code indicates a halt with interrupts enabled (cpu idle loop). Userland then parks our vcpu thread in pthread_cond_wait. Once we want to wake up this thread, either because of an interprocessor signal (need_resched and such) or because of an I/O interrupt, we do a pthread_cond_signal to wake the thread again. The thread then re-enters sys_s390host_sie, and after entering vcpu context it executes the interrupt handler first.

The advantage I see in waiting in userland is that userspace can deliver interrupts to idle CPUs without kernel intervention. On the other hand, my brain hurts when I think about userland passing vcpu fds to other threads/processes, and about sys_fork().
In the end, you make the decision and we'll follow your lead.

> I guess some of the difference stems from the fact that on x86, the
> Linux pagetables are actually the hardware pagetables. VT and SVM use a
> separate page table for the guest which cannot be shared with the host.
> This means that
>
> - we need to teach the Linux mm to look at shadow page tables when
>   transferring dirty bits
> - when Linux wants to write protect a page, it has to modify the shadow
>   page tables too (and flush the guest tlbs, which is again a bit different)
> - this means rmap has to be extended to include kvm
>
> I think that non-x86 have purely software page tables, maybe this makes
> things easier.

We do use hardware page tables too. Our hardware knows about multiple levels of page translation, and does its part of maintaining separate sets of dirty/reference bits for guest and host while running in virtual machine context. This process is transparent to both the virtual machine and the host.

For the x86 part, I will spend some time reading the kvm code a little more.

so long,
Carsten