From mboxrd@z Thu Jan 1 00:00:00 1970 From: Avi Kivity Subject: Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run Date: Mon, 18 May 2009 01:31:50 +0300 Message-ID: <4A109056.2010003@redhat.com> References: <1241534358-32172-1-git-send-email-ehrhardt@linux.vnet.ibm.com> <1241534358-32172-2-git-send-email-ehrhardt@linux.vnet.ibm.com> <4A017C2E.6060306@redhat.com> <4A08217E.6000102@linux.vnet.ibm.com> <4A0824EF.5010201@redhat.com> <4A082C4E.60501@linux.vnet.ibm.com> <4A082FF3.4060908@redhat.com> <4A083956.2000200@linux.vnet.ibm.com> <4A083DCC.8000805@redhat.com> <4A093E1B.9030008@linux.vnet.ibm.com> <4A095EEA.60902@redhat.com> <4A097AA3.8000007@linux.vnet.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: kvm , Christian Borntraeger , Carsten Otte To: Christian Ehrhardt Return-path: Received: from mx2.redhat.com ([66.187.237.31]:42193 "EHLO mx2.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754732AbZEQWbu (ORCPT ); Sun, 17 May 2009 18:31:50 -0400 In-Reply-To: <4A097AA3.8000007@linux.vnet.ibm.com> Sender: kvm-owner@vger.kernel.org List-ID: Christian Ehrhardt wrote: >>>>> The bad thing on vcpu->request in that case is that I don't want >>>>> the async behaviour of vcpu->requests in that case, I want the >>>>> memory slot updated in all vcpu's when the ioctl is returning. >>>> >>>> You mean, the hardware can access the vcpu control block even when >>>> the vcpu is not running? >>> No, hardware only uses it with a running vcpu, but I realised my own >>> fault while changing the code to vcpu->request style. >>> For s390 I need to update the KVM->arch and *all* >>> vcpu->arch->sie_block... data synchronously. >> >> Out of interest, can you explain why? > Sure I'll try to give an example. > > a) The whole guest has "one" memory slot representing all it's memory. > Therefore some important values like guest_origin and guest_memsize > (one slot so it's just addr+size) are kept at VM level in kvm->arch. It should really be kept in kvm->memslots[0]->{userspace_addr, npages}. This is common to all architectures. > b) We fortunately have cool hardware support for "nearly > everything"(tm) :-) In this case for example we set in > vcpu->arch.sie_block the values for origin and size translated into a > "limit" to get memory management virtualization support. x86 has something analogous; shadow or nested page tables are also per-vcpu and accessed by the hardware while the guest is running. > c) we have other code e.g. all our copy_from/to_guest stuff that uses > the kvm->arch values You want to drop these and use kvm_read_guest() / kvm_write_guest(). > If we would allow e.g. updates of a memslot (or as the patch supposes > to harden the set_memory_region code against inconsiderate code > changes in other sections) it might happen that we set the kvm->arch > information but the vcpu->arch->sie_block stuff not until next > reentry. Now concurrently the running vcpu could cause some kind of > fault that involves a copy_from/to_guest. That way we could end up > with potentially invalid handling of that fault (fault handling and > running guest would use different userspace adresses until it is > synced on next vcpu reentry) - it's theoretical I know, but it might > cause some issues that would be hard to find. I agree it should be protected. Here's how we do it in arch-independent code: - code that looks up memory slots takes slots_lock for read (future: rcu) - code that changes memory slots takes slots_lock for write, and requests an mmu reload (includes an IPI to force the vcpu out of guest mode) Now, once we begin changing a slot no one can touch memory or reenter the guest until we are done. > On the other hand for the long term I wanted to note that all our > copy_from/to_guest functions is per vcpu, so when we some day > implement updateable memslots, multiple memslots or even just fill > "free time"(tm) and streamline our code we could redesign that > origin/size storage. This could be done multiple ways, either just > store it per vcpu or with a lock for the kvm->arch level variables - > both ways and maybe more could then use the vcpu->request based > approach, but unfortunately it's neither part of that patch nor of the > current effort to do that. I think we should keep that in generic code. All of that applies to x86 (and ia64 and ppc), if I understand you correctly, and if I understand the other archs correctly (don't place a large bet). -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.