From: Christian Ehrhardt
Subject: Re: [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state
Date: Tue, 26 May 2009 10:02:59 +0200
Message-ID: <4A1BA233.6080504@linux.vnet.ibm.com>
References: <1243251652-27617-1-git-send-email-ehrhardt@linux.vnet.ibm.com> <1243251652-27617-2-git-send-email-ehrhardt@linux.vnet.ibm.com> <20090525202248.GA7608@amt.cnet>
In-Reply-To: <20090525202248.GA7608@amt.cnet>
To: Marcelo Tosatti
Cc: kvm@vger.kernel.org, avi@redhat.com, borntraeger@de.ibm.com, cotte@de.ibm.com, heiko.carstens@de.ibm.com, schwidefsky@de.ibm.com

Marcelo Tosatti wrote:
> On Mon, May 25, 2009 at 01:40:49PM +0200, ehrhardt@linux.vnet.ibm.com wrote:
>
>> From: Christian Ehrhardt
>>
>> To ensure vcpu's come out of guest context in certain cases this patch adds a
>> s390 specific way to kick them out of guest context.
>> Currently it kicks them
>> out to rerun the vcpu_run path in the s390 code, but the mechanism itself is
>> expandable and with a new flag we could also add e.g. kicks to userspace etc.
>>
>> Signed-off-by: Christian Ehrhardt
>>
>
> "For now I added the optimization to skip kicking vcpus out of guest
> that had the request bit already set to the s390 specific loop (sent as
> v2 in a few minutes).
>
> We might one day consider standardizing some generic kickout levels e.g.
> kick to "inner loop", "arch vcpu run", "generic vcpu run", "userspace",
> ... whatever levels fit *all* our use cases. And then let that kicks be
> implemented in an kvm_arch_* backend as it might be very different how
> they behave on different architectures."
>
> That would be ideal, yes. Two things make_all_requests handles:
>
> 1) It disables preemption with get_cpu(), so it can reliably check for
> cpu id. Somehow you don't need that for s390 when kicking multiple
> vcpus?
>

I don't even need the cpuid as make_all_requests does. I just set a
special bit in the vcpu arch part and the vcpu will "come out to me (host)".
Fortunately the kick is rare and fast, so I can just set it
unconditionally (it's even ok to set it if the vcpu is not in guest
state). That saves us from needing the vcpu lock or detailed checks, which
would end up where we started (no guarantee that vcpus come out of
guest context while we try to acquire all vcpu locks).

> 2) It uses smp_call_function_many(wait=1), which guarantees that by the
> time make_all_requests returns no vcpus will be using stale data (the
> remote vcpus will have executed ack_flush).
>

Yes, this is really a part my s390 implementation doesn't fulfill yet.
Currently on return vcpus might still use the old memslot information.
As mentioned before, letting all interrupts come "too far" out of the hot
loop would be a performance issue, therefore I think I will need some
request&confirm mechanism. I'm not sure yet, but maybe it could be as
easy as this pseudo code example:

    # in make_all_requests
    # remember we have slots_lock write here and the reentry that updates
    # the vcpu specific data acquires slots_lock for read.
    loop vcpus
        set_bit in vcpu requests
        kick vcpu    # arch function
    endloop

    loop vcpus
        until the requested bit has disappeared
        # as the reentry path uses test_and_clear it will disappear
    endloop

That would be an implicit synchronization and should work; as I wrote
before, setting memslots while the guest is running is rare, if it ever
happens, for s390. On x86 smp_call_function_many could then work without
the wait flag being set.
But I assume that this synchronization approach is slower, as it
serializes all vcpus on reentry (they wait for the slots_lock to get
dropped), therefore I wanted to ask how often setting memslots at
runtime will occur on x86? Would this approach be acceptable?

If it is too adventurous for now I can implement it that way in the s390
code and we split off the long term discussion (synchronization + generic
kickout levels + who knows what comes up).

> If smp_call_function_many is hidden behind kvm_arch_kick_vcpus, can you
> make use of make_all_requests for S390 (without the smp_call_function
> performance impact you mentioned) ?
>

In combination with the request&confirm mechanism described above it
should work if smp_call_function and all the cpuid gathering which
belongs to it is hidden behind kvm_arch_kick_vcpus.

> For x86 we can further optimize make_all_requests by checking REQ_KICK,
> and kvm_arch_kick_vcpus would be a good place for that.
>
> And the kickout levels idea you mentioned can come later, as an
> optimization?

Yes, I agree that splitting that off as a later optimization is a good idea.
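
To make the request&confirm pseudo code above a bit more concrete, here is a minimal user-space sketch in C. It is only a model under stated assumptions, not kernel code and not the actual patch: struct vcpu_model, REQ_RELOAD_SLOTS, request_all(), vcpu_reenter() and requests_pending() are hypothetical names, the "kick" is reduced to a flag, and slots_lock is not modelled at all. It uses C11 atomics in place of the kernel's set_bit/test_and_clear_bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical; this models the idea in user space. */
#define REQ_RELOAD_SLOTS 0      /* request bit number, assumed for illustration */

struct vcpu_model {
    atomic_ulong requests;      /* pending request bits */
    atomic_bool kicked;         /* stands in for the arch specific kick */
    unsigned long slots_gen;    /* memslot generation this vcpu last loaded */
};

/* Phase 1 (request): set the bit on every vcpu, then kick it unconditionally. */
static void request_all(struct vcpu_model *v, size_t n, int req)
{
    assert(v != NULL);
    for (size_t i = 0; i < n; i++) {
        atomic_fetch_or(&v[i].requests, 1UL << req);
        atomic_store(&v[i].kicked, true);
    }
}

/*
 * Reentry path of a single vcpu: test-and-clear the request bit and, if it
 * was set, pick up the current memslot generation. Returns true when a
 * reload happened.
 */
static bool vcpu_reenter(struct vcpu_model *v, int req, unsigned long cur_gen)
{
    unsigned long mask = 1UL << req;
    unsigned long old = atomic_fetch_and(&v->requests, ~mask);

    atomic_store(&v->kicked, false);
    if (!(old & mask))
        return false;           /* nothing was requested */
    v->slots_gen = cur_gen;     /* the real code would do this under slots_lock (read) */
    return true;
}

/* Phase 2 (confirm): count vcpus whose request bit has not disappeared yet. */
static size_t requests_pending(struct vcpu_model *v, size_t n, int req)
{
    size_t pending = 0;

    for (size_t i = 0; i < n; i++)
        if (atomic_load(&v[i].requests) & (1UL << req))
            pending++;
    return pending;
}
```

In this model the requester would call request_all() and then poll requests_pending() until it returns 0. Each vcpu confirms simply by running through vcpu_reenter(), so no cpu id bookkeeping is needed on the requesting side.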

-- 
Grüsse / regards,
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization