From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:52767) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XBo3T-0000EL-IY for qemu-devel@nongnu.org; Mon, 28 Jul 2014 12:45:49 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XBo3N-0000zu-Fs for qemu-devel@nongnu.org; Mon, 28 Jul 2014 12:45:43 -0400
Received: from cantor2.suse.de ([195.135.220.15]:38171 helo=mx2.suse.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XBo3M-0000zk-RO for qemu-devel@nongnu.org; Mon, 28 Jul 2014 12:45:37 -0400
Message-ID: <53D67E2F.30100@suse.de>
Date: Mon, 28 Jul 2014 18:45:35 +0200
From: Alexander Graf
MIME-Version: 1.0
References: <1404997839-29038-1-git-send-email-borntraeger@de.ibm.com>
 <1404997839-29038-5-git-send-email-borntraeger@de.ibm.com>
 <53D654D2.40308@suse.de>
 <20140728161644.00c09b3f@thinkpad-w530>
 <2B39547D-B9A3-4509-808C-B0808067ED54@suse.de>
 <20140728170318.1eb8ed64@thinkpad-w530>
In-Reply-To: <20140728170318.1eb8ed64@thinkpad-w530>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH/RFC 4/5] s390x/kvm: test whether a cpu is STOPPED when checking "has_work"
To: David Hildenbrand
Cc: linux-s390, KVM, qemu-devel, Christian Borntraeger, Jens Freimann, Cornelia Huck, Paolo Bonzini

On 28.07.14 17:03, David Hildenbrand wrote:
>> On 28.07.2014, at 16:16, David Hildenbrand wrote:
>>
>>>> On 10.07.14 15:10, Christian Borntraeger wrote:
>>>>> From: David Hildenbrand
>>>>>
>>>>> If a cpu is stopped, it must never be allowed to run and no interrupt may wake it
>>>>> up. A cpu also has to be unhalted if it is halted and has work to do - this
>>>>> scenario wasn't hit in kvm case yet, as only "disabled wait" is processed within
>>>>> QEMU.
>>>>>
>>>>> Signed-off-by: David Hildenbrand
>>>>> Reviewed-by: Cornelia Huck
>>>>> Reviewed-by: Christian Borntraeger
>>>>> Signed-off-by: Christian Borntraeger
>>>> This looks like it's something that generic infrastructure should take
>>>> care of, no? How does this work for the other archs? They always get an
>>>> interrupt on the transition between !has_work -> has_work. Why don't we
>>>> get one for s390x?
>>>>
>>>>
>>>> Alex
>>>>
>>>>
>>> Well, we have the special case on s390 as a CPU that is in the STOPPED or the
>>> CHECK STOP state may never run - even if there is an interrupt. It's
>>> basically like this CPU has been switched off.
>>>
>>> Imagine that it is tried to inject an interrupt into a stopped vcpu. It
>>> will kick the stopped vcpu and thus lead to a call to
>>> "kvm_arch_process_async_events()". We have to deny that this vcpu will ever
>>> run as long as it is stopped. It's like a way to "suppress" the
>>> interrupt for such a transition you mentioned.
>> An interrupt kick usually just means we go back into the main loop. From there we check the interrupt bitmap which interrupt to handle. Check out the handling code here:
>>
>> http://git.qemu.org/?p=qemu.git;a=blob;f=cpu-exec.c;h=38e5f02a307523d99134f4e2e6c51683bb10b45b;hb=HEAD#l580
>>
>> If you just check for the stopped state in here, do_interrupt() will never get called and thus the CPU shouldn't ever get executed. Unless I'm heavily mistaken :).
> So you would rather move the check out of has_work() into the main loop in
> cpu-exec.c and directly into kvm_arch_process_async_events()?
>
> This would on the other hand lead to an unhalt of the vcpu in cpu_exec() on any
> CPU_INTERRUPT_HARD. A VCPU might thus be unhalted although it is not able to run. Is okay?

Not really I think. We could create a new interrupt_request bit called CPU_INTERRUPT_STOPPED that doesn't get unset automatically and simply sets cpu->halted = 1 (similar to CPU_INTERRUPT_HALT).
>
> Looking at cpu.c:cpu_thread_is_idle(), we would maybe return false, although we
> are idle (because we are idle when we are stopped)?
>
> My qemu kvm knowledge is way better than the qemu emulation knowledge, so I
> appreciate any insights :)
>
>>> Later, another vcpu might decide to turn that vcpu back on (by e.g. sending a
>>> SIGP START to that vcpu).
>> Yes, in that case that other CPU generates a signal (a different bit in interrupt_request) and the first CPU would see that it has to wake up and wake up.
>>
>>> I am not sure if such a mechanism/scenario is applicable to any other arch. They
>>> all seem to reset the cs->halted flag if they know they are able to run (e.g.
>>> due to an interrupt) - they have no such thing as "stopped cpus", only
>>> "halted/waiting cpus".
>> There's not really much difference between the two. The only difference from a software point of view is that a "stopped" CPU has its external interrupt bits masked off, no?
> Well the difference is, that a STOPPED vcpu can be woken up by non-interrupt
> like things (SIGP START) AND a special interrupt (SIGP RESTART - which is like
> a "SIPI"++ as it performs a psw exchange - "NMI"). So we basically have two
> paths that can lead to a state change. All interrupt bits may be in any
> combination (SIGP RESTART interrupts can't be masked out, nor can SIGP START be
> denied).

That's perfectly normal behavior. Just make it two different interrupt types:

     if (interrupt_request & CPU_INTERRUPT_STOPPED) {
         /* Go back to halted state */
         ...
     } else if (interrupt_request & CPU_INTERRUPT_SIGP) {
         env->interrupt_request &= ~CPU_INTERRUPT_STOPPED;
         /* swap PSW */
         ...
     } else if ((interrupt_request & CPU_INTERRUPT_HARD) &&
                (env->psw.mask & PSW_MASK_EXT)) {
         ...
     }

>
> The other thing may be that on s390, each vcpu (including itself) can put
> another vcpu into the STOPPED state - I assume that this is different for x86
> "INIT_RECEIVED". For this reason we have to watch out for bad race conditions
> (e.g. multiple vcpus working on another vcpu)...

TCG is single-threaded :). And if you stick to the interrupt logic above all should be good.


Alex
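
For illustration only, the interrupt logic sketched in this mail can be modelled as a small standalone C program outside of QEMU. Everything here is an assumption made for the sketch: the ToyCPU struct, the Action enum, the toy_handle_interrupts() helper and the bit values are hypothetical, and CPU_INTERRUPT_STOPPED / CPU_INTERRUPT_SIGP are only the names proposed in this thread, not existing QEMU definitions. One deliberate difference from the order written in the mail: the SIGP bit is tested first, because in a plain if/else chain a sticky STOPPED bit would otherwise shadow the wake-up request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values, chosen for this sketch only. */
#define CPU_INTERRUPT_HARD    0x0002u
#define CPU_INTERRUPT_STOPPED 0x0400u /* sticky: never cleared automatically */
#define CPU_INTERRUPT_SIGP    0x0800u /* SIGP START / RESTART style wake-up */
#define PSW_MASK_EXT          0x0100000000000000ULL

/* Toy stand-in for the CPU state; not a QEMU structure. */
typedef struct {
    uint32_t interrupt_request;
    uint64_t psw_mask;
    bool halted;
} ToyCPU;

typedef enum { ACT_NONE, ACT_STAY_STOPPED, ACT_WAKE, ACT_EXT_IRQ } Action;

/* Decide what to do for the pending request bits, highest priority first. */
Action toy_handle_interrupts(ToyCPU *cpu)
{
    assert(cpu != NULL);
    uint32_t req = cpu->interrupt_request;

    if (req & CPU_INTERRUPT_SIGP) {
        /* Wake-up request: drop it together with the sticky stopped bit. */
        cpu->interrupt_request &= ~(CPU_INTERRUPT_SIGP | CPU_INTERRUPT_STOPPED);
        cpu->halted = false;
        return ACT_WAKE;
    }
    if (req & CPU_INTERRUPT_STOPPED) {
        /* Still stopped: go back to halted, leave all bits pending. */
        cpu->halted = true;
        return ACT_STAY_STOPPED;
    }
    if ((req & CPU_INTERRUPT_HARD) && (cpu->psw_mask & PSW_MASK_EXT)) {
        /* External interrupt, deliverable only when enabled in the PSW. */
        cpu->interrupt_request &= ~CPU_INTERRUPT_HARD;
        cpu->halted = false;
        return ACT_EXT_IRQ;
    }
    return ACT_NONE;
}
```

With this ordering, a stopped CPU that also has CPU_INTERRUPT_HARD pending keeps returning to the halted state, and the external interrupt is only delivered after a SIGP-style request has cleared the stopped bit.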