From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yoshiaki Tamura
Subject: Re: Question on skip_emulated_instructions()
Date: Thu, 08 Apr 2010 14:27:53 +0900
Message-ID: <4BBD6959.6080003@lab.ntt.co.jp>
References: <4BBAB46B.9010405@lab.ntt.co.jp> <20100406100522.GW5235@redhat.com> <20100407154324.GF303@redhat.com> <4BBCC2C9.1040301@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Gleb Natapov , kvm@vger.kernel.org, Marcelo Tosatti
To: Avi Kivity
Return-path:
Received: from tama50.ecl.ntt.co.jp ([129.60.39.147]:63396 "EHLO tama50.ecl.ntt.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751506Ab0DHF2J (ORCPT ); Thu, 8 Apr 2010 01:28:09 -0400
In-Reply-To: <4BBCC2C9.1040301@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

Avi Kivity wrote:
> On 04/07/2010 08:21 PM, Yoshiaki Tamura wrote:
>>
>> The problem here is that, I needed to transfer the VM state which is
>> just *before* the output to the devices. Otherwise, the VM state has
>> already been proceeded, and after failover, some I/O didn't work as I
>> expected.
>> I tracked down this issue, and figured out rip was already proceeded
>> in KVM,
>> and transferring this VCPU state was meaningless.
>>
>> I'm planning to post the patch set of Kemari soon, but I would like to
>> solve
>> this rip issue before that. If there is no drawback, I'm happy to work
>> and post a patch.
>
> vcpu state is undefined when an mmio operation is pending,
> Documentation/kvm/api.txt says the following:
>
>> NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO and KVM_EXIT_OSI, the corresponding
>> operations are complete (and guest state is consistent) only after
>> userspace
>> has re-entered the kernel with KVM_RUN. The kernel side will first finish
>> incomplete operations and then check for pending signals. Userspace
>> can re-enter the guest with an unmasked signal pending to complete
>> pending operations.

Thanks for the information.
So the point is that the vcpu state that can be observed from qemu upon
KVM_EXIT_IO, KVM_EXIT_MMIO and KVM_EXIT_OSI should not be used because it's
not complete/consistent?

> Currently we complete instructions for output operations and leave them
> incomplete for input operations. Deferring completion for output
> operations should work, except it may break the vmware backdoor port
> (see hw/vmport.c), which changes register state following an output
> instruction, and KVM_EXIT_TPR_ACCESS, where userspace reads the state
> following a write instruction.
>
> Do you really need to transfer the vcpu state before the instruction, or
> do you just need a consistent state? If the latter, then you can get
> away by posting a signal and re-entering the guest. kvm will complete
> the instruction and exit immediately, and you will have fully consistent
> state.

The requirement is that the guest must always be able to replay at least the
instruction which triggered the synchronization on the primary. From that
point of view, I think I need to transfer the vcpu state before the
instruction. If I post a signal and let the guest or emulator proceed, I'm
not sure whether the guest on the secondary can be replayed as expected.
Please point out if I am misunderstanding.
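
To make the replay concern concrete, here is a toy model in C. It is not KVM
or QEMU code: every type and function name in it is made up for illustration,
and it only assumes what was said above, namely that rip has already been
advanced past an output instruction by the time userspace sees the exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these types or functions exist in KVM or QEMU. */

#define TOY_OUT_LEN 1u  /* "out %al, %dx" encodes as the single byte 0xEE */

struct toy_vcpu {
    uint64_t rip;  /* guest instruction pointer */
    uint8_t al;    /* value the guest is writing to the port */
};

struct toy_device {
    int writes_seen;     /* port writes that actually reached the device */
    uint8_t last_value;  /* most recent value written */
};

/* Kernel side of an output exit: rip is advanced before userspace runs. */
static void toy_exit_for_out(struct toy_vcpu *v)
{
    assert(v != NULL);
    v->rip += TOY_OUT_LEN;
}

/* Userspace side: hand the value to the emulated device. */
static void toy_device_write(struct toy_device *d, uint8_t value)
{
    assert(d != NULL);
    d->writes_seen++;
    d->last_value = value;
}

/*
 * Resume a snapshot on the secondary. The OUT is replayed only if the
 * snapshot's rip still points at it; returns true when a replay happened.
 */
static bool toy_failover_replays_out(struct toy_vcpu snapshot, uint64_t out_rip,
                                     struct toy_device *secondary_dev)
{
    if (snapshot.rip != out_rip)
        return false;  /* snapshot is already past the OUT: write is lost */
    toy_exit_for_out(&snapshot);
    toy_device_write(secondary_dev, snapshot.al);
    return true;
}
```

In this model, a snapshot copied before toy_exit_for_out() still has rip at
the OUT, so the secondary replays the write. A snapshot copied afterwards has
rip one byte further on, and the secondary's device never sees the write if
the primary failed before delivering it.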