From: Gleb Natapov
Subject: Re: Question on skip_emulated_instructions()
Date: Wed, 7 Apr 2010 18:43:25 +0300
Message-ID: <20100407154324.GF303@redhat.com>
References: <4BBAB46B.9010405@lab.ntt.co.jp> <20100406100522.GW5235@redhat.com>
To: Yoshiaki Tamura
Cc: kvm@vger.kernel.org, Avi Kivity, Marcelo Tosatti

On Wed, Apr 07, 2010 at 03:25:10PM +0900, Yoshiaki Tamura wrote:
> 2010/4/6 Gleb Natapov:
> > On Tue, Apr 06, 2010 at 01:11:23PM +0900, Yoshiaki Tamura wrote:
> >> Hi.
> >>
> >> When handle_io() is called, rip is currently proceeded *before* actually having
> >> I/O handled by qemu in userland.  Upon implementing Kemari for
> >> KVM(http://www.mail-archive.com/kvm@vger.kernel.org/msg25141.html) mainly in
> >> userland qemu, we encountered a problem that synchronizing the content of VCPU
> >> before handling I/O in qemu is too late because rip is already proceeded in KVM,
> >> Although we avoided this issue with temporal hack, I would like to ask a few
> >> question on skip_emulated_instructions.
> >>
> >> 1. Does rip need to be proceeded before having I/O handled by qemu?
> > In current kvm.git rip is proceeded before I/O is handled by qemu only
> > in case of "out" instruction. From architecture point of view I think
> > it's OK since on real HW you can't guaranty that I/O will take effect
> > before instruction pointer is advanced. It is done like that because we
> > want "out" emulation to be real fast so we skip x86 emulator.
>
> Thanks for your reply.
>
> If proceeding rip later doesn't break the behavior of devices or
> introduce slow down, I would like that to be done.
>
A device could not care less what value the rip register currently has.
Why does it matter for your code?

> >> 2. If no, is it possible to divide skip_emulated_instructions(), like
> >> rec_emulated_instructions() to remember to next_rip, and
> >> skip_emulated_instructions() to actually proceed the rip.
> > Currently only emulator can call userspace to do I/O, so after
> > userspace returns after I/O exit, control is handled back to emulator
> > unconditionally.  "out" instruction skips emulator, but there is nothing
> > to do after userspace returns, so regular cpu loop is executed. If we
> > want to advance rip only after userspace executed I/O done by "out" we
> > need to distinguish who requested I/O (emulator or kvm_fast_pio_out())
> > and call different code depending on who that was. It can be done by
> > having a callback that (if not null) is called on return from userspace.
>
> Your suggestion is to introduce a callback entry, and instead of
> calling kvm_rip_write(), set it to the entry before calling
> kvm_fast_pio_out(),
> and check the entry upon return from the userspace, correct?
>
Something like that, yes.

> According to the comment in x86.c, when it was "out" instruction
> vcpu->arch.pio.count is set to 0 to skip the emulator.
> To call kvm_fast_pio_out(), "!string" and "!in" must be set.
> If we can check, vcpu->arch.pio.count, "string" and "in" on return
> from the userspace, can't we distinguish who requested I/O, emulator
> or kvm_fast_pio_out()?
>
Maybe, but the callback approach is much cleaner. "string" and "in" can
have stale data, for instance.

> >> 3. svm has next_rip but when it is 0, nop is emulated.  Can this be modified to
> >> continue without emulating nop when next_rip is 0?
> >>
> > I don't see where nop is emulated if next_rip is 0.
> > As far as I see in case of next_rip==0 an instruction at rip is decoded
> > to figure out its length and then rip is advanced by instruction length.
> > Anyway next_rip is svm thing only.
>
> Sorry. I wasn't understanding the code enough.
>
> static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
> {
> ...
>         if (!svm->next_rip) {
>                 if (emulate_instruction(vcpu, 0, 0, EMULTYPE_SKIP) !=
>                                 EMULATE_DONE)
>                         printk(KERN_DEBUG "%s: NOP\n", __func__);
>                 return;
>         }
>
> Since the printk says NOP, I thought emulate_instruction was doing so...
>
> The reason I asked about next_rip is because I was hoping to use this
> entry to advance rip only after userspace executed I/O done by "out",
> like if next_rip is !0,
> call kvm_rip_write(), and introduce next_rip to vmx if it is usable
> because vmx is
> currently using local variable rip.
>
> Yoshi

--
Gleb.
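
For illustration only, the callback idea discussed under question 2 could
look roughly like the userspace model below. This is a sketch, not kernel
code and not a proposed patch: struct vcpu_model, its pending_rip and
complete_io fields, and every function name here are hypothetical stand-ins
invented for the example. It assumes the fast "out" path records the next
rip instead of writing it, and that a one-shot callback installs it on
return from userspace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct kvm_vcpu (not kernel code). */
struct vcpu_model {
    uint64_t rip;          /* current guest instruction pointer */
    uint64_t pending_rip;  /* rip to install once userspace has done the I/O */
    /* Hypothetical field: called on return from userspace if non-NULL. */
    void (*complete_io)(struct vcpu_model *vcpu);
};

/* Deferred skip: advance rip only after the "out" has been handled. */
static void complete_fast_pio_out(struct vcpu_model *vcpu)
{
    vcpu->rip = vcpu->pending_rip;
}

/* Models the fast "out" path: remember the next rip, do not write it yet. */
static void fast_pio_out(struct vcpu_model *vcpu, uint64_t next_rip)
{
    vcpu->pending_rip = next_rip;
    vcpu->complete_io = complete_fast_pio_out;
}

/* Models re-entry after the userspace I/O exit. */
static void return_from_userspace(struct vcpu_model *vcpu)
{
    if (vcpu->complete_io) {
        void (*cb)(struct vcpu_model *) = vcpu->complete_io;

        vcpu->complete_io = NULL;  /* one-shot: clear before calling */
        cb(vcpu);
    }
}

/* Walks one "out" exit and returns the rip seen after re-entry. */
uint64_t demo_out_exit(uint64_t rip, uint64_t insn_len)
{
    struct vcpu_model vcpu = { .rip = rip, .pending_rip = 0, .complete_io = NULL };

    fast_pio_out(&vcpu, rip + insn_len);
    assert(vcpu.rip == rip);  /* userspace would still observe the old rip */
    return_from_userspace(&vcpu);
    return vcpu.rip;
}
```

In this model the emulator path would simply leave complete_io NULL, so
return_from_userspace() does nothing extra for it, and no "string" or "in"
state has to be inspected to tell the two requesters apart.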