From mboxrd@z Thu Jan 1 00:00:00 1970 From: Christian Ehrhardt Date: Tue, 02 Sep 2008 07:30:30 +0000 Subject: Re: [PATCH 3/6] kvmppc: rewrite guest code - sprg0-3 Message-Id: <48BCEB96.7040309@linux.vnet.ibm.com> List-Id: References: <1219142205-12062-4-git-send-email-ehrhardt@linux.vnet.ibm.com> In-Reply-To: <1219142205-12062-4-git-send-email-ehrhardt@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable To: kvm-ppc@vger.kernel.org Liu Yu-B13201 wrote: >> -----Original Message----- >> From: Christian Ehrhardt [mailto:ehrhardt@linux.vnet.ibm.com]=20 >> Sent: Monday, September 01, 2008 8:15 PM >> To: Liu Yu-B13201 >> Cc: kvm-ppc@vger.kernel.org; hollisb@us.ibm.com >> Subject: Re: [PATCH 3/6] kvmppc: rewrite guest code - sprg0-3 >> >> =20 > > =20 >>>> + >>>> + down_read(¤t->mm->mmap_sem); >>>> + pcpage =3D gfn_to_page(vcpu->kvm, >>>> + guest_pcaddr >>=20 >>>> KVM_PPCPV_MAGIC_PAGE_SHIFT); >>>> + up_read(¤t->mm->mmap_sem); >>>> + if (pcpage =3D bad_page) >>>> + return -EFAULT; >>>> + >>>> + pcmem =3D kmap_atomic(pcpage, KM_USER0); >>>> + if (!pcmem) >>>> + return -EFAULT; >>>> + pcaddr =3D ((unsigned long)pcmem >>>> + | (vcpu->arch.pc &=20 >>>> KVM_PPCPV_MAGIC_PAGE_MASK)); >>>> + BUG_ON(inst !=3D *((u32 *)pcaddr)); >>>> + create_instruction(pcaddr, newinst); >>>> + kunmap_atomic(pcpage, KM_USER0); >>>> + } >>>> =20 >>>> =20 >>> I think you are writing new instruction back here. >>> Why not use kvm_write_guest_page() (kvm_main.c) directly? >>> =20 >>> =20 >> hmm I'm exchaning instructions in memory and the "create_instruction"=20 >> function I use is exactly doing that. >> I expect that the kvm function would miss the needed sync=20 >> functionality=20 >> to ensure that all old code is dropped from the cpu caches.=20 >> Well I'm not=20 >> an expert in that, I need to look into that. >> >> =20 > > It seems there wouldn't be cache consistency problem for PowerPC as long = as the memory is updated by processor. > Btw, where is the definition of create_instruction? > > =20 -> asm/system.h:521 >>> =20 >>> =20 >>>> + >>>> + return err; >>>> +} >>>> =20 >>>> /* XXX to do: >>>> * lhax >>>> @@ -260,6 +415,9 @@ >>>> int dcrn; >>>> enum emulation_result emulated =3D EMULATE_DONE; >>>> int advance =3D 1; >>>> + >>>> + if (kvmppc_has_pvmem(vcpu) &&=20 >>>> !kvmppc_pvmem_rewrite_instruction(vcpu)) >>>> + return EMULATE_DONE; >>>> =20 >>>> =20 >>> Rewirting instruction is one-off, so it's not that important to pay >>> attention to performance. >>> However, putting rewriting at the beginning of the function=20 >>> =20 >> add extra >> =20 >>> workload to other emulations. >>> >>> How about this way? >>> >>> int rewrite =3D 0; >>> >>> ....... >>> case mfspr0: >>> rewrite =3D 1; >>> ...... >>> >>> // at the end of the function >>> if(unlikely(rewrite) && kvmppc_has_pvmem(vcpu)) >>> kvmppc_pvmem_rewrite_instruction(vcpu); >>> =09 >>> =20 >>> =20 >> This would spread checks with kvmppc_has_pvmem all over the code,=20 >> because if we rewrite the instruction it is not allowed to emulate it. >> On the other hand we could emulate it and change the instruction for=20 >> their "next" hit (not jump back to current pc but at pc+4 as it is=20 >> emulated). >> Yeah that could work, and once most rewrite actions are done the=20 >> "unlikely" is the right thing because later on we won't hit=20 >> it anymore.=20 >> That should speed up emulation a bit (not much compared to exit/enter=20 >> guest, but it's never wrong to code things performance wise). >> >> =20 > > In fact, not only the unlikely thing, I am worried about that > putting kvmppc_pvmem_rewrite_instruction() at beginning=20 > will cosume some time for normal emulations, ( a lot comparisons and jump= s) > especially more and more instructions would be replaced in it in the futu= re. > > =20 Sure, I think I already got your point. This is what I described with=20 "emulate & rewrite it" allowing us to put the=20 kvmppc_pvmem_rewrite_instruction() at the end guarded by an additional=20 if that will eventually reduce the overhead to nearly zero once the=20 trapping code is rewritten. The code flow I have in mind is that: emulate instruction if this is a rewritable one set flag increase program counter because instruction is emulated if unlikely(rewritable) flag is set and kvm_has_pvmem call=20 kvmppc_pvmem_rewrite_instruction -> The next time guest executes that code it will already use the new=20 rewritten instruction -> non rewritable instructions that are emulated don't set the flag and=20 therefore don't call kvmppc_pvmem_rewrite_instruction saving overhead I'll submit an updated patch series somewhen this week. Then you can=20 check if I got your hint right :-) --=20 Gr=FCsse / regards,=20 Christian Ehrhardt IBM Linux Technology Center, Open Virtualization