From mboxrd@z Thu Jan 1 00:00:00 1970 From: Avi Kivity Subject: Re: RFC: cache_regs in kvm_emulate_pio Date: Fri, 20 Jun 2008 23:30:05 +0300 Message-ID: <485C134D.2070905@qumranet.com> References: <20080619233123.GA24183@dmt.cnet> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: kvm-devel To: Marcelo Tosatti Return-path: Received: from il.qumranet.com ([212.179.150.194]:26697 "EHLO il.qumranet.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1764218AbYFTUaH (ORCPT ); Fri, 20 Jun 2008 16:30:07 -0400 In-Reply-To: <20080619233123.GA24183@dmt.cnet> Sender: kvm-owner@vger.kernel.org List-ID: Marcelo Tosatti wrote: > Hi, > > From my understanding the ->cache_regs call on kvm_emulate_pio() is > necessary only on AMD, where vcpu->arch.regs[RAX] is not copied during > exit in svm_vcpu_load(). > > On both architectures, the remaining general purpose registers are saved > on exit. > > The following patch saves 100 cycles out of both light and heavy exits > on Intel (if correct, kvm_emulate_hypercall and complete_pio could also > benefit, thus saving 200 cycles for in-kernel devices). > ISTR vmwrite as 50 cycles and vmread as much lower. > BTW, the decache_regs(vcpu) call at the end of complete_pio() could also > be a noop on Intel from what I can tell ? > > I think so. decache_regs() is actually more important. > diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h > index 851184d..95a0736 100644 > --- a/include/asm-x86/kvm_host.h > +++ b/include/asm-x86/kvm_host.h > @@ -410,6 +410,7 @@ struct kvm_x86_ops { > unsigned long (*get_dr)(struct kvm_vcpu *vcpu, int dr); > void (*set_dr)(struct kvm_vcpu *vcpu, int dr, unsigned long value, > int *exception); > + void (*cache_rax)(struct kvm_vcpu *vcpu); > void (*cache_regs)(struct kvm_vcpu *vcpu); > void (*decache_regs)(struct kvm_vcpu *vcpu); > unsigned long (*get_rflags)(struct kvm_vcpu *vcpu); > ugh, another callback. how about instead /* in vcpu structure */ u16 regs_available; u16 regs_dirty; /* read from cache if possible */ if (!test_bit(VCPU_REG_RAX, ®s_available)) ->cache_regs(); printk("%d\n", regs[VCPU_REGS_RAX]); /* write to cache, ->vcpu_run() will flush */ regs[VCPU_REGS_RAX] = 17; __set_bit(VCPU_REGS_RAX, ®s_dirty); -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.