From: Marcelo Tosatti
Subject: Re: [PATCHv2 5/5] KVM: Provide fast path for "rep ins" emulation if possible.
Date: Fri, 29 Jun 2012 19:26:38 -0300
Message-ID: <20120629222638.GA12437@amt.cnet>
References: <1339502487-30049-1-git-send-email-gleb@redhat.com> <1339502487-30049-6-git-send-email-gleb@redhat.com>
In-Reply-To: <1339502487-30049-6-git-send-email-gleb@redhat.com>
To: Gleb Natapov
Cc: kvm@vger.kernel.org, avi@redhat.com

On Tue, Jun 12, 2012 at 03:01:27PM +0300, Gleb Natapov wrote:
> "rep ins" emulation is going through emulator now. This is slow because
> emulator knows how to write back only one datum at a time. This patch
> provides fast path for the instruction in certain conditions. The
> conditions are: DF flag is not set, destination memory is RAM and single
> datum does not cross page boundary. If fast path code fails it falls
> back to emulation.
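
For reference, the three conditions listed above can be written down as a
standalone predicate. This is only a sketch to make the conditions concrete:
the names (rep_ins_fast_path_ok, datum_within_page, SKETCH_*) are made up
here and are not from the patch, 4 KiB pages are assumed, and "destination
is RAM" is passed in as a flag because deciding that needs the memslot
lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u       /* assumption: 4 KiB pages */
#define SKETCH_EFLAGS_DF (1u << 10)  /* direction flag is bit 10 of EFLAGS */

/* True if the datum [addr, addr + size) lies entirely within one page. */
static bool datum_within_page(uint64_t addr, unsigned int size)
{
	assert(size == 1 || size == 2 || size == 4);
	return (addr % SKETCH_PAGE_SIZE) + size <= SKETCH_PAGE_SIZE;
}

/*
 * Hypothetical eligibility check mirroring the commit message:
 * DF clear, destination is RAM, first datum does not cross a page.
 * A false result means "fall back to the emulator".
 */
static bool rep_ins_fast_path_ok(uint64_t rflags, bool dest_is_ram,
				 uint64_t addr, unsigned int size)
{
	if (rflags & SKETCH_EFLAGS_DF)
		return false;
	if (!dest_is_ram)
		return false;
	return datum_within_page(addr, size);
}
```

With this, a 4-byte datum at page offset 0xffc is still eligible, while one
at offset 0xffe is not.
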
>
> Signed-off-by: Gleb Natapov
> ---
>  arch/x86/include/asm/kvm_host.h |    6 ++
>  arch/x86/kvm/svm.c              |   20 +++++--
>  arch/x86/kvm/vmx.c              |   25 +++++--
>  arch/x86/kvm/x86.c              |  133 ++++++++++++++++++++++++++++++++++++--
>  4 files changed, 165 insertions(+), 19 deletions(-)

> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 7a41878..f3e7bb3 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -1887,21 +1887,31 @@ static int io_interception(struct vcpu_svm *svm)
>  {
>  	struct kvm_vcpu *vcpu = &svm->vcpu;
>  	u32 io_info = svm->vmcb->control.exit_info_1; /* address size bug? */
> -	int size, in, string;
> +	int size, in, string, rep;
>  	unsigned port;
>  
>  	++svm->vcpu.stat.io_exits;
>  	string = (io_info & SVM_IOIO_STR_MASK) != 0;
> +	rep = (io_info & SVM_IOIO_REP_MASK) != 0;
>  	in = (io_info & SVM_IOIO_TYPE_MASK) != 0;
> -	if (string || in)
> -		return emulate_instruction(vcpu, 0) == EMULATE_DONE;
>  
>  	port = io_info >> 16;
>  	size = (io_info & SVM_IOIO_SIZE_MASK) >> SVM_IOIO_SIZE_SHIFT;
>  	svm->next_rip = svm->vmcb->control.exit_info_2;
> -	skip_emulated_instruction(&svm->vcpu);
>  
> -	return kvm_fast_pio_out(vcpu, size, port);
> +	if (!string && !in) {
> +		skip_emulated_instruction(&svm->vcpu);
> +		return kvm_fast_pio_out(vcpu, size, port);
> +	} else if (string && in && rep) {

Is there a reason to restrict optimization to rep? That is, it should
be easy to extend to normal in?
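
To make the question concrete, here is a rough standalone sketch of the
dispatch with the rep requirement made optional. Everything in it is
hypothetical (enum pio_path, classify_pio, ins_iterations, the
allow_non_rep switch), and it assumes that whatever does not match a fast
path still goes to the emulator, which the quoted hunk does not show. It
also reads "normal in" as a string in without the rep prefix, i.e. a
single iteration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome of classifying an I/O intercept. */
enum pio_path {
	PIO_FAST_OUT,	/* plain "out": existing fast path */
	PIO_FAST_INS,	/* string "in": candidate for the new fast path */
	PIO_EMULATE	/* assumed: everything else goes to the emulator */
};

static enum pio_path classify_pio(bool string, bool in, bool rep,
				  bool allow_non_rep)
{
	if (!string && !in)
		return PIO_FAST_OUT;
	if (string && in && (rep || allow_non_rep))
		return PIO_FAST_INS;
	return PIO_EMULATE;
}

/* Without a rep prefix the instruction transfers exactly one datum. */
static uint64_t ins_iterations(bool rep, uint64_t rcx)
{
	return rep ? rcx : 1;
}
```

With allow_non_rep set, a non-rep string in takes the same path with an
iteration count of 1 instead of rcx.
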
> +		kvm_x86_ops->skip_emulated_instruction(vcpu);
> +		return EMULATE_DONE;
> +	}
> +	if (kvm_get_rflags(vcpu) & X86_EFLAGS_DF)
> +		return EMULATE_FAIL;
> +	if (ad_bytes_idx > 2)
> +		return EMULATE_FAIL;
> +
> +	ad_bytes = (u8[]){2, 4, 8}[ad_bytes_idx];
> +
> +	rdi = kvm_address_mask(ad_bytes, rdi);
> +
> +	count = (PAGE_SIZE - offset_in_page(rdi))/size;
> +
> +	if (count == 0) /* 'in' crosses page boundry */
> +		return EMULATE_FAIL;
> +
> +	count = min(count, kvm_address_mask(ad_bytes, rcx));
> +
> +	r = kvm_linearize_address(vcpu, get_emulation_mode(vcpu),
> +			rdi, VCPU_SREG_ES, count, true, false, ad_bytes,
> +			&linear_addr);

kvm_linearize_address expects size parameter in bytes?
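
To spell out the concern: count above is a number of elements, not bytes.
If the helper does take a byte length (that is the question, not a
statement about what it does), the value passed would need to be
count * size. A standalone sketch of the two quantities, with made-up
names (struct ins_span, ins_span_in_page) and 4 KiB pages assumed:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Element count and the matching byte length for one fast-path chunk. */
struct ins_span {
	uint64_t count;	/* elements that fit before the page ends, capped by rcx */
	uint64_t bytes;	/* count * size: what a byte-length parameter would want */
};

static struct ins_span ins_span_in_page(uint64_t rdi, uint64_t rcx,
					unsigned int size)
{
	struct ins_span s;
	uint64_t room;

	assert(size == 1 || size == 2 || size == 4);

	/* Same arithmetic as the quoted hunk: whole elements left in the page. */
	room = (SKETCH_PAGE_SIZE - (rdi % SKETCH_PAGE_SIZE)) / size;
	s.count = room < rcx ? room : rcx;
	s.bytes = s.count * size;
	return s;
}
```

For a 4-byte access starting 16 bytes before the end of a page, count is 4
but the span that actually gets written is 16 bytes, so a segment limit
check done with count alone would cover only a quarter of it.
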