From: Cedric Le Goater
Subject: Re: [RFC PATCH] KVM: PPC: Book3S: MMIO emulation support for little endian guests
Date: Mon, 07 Oct 2013 16:23:49 +0200
Message-ID: <5252C3F5.4010908@fr.ibm.com>
References: <1380798224-27024-1-git-send-email-clg@fr.ibm.com>
To: Alexander Graf
Cc: paulus@samba.org, kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

Hi Alex,

On 10/04/2013 02:50 PM, Alexander Graf wrote:
> 
> On 03.10.2013, at 13:03, Cédric Le Goater wrote:
> 
>> MMIO emulation reads the last instruction executed by the guest
>> and then emulates it. If the guest is running in Little Endian mode,
>> the instruction needs to be byte-swapped before being emulated.
>>
>> This patch stores the last instruction in the endian order of the
>> host, primarily doing a byte-swap if needed. The common code
>> which fetches last_inst uses a helper routine kvmppc_need_byteswap(),
>> and the exit paths for the Book3S PR and HV guests use their own
>> version in assembly.
>>
>> kvmppc_emulate_instruction() also uses kvmppc_need_byteswap() to
>> determine in which endian order the MMIO needs to be done.
>>
>> The patch is based on Alex Graf's kvm-ppc-queue branch and it
>> has been tested on Big Endian and Little Endian HV guests and
>> Big Endian PR guests.
>>
>> Signed-off-by: Cédric Le Goater
>> ---
>>
>> Here are some comments/questions:
>>
>> * the host is assumed to be running in Big Endian. When Little Endian
>>   hosts are supported in the future, we will use the cpu features to
>>   fix kvmppc_need_byteswap()
>>
>> * the 'is_bigendian' parameter of the routines kvmppc_handle_load()
>>   and kvmppc_handle_store() seems redundant, but the *BRX opcodes
>>   make the improvements unclear. We could eventually rename the
>>   parameter to 'byteswap' and the attribute vcpu->arch.mmio_is_bigendian
>>   to vcpu->arch.mmio_need_byteswap. Anyhow, the current naming sucks
>>   and I would be happy to have some directions to fix it.
>>
>>  arch/powerpc/include/asm/kvm_book3s.h   |   15 ++++++-
>>  arch/powerpc/kvm/book3s_64_mmu_hv.c     |    4 ++
>>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |   14 +++++-
>>  arch/powerpc/kvm/book3s_segment.S       |   14 +++++-
>>  arch/powerpc/kvm/emulate.c              |   71 +++++++++++++++++--------------
>>  5 files changed, 83 insertions(+), 35 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
>> index 0ec00f4..36c5573 100644
>> --- a/arch/powerpc/include/asm/kvm_book3s.h
>> +++ b/arch/powerpc/include/asm/kvm_book3s.h
>> @@ -270,14 +270,22 @@ static inline ulong kvmppc_get_pc(struct kvm_vcpu *vcpu)
>>  	return vcpu->arch.pc;
>>  }
>>
>> +static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
>> +{
>> +	return vcpu->arch.shared->msr & MSR_LE;
>> +}
>> +
>>  static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
>>  {
>>  	ulong pc = kvmppc_get_pc(vcpu);
>>
>>  	/* Load the instruction manually if it failed to do so in the
>>  	 * exit path */
>> -	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
>> +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
>>  		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
>> +		if (kvmppc_need_byteswap(vcpu))
>> +			vcpu->arch.last_inst = swab32(vcpu->arch.last_inst);
> 
> Could you please introduce a new helper to load 32bit numbers? Something
> like kvmppc_ldl or kvmppc_ld32. That'll be easier to read here then :).

ok. I did something in that spirit in the next patchset I am about to send.
I will respin if needed, but there is one fuzzy area: kvmppc_read_inst().
It calls kvmppc_get_last_inst() and then kvmppc_ld() again. Is that
actually useful?
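Something like the sketch below is what I had in mind — the name follows
your kvmppc_ld32 suggestion, so take it as an untested sketch rather than
the final helper:

	/*
	 * Hypothetical helper along the lines suggested above: fetch a
	 * 32-bit word from the guest and byte-swap it when the guest
	 * runs in Little Endian mode. kvmppc_ld(), kvmppc_need_byteswap()
	 * and swab32() are used exactly as in the patch; the name and
	 * signature of kvmppc_ld32() are assumptions.
	 */
	static inline int kvmppc_ld32(struct kvm_vcpu *vcpu, ulong *eaddr,
				      u32 *val, bool data)
	{
		int ret = kvmppc_ld(vcpu, eaddr, sizeof(u32), val, data);

		if (ret == EMULATE_DONE && kvmppc_need_byteswap(vcpu))
			*val = swab32(*val);

		return ret;
	}

kvmppc_get_last_inst(), kvmppc_get_last_sc() and kvmppc_hv_emulate_mmio()
could then share the same fetch-and-swap logic instead of open coding the
swab32() call three times.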
>> +	}
>>
>>  	return vcpu->arch.last_inst;
>>  }
>> @@ -293,8 +301,11 @@ static inline u32 kvmppc_get_last_sc(struct kvm_vcpu *vcpu)
>>
>>  	/* Load the instruction manually if it failed to do so in the
>>  	 * exit path */
>> -	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
>> +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
>>  		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
>> +		if (kvmppc_need_byteswap(vcpu))
>> +			vcpu->arch.last_inst = swab32(vcpu->arch.last_inst);
>> +	}
>>
>>  	return vcpu->arch.last_inst;
>>  }
>> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> index 3a89b85..28130c7 100644
>> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> @@ -547,6 +547,10 @@ static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>  		ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
>>  		if (ret != EMULATE_DONE || last_inst == KVM_INST_FETCH_FAILED)
>>  			return RESUME_GUEST;
>> +
>> +		if (kvmppc_need_byteswap(vcpu))
>> +			last_inst = swab32(last_inst);
>> +
>>  		vcpu->arch.last_inst = last_inst;
>>  	}
>>
>> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> index dd80953..1d3ee40 100644
>> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> @@ -1393,14 +1393,26 @@ fast_interrupt_c_return:
>>  	lwz	r8, 0(r10)
>>  	mtmsrd	r3
>>
>> +	ld	r0, VCPU_MSR(r9)
>> +
>> +	/* r10 = vcpu->arch.msr & MSR_LE */
>> +	rldicl	r10, r0, 0, 63
> 
> rldicl.?

sure.

>> +	cmpdi	r10, 0
>> +	bne	2f
> 
> I think it makes sense to inline that branch in here instead. Just make this
> 
> 	stw	r8, VCPU_LAST_INST(r9)
> 	beq	after_inst_store
> 	/* Little endian instruction, swap for big endian hosts */
> 	addi	...
> 	stwbrx	...
> 
> after_inst_store:
> 
> The duplicate store shouldn't really hurt too badly, but in our "fast path"
> we're only doing one store anyway :). And the code becomes more readable.

It is indeed more readable. I have changed that.

>> +
>>  	/* Store the result */
>>  	stw	r8, VCPU_LAST_INST(r9)
>>
>>  	/* Unset guest mode. */
>> -	li	r0, KVM_GUEST_MODE_NONE
>> +1:	li	r0, KVM_GUEST_MODE_NONE
>>  	stb	r0, HSTATE_IN_GUEST(r13)
>>  	b	guest_exit_cont
>>
>> +	/* Swap and store the result */
>> +2:	addi	r11, r9, VCPU_LAST_INST
>> +	stwbrx	r8, 0, r11
>> +	b	1b
>> +
>>  /*
>>   * Similarly for an HISI, reflect it to the guest as an ISI unless
>>   * it is an HPTE not found fault for a page that we have paged out.
>> diff --git a/arch/powerpc/kvm/book3s_segment.S b/arch/powerpc/kvm/book3s_segment.S
>> index 1abe478..bf20b45 100644
>> --- a/arch/powerpc/kvm/book3s_segment.S
>> +++ b/arch/powerpc/kvm/book3s_segment.S
>> @@ -287,7 +287,19 @@ ld_last_inst:
>>  	sync
>>
>>  #endif
>> -	stw	r0, SVCPU_LAST_INST(r13)
>> +	ld	r8, SVCPU_SHADOW_SRR1(r13)
>> +
>> +	/* r10 = vcpu->arch.msr & MSR_LE */
>> +	rldicl	r10, r0, 0, 63
>> +	cmpdi	r10, 0
>> +	beq	1f
>> +
>> +	/* swap and store the result */
>> +	addi	r11, r13, SVCPU_LAST_INST
>> +	stwbrx	r0, 0, r11
>> +	b	no_ld_last_inst
>> +
>> +1:	stw	r0, SVCPU_LAST_INST(r13)
>>
>>  no_ld_last_inst:
>>
>> diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
>> index 751cd45..20529ca 100644
>> --- a/arch/powerpc/kvm/emulate.c
>> +++ b/arch/powerpc/kvm/emulate.c
>> @@ -232,6 +232,7 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
>>  	int sprn = get_sprn(inst);
>>  	enum emulation_result emulated = EMULATE_DONE;
>>  	int advance = 1;
>> +	int dont_byteswap = !kvmppc_need_byteswap(vcpu);
> 
> The parameter to kvmppc_handle_load is "is_bigendian", which is also the
> flag that we interpret for our byte swaps later. I think we should preserve
> that semantic. Please call your variable "is_bigendian" and create a
> separate helper for that one.
> 
> When little endian host kernels come, we only need to change the way
> kvmppc_complete_mmio_load and kvmppc_handle_store swap things - probably
> according to user space endianness even.

ok.
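Something like this is what I understand, then — kvmppc_is_bigendian() is
only a placeholder name, and the MSR_LE test is the same one used by
kvmppc_need_byteswap() above:

	/*
	 * Hypothetical helper preserving the "is_bigendian" semantic of
	 * kvmppc_handle_load() and kvmppc_handle_store(): it reports the
	 * guest's current byte order from MSR_LE but keeps the original
	 * meaning of the flag at the call sites.
	 */
	static inline bool kvmppc_is_bigendian(struct kvm_vcpu *vcpu)
	{
		return !(vcpu->arch.shared->msr & MSR_LE);
	}

The load/store emulation would then pass kvmppc_is_bigendian(vcpu) where a
constant is passed today, the *BRX opcodes would pass the inverted value,
and the swap decision itself would stay in kvmppc_complete_mmio_load() and
kvmppc_handle_store(), ready to be adjusted when Little Endian hosts come.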
Thanks for the review, Alex. I will be sending a new patchset shortly.

C.