kvm.vger.kernel.org archive mirror
* [RFC PATCH] KVM: PPC: Book3S: MMIO emulation support for little endian guests
@ 2013-10-03 11:03 Cédric Le Goater
  2013-10-04 12:50 ` Alexander Graf
  2013-10-04 13:48 ` Aneesh Kumar K.V
  0 siblings, 2 replies; 6+ messages in thread
From: Cédric Le Goater @ 2013-10-03 11:03 UTC (permalink / raw)
  To: agraf, paulus; +Cc: kvm-ppc, kvm, Cédric Le Goater

MMIO emulation reads the last instruction executed by the guest
and then emulates it. If the guest is running in Little Endian mode,
the instruction needs to be byte-swapped before being emulated.

This patch stores the last instruction in the endian order of the
host, byte-swapping it first if needed. The common code which
fetches last_inst uses a helper routine, kvmppc_need_byteswap(),
and the exit paths for the Book3S PR and HV guests use their own
version in assembly.

kvmppc_emulate_instruction() also uses kvmppc_need_byteswap() to
decide in which endian order the MMIO needs to be done.

The patch is based on Alex Graf's kvm-ppc-queue branch and it
has been tested on Big Endian and Little Endian HV guests and
Big Endian PR guests.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
---

Here are some comments/questions:

 * the host is assumed to be running in Big Endian. When Little Endian
   hosts are supported in the future, we will use the CPU features to
   fix kvmppc_need_byteswap()

 * the 'is_bigendian' parameter of the routines kvmppc_handle_load()
   and kvmppc_handle_store() seems redundant, but the *BRX opcodes
   make it unclear whether removing it would be an improvement. We
   could eventually rename the parameter to 'byteswap' and the
   attribute vcpu->arch.mmio_is_bigendian to
   vcpu->arch.mmio_need_byteswap. Anyhow, the current naming sucks
   and I would be happy to have some directions to fix it.
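
As a self-contained illustration of the problem the swap addresses
(not part of the patch; the instruction encoding is just an example):
a Big Endian host that fetches a 4-byte instruction stored by a
Little Endian guest sees the bytes reversed, and swab32() restores
the expected opcode.

#include <stdint.h>
#include <stdio.h>

/* Byte-reverse a 32-bit word, same effect as the kernel's swab32(). */
static uint32_t swab32(uint32_t x)
{
	return ((x & 0x000000ffu) << 24) | ((x & 0x0000ff00u) <<  8) |
	       ((x & 0x00ff0000u) >>  8) | ((x & 0xff000000u) >> 24);
}

int main(void)
{
	/* 'lwz r5,0(r4)' encodes as 0x80a40000. Stored by an LE guest,
	 * it is read back by a BE host as 0x0000a480. */
	uint32_t last_inst = 0x0000a480;

	printf("0x%08x\n", swab32(last_inst));	/* prints 0x80a40000 */
	return 0;
}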



 arch/powerpc/include/asm/kvm_book3s.h   |   15 ++++++-
 arch/powerpc/kvm/book3s_64_mmu_hv.c     |    4 ++
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |   14 +++++-
 arch/powerpc/kvm/book3s_segment.S       |   14 +++++-
 arch/powerpc/kvm/emulate.c              |   71 +++++++++++++++++--------------
 5 files changed, 83 insertions(+), 35 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 0ec00f4..36c5573 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -270,14 +270,22 @@ static inline ulong kvmppc_get_pc(struct kvm_vcpu *vcpu)
 	return vcpu->arch.pc;
 }
 
+static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.shared->msr & MSR_LE;
+}
+
 static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
 {
 	ulong pc = kvmppc_get_pc(vcpu);
 
 	/* Load the instruction manually if it failed to do so in the
 	 * exit path */
-	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
+	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
 		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
+		if (kvmppc_need_byteswap(vcpu))
+			vcpu->arch.last_inst = swab32(vcpu->arch.last_inst);
+	}
 
 	return vcpu->arch.last_inst;
 }
@@ -293,8 +301,11 @@ static inline u32 kvmppc_get_last_sc(struct kvm_vcpu *vcpu)
 
 	/* Load the instruction manually if it failed to do so in the
 	 * exit path */
-	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
+	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
 		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
+		if (kvmppc_need_byteswap(vcpu))
+			vcpu->arch.last_inst = swab32(vcpu->arch.last_inst);
+	}
 
 	return vcpu->arch.last_inst;
 }
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 3a89b85..28130c7 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -547,6 +547,10 @@ static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
 		if (ret != EMULATE_DONE || last_inst == KVM_INST_FETCH_FAILED)
 			return RESUME_GUEST;
+
+		if (kvmppc_need_byteswap(vcpu))
+			last_inst = swab32(last_inst);
+
 		vcpu->arch.last_inst = last_inst;
 	}
 
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index dd80953..1d3ee40 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1393,14 +1393,26 @@ fast_interrupt_c_return:
 	lwz	r8, 0(r10)
 	mtmsrd	r3
 
+	ld	r0, VCPU_MSR(r9)
+
+	/* r10 = vcpu->arch.msr & MSR_LE */
+	rldicl	r10, r0, 0, 63
+	cmpdi	r10, 0
+	bne	2f
+
 	/* Store the result */
 	stw	r8, VCPU_LAST_INST(r9)
 
 	/* Unset guest mode. */
-	li	r0, KVM_GUEST_MODE_NONE
+1:	li	r0, KVM_GUEST_MODE_NONE
 	stb	r0, HSTATE_IN_GUEST(r13)
 	b	guest_exit_cont
 
+	/* Swap and store the result */
+2:	addi	r11, r9, VCPU_LAST_INST
+	stwbrx	r8, 0, r11
+	b	1b
+
 /*
  * Similarly for an HISI, reflect it to the guest as an ISI unless
  * it is an HPTE not found fault for a page that we have paged out.
diff --git a/arch/powerpc/kvm/book3s_segment.S b/arch/powerpc/kvm/book3s_segment.S
index 1abe478..bf20b45 100644
--- a/arch/powerpc/kvm/book3s_segment.S
+++ b/arch/powerpc/kvm/book3s_segment.S
@@ -287,7 +287,19 @@ ld_last_inst:
 	sync
 
 #endif
-	stw	r0, SVCPU_LAST_INST(r13)
+	ld	r8, SVCPU_SHADOW_SRR1(r13)
+
+	/* r10 = vcpu->arch.msr & MSR_LE */
+	rldicl	r10, r0, 0, 63
+	cmpdi	r10, 0
+	beq	1f
+
+	/* swap and store the result */
+	addi	r11, r13, SVCPU_LAST_INST
+	stwbrx	r0, 0, r11
+	b	no_ld_last_inst
+
+1:	stw	r0, SVCPU_LAST_INST(r13)
 
 no_ld_last_inst:
 
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index 751cd45..20529ca 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -232,6 +232,7 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 	int sprn = get_sprn(inst);
 	enum emulation_result emulated = EMULATE_DONE;
 	int advance = 1;
+	int dont_byteswap = !kvmppc_need_byteswap(vcpu);
 
 	/* this default type might be overwritten by subcategories */
 	kvmppc_set_exit_type(vcpu, EMULATED_INST_EXITS);
@@ -266,47 +267,53 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 			advance = 0;
 			break;
 		case OP_31_XOP_LWZX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 4, 1);
+			emulated = kvmppc_handle_load(run, vcpu, rt, 4,
+						      dont_byteswap);
 			break;
 
 		case OP_31_XOP_LBZX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 1, 1);
+			emulated = kvmppc_handle_load(run, vcpu, rt, 1,
+						      dont_byteswap);
 			break;
 
 		case OP_31_XOP_LBZUX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 1, 1);
+			emulated = kvmppc_handle_load(run, vcpu, rt, 1,
+						      dont_byteswap);
 			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
 			break;
 
 		case OP_31_XOP_STWX:
 			emulated = kvmppc_handle_store(run, vcpu,
 						       kvmppc_get_gpr(vcpu, rs),
-			                               4, 1);
+						       4, dont_byteswap);
 			break;
 
 		case OP_31_XOP_STBX:
 			emulated = kvmppc_handle_store(run, vcpu,
 						       kvmppc_get_gpr(vcpu, rs),
-			                               1, 1);
+						       1, dont_byteswap);
 			break;
 
 		case OP_31_XOP_STBUX:
 			emulated = kvmppc_handle_store(run, vcpu,
 						       kvmppc_get_gpr(vcpu, rs),
-			                               1, 1);
+						       1, dont_byteswap);
 			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
 			break;
 
 		case OP_31_XOP_LHAX:
-			emulated = kvmppc_handle_loads(run, vcpu, rt, 2, 1);
+			emulated = kvmppc_handle_loads(run, vcpu, rt, 2,
+						       dont_byteswap);
 			break;
 
 		case OP_31_XOP_LHZX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 2, 1);
+			emulated = kvmppc_handle_load(run, vcpu, rt, 2,
+						      dont_byteswap);
 			break;
 
 		case OP_31_XOP_LHZUX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 2, 1);
+			emulated = kvmppc_handle_load(run, vcpu, rt, 2,
+						      dont_byteswap);
 			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
 			break;
 
@@ -317,13 +324,13 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 		case OP_31_XOP_STHX:
 			emulated = kvmppc_handle_store(run, vcpu,
 						       kvmppc_get_gpr(vcpu, rs),
-			                               2, 1);
+						       2, dont_byteswap);
 			break;
 
 		case OP_31_XOP_STHUX:
 			emulated = kvmppc_handle_store(run, vcpu,
 						       kvmppc_get_gpr(vcpu, rs),
-			                               2, 1);
+						       2, dont_byteswap);
 			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
 			break;
 
@@ -342,7 +349,8 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 			break;
 
 		case OP_31_XOP_LWBRX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 4, 0);
+			emulated = kvmppc_handle_load(run, vcpu, rt, 4,
+						      !dont_byteswap);
 			break;
 
 		case OP_31_XOP_TLBSYNC:
@@ -351,17 +359,18 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 		case OP_31_XOP_STWBRX:
 			emulated = kvmppc_handle_store(run, vcpu,
 						       kvmppc_get_gpr(vcpu, rs),
-			                               4, 0);
+						       4, !dont_byteswap);
 			break;
 
 		case OP_31_XOP_LHBRX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 2, 0);
+			emulated = kvmppc_handle_load(run, vcpu, rt, 2,
+						      !dont_byteswap);
 			break;
 
 		case OP_31_XOP_STHBRX:
 			emulated = kvmppc_handle_store(run, vcpu,
 						       kvmppc_get_gpr(vcpu, rs),
-			                               2, 0);
+						       2, !dont_byteswap);
 			break;
 
 		default:
@@ -371,33 +380,33 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 		break;
 
 	case OP_LWZ:
-		emulated = kvmppc_handle_load(run, vcpu, rt, 4, 1);
+		emulated = kvmppc_handle_load(run, vcpu, rt, 4, dont_byteswap);
 		break;
 
 	/* TBD: Add support for other 64 bit load variants like ldu, ldux, ldx etc. */
 	case OP_LD:
 		rt = get_rt(inst);
-		emulated = kvmppc_handle_load(run, vcpu, rt, 8, 1);
+		emulated = kvmppc_handle_load(run, vcpu, rt, 8, dont_byteswap);
 		break;
 
 	case OP_LWZU:
-		emulated = kvmppc_handle_load(run, vcpu, rt, 4, 1);
+		emulated = kvmppc_handle_load(run, vcpu, rt, 4, dont_byteswap);
 		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
 		break;
 
 	case OP_LBZ:
-		emulated = kvmppc_handle_load(run, vcpu, rt, 1, 1);
+		emulated = kvmppc_handle_load(run, vcpu, rt, 1, dont_byteswap);
 		break;
 
 	case OP_LBZU:
-		emulated = kvmppc_handle_load(run, vcpu, rt, 1, 1);
+		emulated = kvmppc_handle_load(run, vcpu, rt, 1, dont_byteswap);
 		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
 		break;
 
 	case OP_STW:
 		emulated = kvmppc_handle_store(run, vcpu,
 					       kvmppc_get_gpr(vcpu, rs),
-		                               4, 1);
+					       4, dont_byteswap);
 		break;
 
 	/* TBD: Add support for other 64 bit store variants like stdu, stdux, stdx etc. */
@@ -405,57 +414,57 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 		rs = get_rs(inst);
 		emulated = kvmppc_handle_store(run, vcpu,
 					       kvmppc_get_gpr(vcpu, rs),
-		                               8, 1);
+					       8, dont_byteswap);
 		break;
 
 	case OP_STWU:
 		emulated = kvmppc_handle_store(run, vcpu,
 					       kvmppc_get_gpr(vcpu, rs),
-		                               4, 1);
+					       4, dont_byteswap);
 		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
 		break;
 
 	case OP_STB:
 		emulated = kvmppc_handle_store(run, vcpu,
 					       kvmppc_get_gpr(vcpu, rs),
-		                               1, 1);
+					       1, dont_byteswap);
 		break;
 
 	case OP_STBU:
 		emulated = kvmppc_handle_store(run, vcpu,
 					       kvmppc_get_gpr(vcpu, rs),
-		                               1, 1);
+					       1, dont_byteswap);
 		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
 		break;
 
 	case OP_LHZ:
-		emulated = kvmppc_handle_load(run, vcpu, rt, 2, 1);
+		emulated = kvmppc_handle_load(run, vcpu, rt, 2, dont_byteswap);
 		break;
 
 	case OP_LHZU:
-		emulated = kvmppc_handle_load(run, vcpu, rt, 2, 1);
+		emulated = kvmppc_handle_load(run, vcpu, rt, 2, dont_byteswap);
 		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
 		break;
 
 	case OP_LHA:
-		emulated = kvmppc_handle_loads(run, vcpu, rt, 2, 1);
+		emulated = kvmppc_handle_loads(run, vcpu, rt, 2, dont_byteswap);
 		break;
 
 	case OP_LHAU:
-		emulated = kvmppc_handle_loads(run, vcpu, rt, 2, 1);
+		emulated = kvmppc_handle_loads(run, vcpu, rt, 2, dont_byteswap);
 		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
 		break;
 
 	case OP_STH:
 		emulated = kvmppc_handle_store(run, vcpu,
 					       kvmppc_get_gpr(vcpu, rs),
-		                               2, 1);
+					       2, dont_byteswap);
 		break;
 
 	case OP_STHU:
 		emulated = kvmppc_handle_store(run, vcpu,
 					       kvmppc_get_gpr(vcpu, rs),
-		                               2, 1);
+					       2, dont_byteswap);
 		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
 		break;
 
-- 
1.7.10.4


* Re: [RFC PATCH] KVM: PPC: Book3S: MMIO emulation support for little endian guests
  2013-10-03 11:03 [RFC PATCH] KVM: PPC: Book3S: MMIO emulation support for little endian guests Cédric Le Goater
@ 2013-10-04 12:50 ` Alexander Graf
  2013-10-07 14:23   ` Cedric Le Goater
  2013-10-04 13:48 ` Aneesh Kumar K.V
  1 sibling, 1 reply; 6+ messages in thread
From: Alexander Graf @ 2013-10-04 12:50 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: paulus, kvm-ppc, kvm


On 03.10.2013, at 13:03, Cédric Le Goater wrote:

> MMIO emulation reads the last instruction executed by the guest
> and then emulates it. If the guest is running in Little Endian mode,
> the instruction needs to be byte-swapped before being emulated.
> 
> This patch stores the last instruction in the endian order of the
> host, byte-swapping it first if needed. The common code which
> fetches last_inst uses a helper routine, kvmppc_need_byteswap(),
> and the exit paths for the Book3S PR and HV guests use their own
> version in assembly.
> 
> kvmppc_emulate_instruction() also uses kvmppc_need_byteswap() to
> decide in which endian order the MMIO needs to be done.
> 
> The patch is based on Alex Graf's kvm-ppc-queue branch and it
> has been tested on Big Endian and Little Endian HV guests and
> Big Endian PR guests.
> 
> Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
> ---
> 
> Here are some comments/questions:
> 
> * the host is assumed to be running in Big Endian. When Little Endian
>   hosts are supported in the future, we will use the CPU features to
>   fix kvmppc_need_byteswap()
> 
> * the 'is_bigendian' parameter of the routines kvmppc_handle_load()
>   and kvmppc_handle_store() seems redundant, but the *BRX opcodes
>   make it unclear whether removing it would be an improvement. We
>   could eventually rename the parameter to 'byteswap' and the
>   attribute vcpu->arch.mmio_is_bigendian to
>   vcpu->arch.mmio_need_byteswap. Anyhow, the current naming sucks
>   and I would be happy to have some directions to fix it.
> 
> 
> 
> arch/powerpc/include/asm/kvm_book3s.h   |   15 ++++++-
> arch/powerpc/kvm/book3s_64_mmu_hv.c     |    4 ++
> arch/powerpc/kvm/book3s_hv_rmhandlers.S |   14 +++++-
> arch/powerpc/kvm/book3s_segment.S       |   14 +++++-
> arch/powerpc/kvm/emulate.c              |   71 +++++++++++++++++--------------
> 5 files changed, 83 insertions(+), 35 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
> index 0ec00f4..36c5573 100644
> --- a/arch/powerpc/include/asm/kvm_book3s.h
> +++ b/arch/powerpc/include/asm/kvm_book3s.h
> @@ -270,14 +270,22 @@ static inline ulong kvmppc_get_pc(struct kvm_vcpu *vcpu)
> 	return vcpu->arch.pc;
> }
> 
> +static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
> +{
> +	return vcpu->arch.shared->msr & MSR_LE;
> +}
> +
> static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
> {
> 	ulong pc = kvmppc_get_pc(vcpu);
> 
> 	/* Load the instruction manually if it failed to do so in the
> 	 * exit path */
> -	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
> +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
> 		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
> +		if (kvmppc_need_byteswap(vcpu))
> +			vcpu->arch.last_inst = swab32(vcpu->arch.last_inst);

Could you please introduce a new helper to load 32bit numbers? Something like kvmppc_ldl or kvmppc_ld32. That'll be easier to read here then :).
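
A minimal sketch of what such a helper could look like, folding the
byte-swap into the load (the name and exact signature here are only a
guess based on the suggestion above):

static inline int kvmppc_ld32(struct kvm_vcpu *vcpu, ulong *eaddr,
			      u32 *ptr, bool data)
{
	/* Load 4 bytes and fix up the byte order for LE guests. */
	int ret = kvmppc_ld(vcpu, eaddr, sizeof(u32), ptr, data);

	if (ret == EMULATE_DONE && kvmppc_need_byteswap(vcpu))
		*ptr = swab32(*ptr);

	return ret;
}

kvmppc_get_last_inst() would then shrink to a single
kvmppc_ld32(vcpu, &pc, &vcpu->arch.last_inst, false) call.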

> +	}
> 
> 	return vcpu->arch.last_inst;
> }
> @@ -293,8 +301,11 @@ static inline u32 kvmppc_get_last_sc(struct kvm_vcpu *vcpu)
> 
> 	/* Load the instruction manually if it failed to do so in the
> 	 * exit path */
> -	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
> +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
> 		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
> +		if (kvmppc_need_byteswap(vcpu))
> +			vcpu->arch.last_inst = swab32(vcpu->arch.last_inst);
> +	}
> 
> 	return vcpu->arch.last_inst;
> }
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> index 3a89b85..28130c7 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> @@ -547,6 +547,10 @@ static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
> 		ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
> 		if (ret != EMULATE_DONE || last_inst == KVM_INST_FETCH_FAILED)
> 			return RESUME_GUEST;
> +
> +		if (kvmppc_need_byteswap(vcpu))
> +			last_inst = swab32(last_inst);
> +
> 		vcpu->arch.last_inst = last_inst;
> 	}
> 
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index dd80953..1d3ee40 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -1393,14 +1393,26 @@ fast_interrupt_c_return:
> 	lwz	r8, 0(r10)
> 	mtmsrd	r3
> 
> +	ld	r0, VCPU_MSR(r9)
> +
> +	/* r10 = vcpu->arch.msr & MSR_LE */
> +	rldicl	r10, r0, 0, 63

rldicl.?

> +	cmpdi	r10, 0
> +	bne	2f

I think it makes sense to inline that branch in here instead. Just make this

  stw r8, VCPU_LAST_INST(r9)
  beq after_inst_store
  /* Little endian instruction, swap for big endian hosts */
  addi ...
  stwbrx ...

after_inst_store:

The duplicate store shouldn't really hurt too badly, but in our "fast path" we're only doing one store anyway :). And the code becomes more readable.

> +
> 	/* Store the result */
> 	stw	r8, VCPU_LAST_INST(r9)
> 
> 	/* Unset guest mode. */
> -	li	r0, KVM_GUEST_MODE_NONE
> +1:	li	r0, KVM_GUEST_MODE_NONE
> 	stb	r0, HSTATE_IN_GUEST(r13)
> 	b	guest_exit_cont
> 
> +	/* Swap and store the result */
> +2:	addi	r11, r9, VCPU_LAST_INST
> +	stwbrx	r8, 0, r11
> +	b	1b
> +
> /*
>  * Similarly for an HISI, reflect it to the guest as an ISI unless
>  * it is an HPTE not found fault for a page that we have paged out.
> diff --git a/arch/powerpc/kvm/book3s_segment.S b/arch/powerpc/kvm/book3s_segment.S
> index 1abe478..bf20b45 100644
> --- a/arch/powerpc/kvm/book3s_segment.S
> +++ b/arch/powerpc/kvm/book3s_segment.S
> @@ -287,7 +287,19 @@ ld_last_inst:
> 	sync
> 
> #endif
> -	stw	r0, SVCPU_LAST_INST(r13)
> +	ld	r8, SVCPU_SHADOW_SRR1(r13)
> +
> +	/* r10 = vcpu->arch.msr & MSR_LE */
> +	rldicl	r10, r0, 0, 63
> +	cmpdi	r10, 0
> +	beq	1f
> +
> +	/* swap and store the result */
> +	addi	r11, r13, SVCPU_LAST_INST
> +	stwbrx	r0, 0, r11
> +	b	no_ld_last_inst
> +
> +1:	stw	r0, SVCPU_LAST_INST(r13)
> 
> no_ld_last_inst:
> 
> diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
> index 751cd45..20529ca 100644
> --- a/arch/powerpc/kvm/emulate.c
> +++ b/arch/powerpc/kvm/emulate.c
> @@ -232,6 +232,7 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
> 	int sprn = get_sprn(inst);
> 	enum emulation_result emulated = EMULATE_DONE;
> 	int advance = 1;
> +	int dont_byteswap = !kvmppc_need_byteswap(vcpu);

The parameter to kvmppc_handle_load is "is_bigendian" which is also the flag that we interpret for our byte swaps later. I think we should preserve that semantic. Please call your variable "is_bigendian" and create a separate helper for that one.

When little endian host kernels come, we only need to change the way kvmppc_complete_mmio_load and kvmppc_handle_store swap things - probably according to user space endianness even.
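
To make that concrete, a rough sketch of the kind of fixup such a
change would centralize (hypothetical helper; the name is made up and
the real code open-codes the equivalent switch):

static u64 kvmppc_mmio_fixup_endian(struct kvm_vcpu *vcpu, u64 val,
				    int bytes)
{
	/* On today's BE-only hosts the flag alone decides the swap; an
	 * LE host (or userspace) would change this condition. */
	if (vcpu->arch.mmio_is_bigendian)
		return val;

	switch (bytes) {
	case 8: return swab64(val);
	case 4: return swab32(val);
	case 2: return swab16(val);
	default: return val;	/* a single byte needs no swap */
	}
}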


Alex


* Re: [RFC PATCH] KVM: PPC: Book3S: MMIO emulation support for little endian guests
  2013-10-03 11:03 [RFC PATCH] KVM: PPC: Book3S: MMIO emulation support for little endian guests Cédric Le Goater
  2013-10-04 12:50 ` Alexander Graf
@ 2013-10-04 13:48 ` Aneesh Kumar K.V
  2013-10-07 14:26   ` Cedric Le Goater
  1 sibling, 1 reply; 6+ messages in thread
From: Aneesh Kumar K.V @ 2013-10-04 13:48 UTC (permalink / raw)
  To: Cédric Le Goater, agraf, paulus; +Cc: kvm-ppc, kvm, Cédric Le Goater

Cédric Le Goater <clg@fr.ibm.com> writes:

> MMIO emulation reads the last instruction executed by the guest
> and then emulates it. If the guest is running in Little Endian mode,
> the instruction needs to be byte-swapped before being emulated.
>
> This patch stores the last instruction in the endian order of the
> host, byte-swapping it first if needed. The common code which
> fetches last_inst uses a helper routine, kvmppc_need_byteswap(),
> and the exit paths for the Book3S PR and HV guests use their own
> version in assembly.
>
> kvmppc_emulate_instruction() also uses kvmppc_need_byteswap() to
> decide in which endian order the MMIO needs to be done.
>
> The patch is based on Alex Graf's kvm-ppc-queue branch and it
> has been tested on Big Endian and Little Endian HV guests and
> Big Endian PR guests.
>
> Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
> ---
>
> Here are some comments/questions:
>
>  * the host is assumed to be running in Big Endian. When Little Endian
>    hosts are supported in the future, we will use the CPU features to
>    fix kvmppc_need_byteswap()
>
>  * the 'is_bigendian' parameter of the routines kvmppc_handle_load()
>    and kvmppc_handle_store() seems redundant, but the *BRX opcodes
>    make it unclear whether removing it would be an improvement. We
>    could eventually rename the parameter to 'byteswap' and the
>    attribute vcpu->arch.mmio_is_bigendian to
>    vcpu->arch.mmio_need_byteswap. Anyhow, the current naming sucks
>    and I would be happy to have some directions to fix it.
>
>
>
>  arch/powerpc/include/asm/kvm_book3s.h   |   15 ++++++-
>  arch/powerpc/kvm/book3s_64_mmu_hv.c     |    4 ++
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |   14 +++++-
>  arch/powerpc/kvm/book3s_segment.S       |   14 +++++-
>  arch/powerpc/kvm/emulate.c              |   71 +++++++++++++++++--------------
>  5 files changed, 83 insertions(+), 35 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
> index 0ec00f4..36c5573 100644
> --- a/arch/powerpc/include/asm/kvm_book3s.h
> +++ b/arch/powerpc/include/asm/kvm_book3s.h
> @@ -270,14 +270,22 @@ static inline ulong kvmppc_get_pc(struct kvm_vcpu *vcpu)
>  	return vcpu->arch.pc;
>  }
>  
> +static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
> +{
> +	return vcpu->arch.shared->msr & MSR_LE;
> +}
> +

Maybe kvmppc_need_instbyteswap()? Because for data it also depends on
the SLE bit. Don't we also need to check the host platform endianness
here, i.e. whether the host is __BIG_ENDIAN__?
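
For what it's worth, a sketch of how the helper might look once host
endianness has to be taken into account (hypothetical; the patch
deliberately hard-codes the Big Endian host case):

static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
{
	/* Swap whenever guest and host endianness differ. */
#ifdef __BIG_ENDIAN__
	return vcpu->arch.shared->msr & MSR_LE;
#else
	return !(vcpu->arch.shared->msr & MSR_LE);
#endif
}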

>  static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
>  {
>  	ulong pc = kvmppc_get_pc(vcpu);
>  
>  	/* Load the instruction manually if it failed to do so in the
>  	 * exit path */
> -	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
> +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
>  		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
> +		if (kvmppc_need_byteswap(vcpu))
> +			vcpu->arch.last_inst = swab32(vcpu->arch.last_inst);
> +	}
>  
>  	return vcpu->arch.last_inst;
>  }
> @@ -293,8 +301,11 @@ static inline u32 kvmppc_get_last_sc(struct kvm_vcpu *vcpu)
>  
>  	/* Load the instruction manually if it failed to do so in the
>  	 * exit path */
> -	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
> +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
>  		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
> +		if (kvmppc_need_byteswap(vcpu))
> +			vcpu->arch.last_inst = swab32(vcpu->arch.last_inst);
> +	}
>  
>  	return vcpu->arch.last_inst;
>  }
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> index 3a89b85..28130c7 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> @@ -547,6 +547,10 @@ static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
>  		ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
>  		if (ret != EMULATE_DONE || last_inst == KVM_INST_FETCH_FAILED)
>  			return RESUME_GUEST;
> +
> +		if (kvmppc_need_byteswap(vcpu))
> +			last_inst = swab32(last_inst);
> +
>  		vcpu->arch.last_inst = last_inst;
>  	}
>  
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index dd80953..1d3ee40 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -1393,14 +1393,26 @@ fast_interrupt_c_return:
>  	lwz	r8, 0(r10)
>  	mtmsrd	r3
>  
> +	ld	r0, VCPU_MSR(r9)
> +
> +	/* r10 = vcpu->arch.msr & MSR_LE */
> +	rldicl	r10, r0, 0, 63
> +	cmpdi	r10, 0
> +	bne	2f
> +
>  	/* Store the result */
>  	stw	r8, VCPU_LAST_INST(r9)
>  
>  	/* Unset guest mode. */
> -	li	r0, KVM_GUEST_MODE_NONE
> +1:	li	r0, KVM_GUEST_MODE_NONE
>  	stb	r0, HSTATE_IN_GUEST(r13)
>  	b	guest_exit_cont
>  
> +	/* Swap and store the result */
> +2:	addi	r11, r9, VCPU_LAST_INST
> +	stwbrx	r8, 0, r11
> +	b	1b
> +
>  /*
>   * Similarly for an HISI, reflect it to the guest as an ISI unless
>   * it is an HPTE not found fault for a page that we have paged out.
> diff --git a/arch/powerpc/kvm/book3s_segment.S b/arch/powerpc/kvm/book3s_segment.S
> index 1abe478..bf20b45 100644
> --- a/arch/powerpc/kvm/book3s_segment.S
> +++ b/arch/powerpc/kvm/book3s_segment.S
> @@ -287,7 +287,19 @@ ld_last_inst:
>  	sync
>  
>  #endif
> -	stw	r0, SVCPU_LAST_INST(r13)
> +	ld	r8, SVCPU_SHADOW_SRR1(r13)
> +
> +	/* r10 = vcpu->arch.msr & MSR_LE */
> +	rldicl	r10, r0, 0, 63

That should be:
	rldicl	r10, r8, 0, 63

-aneesh



* Re: [RFC PATCH] KVM: PPC: Book3S: MMIO emulation support for little endian guests
  2013-10-04 12:50 ` Alexander Graf
@ 2013-10-07 14:23   ` Cedric Le Goater
  2013-10-07 14:52     ` Alexander Graf
  0 siblings, 1 reply; 6+ messages in thread
From: Cedric Le Goater @ 2013-10-07 14:23 UTC (permalink / raw)
  To: Alexander Graf; +Cc: paulus, kvm-ppc, kvm

Hi Alex,

On 10/04/2013 02:50 PM, Alexander Graf wrote:
> 
> On 03.10.2013, at 13:03, Cédric Le Goater wrote:
> 
>> MMIO emulation reads the last instruction executed by the guest
>> and then emulates it. If the guest is running in Little Endian mode,
>> the instruction needs to be byte-swapped before being emulated.
>>
>> This patch stores the last instruction in the endian order of the
>> host, byte-swapping it first if needed. The common code which
>> fetches last_inst uses a helper routine, kvmppc_need_byteswap(),
>> and the exit paths for the Book3S PR and HV guests use their own
>> version in assembly.
>>
>> kvmppc_emulate_instruction() also uses kvmppc_need_byteswap() to
>> decide in which endian order the MMIO needs to be done.
>>
>> The patch is based on Alex Graf's kvm-ppc-queue branch and it
>> has been tested on Big Endian and Little Endian HV guests and
>> Big Endian PR guests.
>>
>> Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
>> ---
>>
>> Here are some comments/questions:
>>
>> * the host is assumed to be running in Big Endian. When Little Endian
>>   hosts are supported in the future, we will use the CPU features to
>>   fix kvmppc_need_byteswap()
>>
>> * the 'is_bigendian' parameter of the routines kvmppc_handle_load()
>>   and kvmppc_handle_store() seems redundant, but the *BRX opcodes
>>   make it unclear whether removing it would be an improvement. We
>>   could eventually rename the parameter to 'byteswap' and the
>>   attribute vcpu->arch.mmio_is_bigendian to
>>   vcpu->arch.mmio_need_byteswap. Anyhow, the current naming sucks
>>   and I would be happy to have some directions to fix it.
>>
>>
>>
>> arch/powerpc/include/asm/kvm_book3s.h   |   15 ++++++-
>> arch/powerpc/kvm/book3s_64_mmu_hv.c     |    4 ++
>> arch/powerpc/kvm/book3s_hv_rmhandlers.S |   14 +++++-
>> arch/powerpc/kvm/book3s_segment.S       |   14 +++++-
>> arch/powerpc/kvm/emulate.c              |   71 +++++++++++++++++--------------
>> 5 files changed, 83 insertions(+), 35 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
>> index 0ec00f4..36c5573 100644
>> --- a/arch/powerpc/include/asm/kvm_book3s.h
>> +++ b/arch/powerpc/include/asm/kvm_book3s.h
>> @@ -270,14 +270,22 @@ static inline ulong kvmppc_get_pc(struct kvm_vcpu *vcpu)
>> 	return vcpu->arch.pc;
>> }
>>
>> +static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
>> +{
>> +	return vcpu->arch.shared->msr & MSR_LE;
>> +}
>> +
>> static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
>> {
>> 	ulong pc = kvmppc_get_pc(vcpu);
>>
>> 	/* Load the instruction manually if it failed to do so in the
>> 	 * exit path */
>> -	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
>> +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
>> 		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
>> +		if (kvmppc_need_byteswap(vcpu))
>> +			vcpu->arch.last_inst = swab32(vcpu->arch.last_inst);
> 
> Could you please introduce a new helper to load 32bit numbers? Something like kvmppc_ldl or kvmppc_ld32. That'll be easier to read here then :).

ok. I did something in that spirit in the next patchset I am about to send. I will
respin if needed, but there is one fuzzy area though: kvmppc_read_inst().

It calls kvmppc_get_last_inst() and then kvmppc_ld() again. Is that actually useful?

>> +	}
>>
>> 	return vcpu->arch.last_inst;
>> }
>> @@ -293,8 +301,11 @@ static inline u32 kvmppc_get_last_sc(struct kvm_vcpu *vcpu)
>>
>> 	/* Load the instruction manually if it failed to do so in the
>> 	 * exit path */
>> -	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
>> +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
>> 		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
>> +		if (kvmppc_need_byteswap(vcpu))
>> +			vcpu->arch.last_inst = swab32(vcpu->arch.last_inst);
>> +	}
>>
>> 	return vcpu->arch.last_inst;
>> }
>> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> index 3a89b85..28130c7 100644
>> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> @@ -547,6 +547,10 @@ static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
>> 		ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
>> 		if (ret != EMULATE_DONE || last_inst == KVM_INST_FETCH_FAILED)
>> 			return RESUME_GUEST;
>> +
>> +		if (kvmppc_need_byteswap(vcpu))
>> +			last_inst = swab32(last_inst);
>> +
>> 		vcpu->arch.last_inst = last_inst;
>> 	}
>>
>> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> index dd80953..1d3ee40 100644
>> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> @@ -1393,14 +1393,26 @@ fast_interrupt_c_return:
>> 	lwz	r8, 0(r10)
>> 	mtmsrd	r3
>>
>> +	ld	r0, VCPU_MSR(r9)
>> +
>> +	/* r10 = vcpu->arch.msr & MSR_LE */
>> +	rldicl	r10, r0, 0, 63
> 
> rldicl.?

sure.

>> +	cmpdi	r10, 0
>> +	bne	2f
> 
> I think it makes sense to inline that branch in here instead. Just make this
> 
>   stw r8, VCPU_LAST_INST(r9)
>   beq after_inst_store
>   /* Little endian instruction, swap for big endian hosts */
>   addi ...
>   stwbrx ...
> 
> after_inst_store:
> 
> The duplicate store shouldn't really hurt too badly, but in our "fast path" we're only doing one store anyway :). And the code becomes more readable.

It is indeed more readable. I have changed that.

>> +
>> 	/* Store the result */
>> 	stw	r8, VCPU_LAST_INST(r9)
>>
>> 	/* Unset guest mode. */
>> -	li	r0, KVM_GUEST_MODE_NONE
>> +1:	li	r0, KVM_GUEST_MODE_NONE
>> 	stb	r0, HSTATE_IN_GUEST(r13)
>> 	b	guest_exit_cont
>>
>> +	/* Swap and store the result */
>> +2:	addi	r11, r9, VCPU_LAST_INST
>> +	stwbrx	r8, 0, r11
>> +	b	1b
>> +
>> /*
>>  * Similarly for an HISI, reflect it to the guest as an ISI unless
>>  * it is an HPTE not found fault for a page that we have paged out.
>> diff --git a/arch/powerpc/kvm/book3s_segment.S b/arch/powerpc/kvm/book3s_segment.S
>> index 1abe478..bf20b45 100644
>> --- a/arch/powerpc/kvm/book3s_segment.S
>> +++ b/arch/powerpc/kvm/book3s_segment.S
>> @@ -287,7 +287,19 @@ ld_last_inst:
>> 	sync
>>
>> #endif
>> -	stw	r0, SVCPU_LAST_INST(r13)
>> +	ld	r8, SVCPU_SHADOW_SRR1(r13)
>> +
>> +	/* r10 = vcpu->arch.msr & MSR_LE */
>> +	rldicl	r10, r0, 0, 63
>> +	cmpdi	r10, 0
>> +	beq	1f
>> +
>> +	/* swap and store the result */
>> +	addi	r11, r13, SVCPU_LAST_INST
>> +	stwbrx	r0, 0, r11
>> +	b	no_ld_last_inst
>> +
>> +1:	stw	r0, SVCPU_LAST_INST(r13)
>>
>> no_ld_last_inst:
>>
>> diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
>> index 751cd45..20529ca 100644
>> --- a/arch/powerpc/kvm/emulate.c
>> +++ b/arch/powerpc/kvm/emulate.c
>> @@ -232,6 +232,7 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
>> 	int sprn = get_sprn(inst);
>> 	enum emulation_result emulated = EMULATE_DONE;
>> 	int advance = 1;
>> +	int dont_byteswap = !kvmppc_need_byteswap(vcpu);
> 
> The parameter to kvmppc_handle_load is "is_bigendian" which is also the flag that we interpret 
> for our byte swaps later. I think we should preserve that semantic. Please call your variable 
> "is_bigendian" and create a separate helper for that one.
> 
> When little endian host kernels come, we only need to change the way kvmppc_complete_mmio_load 
> and kvmppc_handle_store swap things - probably according to user space endianness even.

ok.

Thanks for the review Alex, I will be sending a new patchset shortly.

C.



* Re: [RFC PATCH] KVM: PPC: Book3S: MMIO emulation support for little endian guests
  2013-10-04 13:48 ` Aneesh Kumar K.V
@ 2013-10-07 14:26   ` Cedric Le Goater
  0 siblings, 0 replies; 6+ messages in thread
From: Cedric Le Goater @ 2013-10-07 14:26 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: agraf, paulus, kvm-ppc, kvm

On 10/04/2013 03:48 PM, Aneesh Kumar K.V wrote:
> Cédric Le Goater <clg@fr.ibm.com> writes:
> 
>> MMIO emulation reads the last instruction executed by the guest
>> and then emulates it. If the guest is running in Little Endian mode,
>> the instruction needs to be byte-swapped before being emulated.
>>
>> This patch stores the last instruction in the endian order of the
>> host, byte-swapping it first if needed. The common code which
>> fetches last_inst uses a helper routine, kvmppc_need_byteswap(),
>> and the exit paths for the Book3S PR and HV guests use their own
>> version in assembly.
>>
>> kvmppc_emulate_instruction() also uses kvmppc_need_byteswap() to
>> decide in which endian order the MMIO needs to be done.
>>
>> The patch is based on Alex Graf's kvm-ppc-queue branch and it
>> has been tested on Big Endian and Little Endian HV guests and
>> Big Endian PR guests.
>>
>> Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
>> ---
>>
>> Here are some comments/questions:
>>
>>  * the host is assumed to be running in Big Endian. When Little Endian
>>    hosts are supported in the future, we will use the CPU features to
>>    fix kvmppc_need_byteswap()
>>
>>  * the 'is_bigendian' parameter of the routines kvmppc_handle_load()
>>    and kvmppc_handle_store() seems redundant, but the *BRX opcodes
>>    make it unclear whether removing it would be an improvement. We
>>    could eventually rename the parameter to 'byteswap' and the
>>    attribute vcpu->arch.mmio_is_bigendian to
>>    vcpu->arch.mmio_need_byteswap. Anyhow, the current naming sucks
>>    and I would be happy to have some directions to fix it.
>>
>>
>>
>>  arch/powerpc/include/asm/kvm_book3s.h   |   15 ++++++-
>>  arch/powerpc/kvm/book3s_64_mmu_hv.c     |    4 ++
>>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |   14 +++++-
>>  arch/powerpc/kvm/book3s_segment.S       |   14 +++++-
>>  arch/powerpc/kvm/emulate.c              |   71 +++++++++++++++++--------------
>>  5 files changed, 83 insertions(+), 35 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
>> index 0ec00f4..36c5573 100644
>> --- a/arch/powerpc/include/asm/kvm_book3s.h
>> +++ b/arch/powerpc/include/asm/kvm_book3s.h
>> @@ -270,14 +270,22 @@ static inline ulong kvmppc_get_pc(struct kvm_vcpu *vcpu)
>>  	return vcpu->arch.pc;
>>  }
>>  
>> +static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
>> +{
>> +	return vcpu->arch.shared->msr & MSR_LE;
>> +}
>> +
> 
> Maybe kvmppc_need_instbyteswap()? Because for data it also depends on
> the SLE bit. Don't we also need to check the host platform endianness
> here, i.e. whether the host is __BIG_ENDIAN__?

I think we will wait for the host to become Little Endian before adding
more complexity. 

>>  static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
>>  {
>>  	ulong pc = kvmppc_get_pc(vcpu);
>>  
>>  	/* Load the instruction manually if it failed to do so in the
>>  	 * exit path */
>> -	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
>> +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
>>  		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
>> +		if (kvmppc_need_byteswap(vcpu))
>> +			vcpu->arch.last_inst = swab32(vcpu->arch.last_inst);
>> +	}
>>  
>>  	return vcpu->arch.last_inst;
>>  }
>> @@ -293,8 +301,11 @@ static inline u32 kvmppc_get_last_sc(struct kvm_vcpu *vcpu)
>>  
>>  	/* Load the instruction manually if it failed to do so in the
>>  	 * exit path */
>> -	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
>> +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
>>  		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
>> +		if (kvmppc_need_byteswap(vcpu))
>> +			vcpu->arch.last_inst = swab32(vcpu->arch.last_inst);
>> +	}
>>  
>>  	return vcpu->arch.last_inst;
>>  }
>> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> index 3a89b85..28130c7 100644
>> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> @@ -547,6 +547,10 @@ static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>  		ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
>>  		if (ret != EMULATE_DONE || last_inst == KVM_INST_FETCH_FAILED)
>>  			return RESUME_GUEST;
>> +
>> +		if (kvmppc_need_byteswap(vcpu))
>> +			last_inst = swab32(last_inst);
>> +
>>  		vcpu->arch.last_inst = last_inst;
>>  	}
>>  
>> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> index dd80953..1d3ee40 100644
>> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> @@ -1393,14 +1393,26 @@ fast_interrupt_c_return:
>>  	lwz	r8, 0(r10)
>>  	mtmsrd	r3
>>  
>> +	ld	r0, VCPU_MSR(r9)
>> +
>> +	/* r10 = vcpu->arch.msr & MSR_LE */
>> +	rldicl	r10, r0, 0, 63
>> +	cmpdi	r10, 0
>> +	bne	2f
>> +
>>  	/* Store the result */
>>  	stw	r8, VCPU_LAST_INST(r9)
>>  
>>  	/* Unset guest mode. */
>> -	li	r0, KVM_GUEST_MODE_NONE
>> +1:	li	r0, KVM_GUEST_MODE_NONE
>>  	stb	r0, HSTATE_IN_GUEST(r13)
>>  	b	guest_exit_cont
>>  
>> +	/* Swap and store the result */
>> +2:	addi	r11, r9, VCPU_LAST_INST
>> +	stwbrx	r8, 0, r11
>> +	b	1b
>> +
>>  /*
>>   * Similarly for an HISI, reflect it to the guest as an ISI unless
>>   * it is an HPTE not found fault for a page that we have paged out.
>> diff --git a/arch/powerpc/kvm/book3s_segment.S b/arch/powerpc/kvm/book3s_segment.S
>> index 1abe478..bf20b45 100644
>> --- a/arch/powerpc/kvm/book3s_segment.S
>> +++ b/arch/powerpc/kvm/book3s_segment.S
>> @@ -287,7 +287,19 @@ ld_last_inst:
>>  	sync
>>  
>>  #endif
>> -	stw	r0, SVCPU_LAST_INST(r13)
>> +	ld	r8, SVCPU_SHADOW_SRR1(r13)
>> +
>> +	/* r10 = vcpu->arch.msr & MSR_LE */
>> +	rldicl	r10, r0, 0, 63
> 
> That should be:
> 	rldicl	r10, r8, 0, 63

Oops. Good catch.

Thanks for the review Aneesh.

C.


* Re: [RFC PATCH] KVM: PPC: Book3S: MMIO emulation support for little endian guests
  2013-10-07 14:23   ` Cedric Le Goater
@ 2013-10-07 14:52     ` Alexander Graf
  0 siblings, 0 replies; 6+ messages in thread
From: Alexander Graf @ 2013-10-07 14:52 UTC (permalink / raw)
  To: Cedric Le Goater
  Cc: Paul Mackerras, kvm-ppc, kvm@vger.kernel.org mailing list


On 07.10.2013, at 16:23, Cedric Le Goater <clg@fr.ibm.com> wrote:

> Hi Alex,
> 
> On 10/04/2013 02:50 PM, Alexander Graf wrote:
>> 
>> On 03.10.2013, at 13:03, Cédric Le Goater wrote:
>> 
>>> MMIO emulation reads the last instruction executed by the guest
>>> and then emulates it. If the guest is running in Little Endian mode,
>>> the instruction needs to be byte-swapped before being emulated.
>>>
>>> This patch stores the last instruction in the endian order of the
>>> host, byte-swapping it first if needed. The common code which
>>> fetches last_inst uses a helper routine, kvmppc_need_byteswap(),
>>> and the exit paths for the Book3S PR and HV guests use their own
>>> version in assembly.
>>>
>>> kvmppc_emulate_instruction() also uses kvmppc_need_byteswap() to
>>> decide in which endian order the MMIO needs to be done.
>>> 
>>> The patch is based on Alex Graf's kvm-ppc-queue branch and it
>>> has been tested on Big Endian and Little Endian HV guests and
>>> Big Endian PR guests.
>>> 
>>> Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
>>> ---
>>> 
>>> Here are some comments/questions:
>>>
>>> * the host is assumed to be running in Big Endian. When Little Endian
>>>  hosts are supported in the future, we will use the CPU features to
>>>  fix kvmppc_need_byteswap()
>>>
>>> * the 'is_bigendian' parameter of the routines kvmppc_handle_load()
>>>  and kvmppc_handle_store() seems redundant, but the *BRX opcodes
>>>  make it unclear whether removing it would be an improvement. We
>>>  could eventually rename the parameter to 'byteswap' and the
>>>  attribute vcpu->arch.mmio_is_bigendian to
>>>  vcpu->arch.mmio_need_byteswap. Anyhow, the current naming sucks
>>>  and I would be happy to have some directions to fix it.
>>> 
>>> 
>>> 
>>> arch/powerpc/include/asm/kvm_book3s.h   |   15 ++++++-
>>> arch/powerpc/kvm/book3s_64_mmu_hv.c     |    4 ++
>>> arch/powerpc/kvm/book3s_hv_rmhandlers.S |   14 +++++-
>>> arch/powerpc/kvm/book3s_segment.S       |   14 +++++-
>>> arch/powerpc/kvm/emulate.c              |   71 +++++++++++++++++--------------
>>> 5 files changed, 83 insertions(+), 35 deletions(-)
>>> 
>>> diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
>>> index 0ec00f4..36c5573 100644
>>> --- a/arch/powerpc/include/asm/kvm_book3s.h
>>> +++ b/arch/powerpc/include/asm/kvm_book3s.h
>>> @@ -270,14 +270,22 @@ static inline ulong kvmppc_get_pc(struct kvm_vcpu *vcpu)
>>> 	return vcpu->arch.pc;
>>> }
>>> 
>>> +static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
>>> +{
>>> +	return vcpu->arch.shared->msr & MSR_LE;
>>> +}
>>> +
>>> static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
>>> {
>>> 	ulong pc = kvmppc_get_pc(vcpu);
>>> 
>>> 	/* Load the instruction manually if it failed to do so in the
>>> 	 * exit path */
>>> -	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
>>> +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
>>> 		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
>>> +		if (kvmppc_need_byteswap(vcpu))
>>> +			vcpu->arch.last_inst = swab32(vcpu->arch.last_inst);
>> 
>> Could you please introduce a new helper to load 32bit numbers? Something like kvmppc_ldl or kvmppc_ld32. That'll be easier to read here then :).
> 
> ok. I did something in that spirit in the next patchset I am about to send. I will
> respin if needed but there is one fuzzy area though : kvmppc_read_inst().  
> 
> It calls kvmppc_get_last_inst() and then again kvmppc_ld(). Is that actually useful ? 

We can only assume that the contents of vcpu->arch.last_inst are valid (which is what kvmppc_get_last_inst relies on) when we hit one of these interrupts:

        /* We only load the last instruction when it's safe */
        cmpwi   r12, BOOK3S_INTERRUPT_DATA_STORAGE
        beq     ld_last_inst
        cmpwi   r12, BOOK3S_INTERRUPT_PROGRAM
        beq     ld_last_inst
        cmpwi   r12, BOOK3S_INTERRUPT_SYSCALL
        beq     ld_last_prev_inst
        cmpwi   r12, BOOK3S_INTERRUPT_ALIGNMENT
        beq-    ld_last_inst
#ifdef CONFIG_PPC64
BEGIN_FTR_SECTION
        cmpwi   r12, BOOK3S_INTERRUPT_H_EMUL_ASSIST
        beq-    ld_last_inst
END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
#endif

        b       no_ld_last_inst

Outside of these interrupt handlers, we have to ensure that we manually load the instruction and, if that fails, inject an interrupt into the guest to indicate that we couldn't load it.

I have to admit that the code flow is slightly confusing here. If you have suggestions how to improve it, I'm more than happy to see patches :).
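
For reference, the flow in question looks roughly like this (a
simplified sketch of kvmppc_read_inst() with the SRR1/MSR fixup on the
failure path elided):

static int kvmppc_read_inst(struct kvm_vcpu *vcpu)
{
	ulong srr0 = kvmppc_get_pc(vcpu);
	u32 last_inst = kvmppc_get_last_inst(vcpu);
	int ret;

	/* Retry the load by hand; outside the "safe" exits above,
	 * vcpu->arch.last_inst may not have been filled in. */
	ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
	if (ret == -ENOENT) {
		/* Could not map the guest PC: reflect an ISI so the
		 * guest can handle the fault and retry. */
		kvmppc_book3s_queue_irqprio(vcpu,
					    BOOK3S_INTERRUPT_INST_STORAGE);
		return EMULATE_AGAIN;
	}

	return EMULATE_DONE;
}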


Alex

