linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH 0/3] KVM: PPC: Book3S: 64-bit CONFIG_RELOCATABLE fixes
@ 2016-12-01  7:18 Nicholas Piggin
  2016-12-01  7:18 ` [PATCH 1/3] KVM: PPC: Book3S: Change interrupt call to reduce scratch space use on HV Nicholas Piggin
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Nicholas Piggin @ 2016-12-01  7:18 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Nicholas Piggin, Alexander Graf, kvm-ppc, Michael Ellerman,
	linuxppc-dev

Hi,

I didn't get any objections to the approach proposed in my
earlier RFC, so I've gone ahead with the R12 = (CR | trap #) approach.
It avoids an extra register save in the HV handler, and the PR handler
ended up not being too bad.
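
For reference, the packing puts the trap number in the high half of r12
and the guest CR in the low half; roughly, as a C sketch (illustrative
helper names, not code from the patches, and assuming a 64-bit
unsigned long):

	static inline unsigned long pack_r12(unsigned int trap,
					     unsigned int guest_cr)
	{
		return ((unsigned long)trap << 32) | guest_cr;
	}

	static inline unsigned int r12_cr(unsigned long r12)
	{
		return (unsigned int)r12;		/* guest CR, low half     */
	}

	static inline unsigned int r12_trap(unsigned long r12)
	{
		return (unsigned int)(r12 >> 32);	/* trap number, high half */
	}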

This passed KVM boot testing with 64-bit HV and PR, with the (host)
kernel running at a non-zero physical address. Without these patches,
the same configuration crashes immediately.

Thanks,
Nick

Nicholas Piggin (3):
  KVM: PPC: Book3S: Change interrupt call to reduce scratch space use on
    HV
  KVM: PPC: Book3S: Move 64-bit KVM interrupt handler out from alt
    section
  KVM: PPC: Book3S: 64-bit CONFIG_RELOCATABLE support for interrupts

 arch/powerpc/include/asm/exception-64s.h | 67 +++++++++++++++++++++++++-------
 arch/powerpc/include/asm/head-64.h       |  2 +-
 arch/powerpc/kernel/exceptions-64s.S     | 10 ++---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  | 19 +++++----
 arch/powerpc/kvm/book3s_segment.S        | 32 +++++++++++----
 5 files changed, 94 insertions(+), 36 deletions(-)

-- 
2.10.2


* [PATCH 1/3] KVM: PPC: Book3S: Change interrupt call to reduce scratch space use on HV
  2016-12-01  7:18 [PATCH 0/3] KVM: PPC: Book3S: 64-bit CONFIG_RELOCATABLE fixes Nicholas Piggin
@ 2016-12-01  7:18 ` Nicholas Piggin
  2016-12-06  6:09   ` Paul Mackerras
  2016-12-01  7:18 ` [PATCH 2/3] KVM: PPC: Book3S: Move 64-bit KVM interrupt handler out from alt section Nicholas Piggin
  2016-12-01  7:18 ` [PATCH 3/3] KVM: PPC: Book3S: 64-bit CONFIG_RELOCATABLE support for interrupts Nicholas Piggin
  2 siblings, 1 reply; 8+ messages in thread
From: Nicholas Piggin @ 2016-12-01  7:18 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Nicholas Piggin, Alexander Graf, kvm-ppc, Michael Ellerman,
	linuxppc-dev

Change the calling convention to put the trap number together with
CR in the two halves of r12, which frees up HSTATE_SCRATCH2 in the HV
handler and leaves r9 free.

For simplicity, the 64-bit PR handler entry translates back to the
previous calling convention (i.e., the one shared with 32-bit).
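
Roughly, the translation done at the 64-bit PR entry amounts to the
following (a C sketch with illustrative names, not the patch's asm;
the guest r9 reload is omitted):

	static inline void pr_entry_translate(unsigned long *r12,
					      unsigned int *hstate_scratch1)
	{
		*hstate_scratch1 = (unsigned int)*r12;	/* guest CR, from the low half */
		*r12 >>= 32;				/* keep only the trap number   */
	}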

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/exception-64s.h | 28 +++++++++++++++-------------
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  | 15 +++++++--------
 arch/powerpc/kvm/book3s_segment.S        | 27 ++++++++++++++++++++-------
 3 files changed, 42 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 9a3eee6..bc8fc45 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -233,7 +233,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 
 #endif
 
-#define __KVM_HANDLER_PROLOG(area, n)					\
+#define __KVM_HANDLER(area, h, n)					\
 	BEGIN_FTR_SECTION_NESTED(947)					\
 	ld	r10,area+EX_CFAR(r13);					\
 	std	r10,HSTATE_CFAR(r13);					\
@@ -243,30 +243,32 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 	std	r10,HSTATE_PPR(r13);					\
 	END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948);	\
 	ld	r10,area+EX_R10(r13);					\
-	stw	r9,HSTATE_SCRATCH1(r13);				\
-	ld	r9,area+EX_R9(r13);					\
 	std	r12,HSTATE_SCRATCH0(r13);				\
-
-#define __KVM_HANDLER(area, h, n)					\
-	__KVM_HANDLER_PROLOG(area, n)					\
-	li	r12,n;							\
+	li	r12,(n);						\
+	sldi	r12,r12,32;						\
+	or	r12,r12,r9;						\
+	ld	r9,area+EX_R9(r13);					\
+	std	r9,HSTATE_SCRATCH1(r13);				\
 	b	kvmppc_interrupt
 
 #define __KVM_HANDLER_SKIP(area, h, n)					\
 	cmpwi	r10,KVM_GUEST_MODE_SKIP;				\
-	ld	r10,area+EX_R10(r13);					\
 	beq	89f;							\
-	stw	r9,HSTATE_SCRATCH1(r13);				\
 	BEGIN_FTR_SECTION_NESTED(948)					\
-	ld	r9,area+EX_PPR(r13);					\
-	std	r9,HSTATE_PPR(r13);					\
+	ld	r10,area+EX_PPR(r13);					\
+	std	r10,HSTATE_PPR(r13);					\
 	END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948);	\
-	ld	r9,area+EX_R9(r13);					\
+	ld	r10,area+EX_R10(r13);					\
 	std	r12,HSTATE_SCRATCH0(r13);				\
-	li	r12,n;							\
+	li	r12,(n);						\
+	sldi	r12,r12,32;						\
+	or	r12,r12,r9;						\
+	ld	r9,area+EX_R9(r13);					\
+	std	r9,HSTATE_SCRATCH1(r13);				\
 	b	kvmppc_interrupt;					\
 89:	mtocrf	0x80,r9;						\
 	ld	r9,area+EX_R9(r13);					\
+	ld	r10,area+EX_R10(r13);					\
 	b	kvmppc_skip_##h##interrupt
 
 #ifdef CONFIG_KVM_BOOK3S_64_HANDLER
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index c3c1d1b..0536c73 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1043,19 +1043,18 @@ hdec_soon:
 kvmppc_interrupt_hv:
 	/*
 	 * Register contents:
-	 * R12		= interrupt vector
+	 * R12		= (interrupt vector << 32) | guest CR
 	 * R13		= PACA
-	 * guest CR, R12 saved in shadow VCPU SCRATCH1/0
+	 * R9		= unused
+	 * guest R12, R9 saved in shadow VCPU SCRATCH0/1 respectively
 	 * guest R13 saved in SPRN_SCRATCH0
 	 */
-	std	r9, HSTATE_SCRATCH2(r13)
-
 	lbz	r9, HSTATE_IN_GUEST(r13)
 	cmpwi	r9, KVM_GUEST_MODE_HOST_HV
 	beq	kvmppc_bad_host_intr
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
 	cmpwi	r9, KVM_GUEST_MODE_GUEST
-	ld	r9, HSTATE_SCRATCH2(r13)
+	ld	r9, HSTATE_SCRATCH1(r13)
 	beq	kvmppc_interrupt_pr
 #endif
 	/* We're now back in the host but in guest MMU context */
@@ -1075,14 +1074,13 @@ kvmppc_interrupt_hv:
 	std	r6, VCPU_GPR(R6)(r9)
 	std	r7, VCPU_GPR(R7)(r9)
 	std	r8, VCPU_GPR(R8)(r9)
-	ld	r0, HSTATE_SCRATCH2(r13)
+	ld	r0, HSTATE_SCRATCH1(r13)
 	std	r0, VCPU_GPR(R9)(r9)
 	std	r10, VCPU_GPR(R10)(r9)
 	std	r11, VCPU_GPR(R11)(r9)
 	ld	r3, HSTATE_SCRATCH0(r13)
-	lwz	r4, HSTATE_SCRATCH1(r13)
 	std	r3, VCPU_GPR(R12)(r9)
-	stw	r4, VCPU_CR(r9)
+	stw	r12, VCPU_CR(r9)	/* CR is in the low half of r12 */
 BEGIN_FTR_SECTION
 	ld	r3, HSTATE_CFAR(r13)
 	std	r3, VCPU_CFAR(r9)
@@ -1100,6 +1098,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	mfspr	r11, SPRN_SRR1
 	std	r10, VCPU_SRR0(r9)
 	std	r11, VCPU_SRR1(r9)
+	srdi	r12, r12, 32		/* trap is in the high half of r12 */
 	andi.	r0, r12, 2		/* need to read HSRR0/1? */
 	beq	1f
 	mfspr	r10, SPRN_HSRR0
diff --git a/arch/powerpc/kvm/book3s_segment.S b/arch/powerpc/kvm/book3s_segment.S
index ca8f174..3b29f0f 100644
--- a/arch/powerpc/kvm/book3s_segment.S
+++ b/arch/powerpc/kvm/book3s_segment.S
@@ -167,20 +167,33 @@ kvmppc_handler_trampoline_enter_end:
  *                                                                            *
  *****************************************************************************/
 
-.global kvmppc_handler_trampoline_exit
-kvmppc_handler_trampoline_exit:
-
 .global kvmppc_interrupt_pr
 kvmppc_interrupt_pr:
+	/* 64-bit entry. Register usage at this point:
+	 *
+	 * SPRG_SCRATCH0   = guest R13
+	 * R9              = unused
+	 * R12             = (exit handler id << 32) | guest CR
+	 * R13             = PACA
+	 * HSTATE.SCRATCH0 = guest R12
+	 * HSTATE.SCRATCH1 = guest R9
+	 */
+#ifdef CONFIG_PPC64
+	/* Match 32-bit entry */
+	ld	r9,HSTATE_SCRATCH1(r13)
+	stw	r12,HSTATE_SCRATCH1(r13) /* CR is in the low half of r12 */
+	srdi	r12, r12, 32		 /* trap is in the high half of r12 */
+#endif
 
+.global kvmppc_handler_trampoline_exit
+kvmppc_handler_trampoline_exit:
 	/* Register usage at this point:
 	 *
-	 * SPRG_SCRATCH0  = guest R13
-	 * R12            = exit handler id
-	 * R13            = shadow vcpu (32-bit) or PACA (64-bit)
+	 * SPRG_SCRATCH0   = guest R13
+	 * R12             = exit handler id
+	 * R13             = shadow vcpu (32-bit) or PACA (64-bit)
 	 * HSTATE.SCRATCH0 = guest R12
 	 * HSTATE.SCRATCH1 = guest CR
-	 *
 	 */
 
 	/* Save registers */
-- 
2.10.2


* [PATCH 2/3] KVM: PPC: Book3S: Move 64-bit KVM interrupt handler out from alt section
  2016-12-01  7:18 [PATCH 0/3] KVM: PPC: Book3S: 64-bit CONFIG_RELOCATABLE fixes Nicholas Piggin
  2016-12-01  7:18 ` [PATCH 1/3] KVM: PPC: Book3S: Change interrupt call to reduce scratch space use on HV Nicholas Piggin
@ 2016-12-01  7:18 ` Nicholas Piggin
  2016-12-01  7:18 ` [PATCH 3/3] KVM: PPC: Book3S: 64-bit CONFIG_RELOCATABLE support for interrupts Nicholas Piggin
  2 siblings, 0 replies; 8+ messages in thread
From: Nicholas Piggin @ 2016-12-01  7:18 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Nicholas Piggin, Alexander Graf, kvm-ppc, Michael Ellerman,
	linuxppc-dev

A subsequent patch to make KVM handlers relocation-safe makes them
unusable from within alt section "else" cases (due to the way fixed
addresses are taken from within fixed section head code).

Stop open-coding the KVM handlers, and add both of them the normal way. A
better fix may be to allow some level of alternate feature patching in
the exception macros themselves, but for now this will do.

The TRAMP_KVM handlers must be moved to the "virt" fixed section area
(name is arbitrary) in order to be closer to .text and avoid the dreaded
"relocation truncated to fit" error.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/exceptions-64s.S | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 1ba82ea..5faff1c 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -717,13 +717,9 @@ hardware_interrupt_hv:
 	BEGIN_FTR_SECTION
 		_MASKABLE_EXCEPTION_PSERIES(0x500, hardware_interrupt_common,
 					    EXC_HV, SOFTEN_TEST_HV)
-do_kvm_H0x500:
-		KVM_HANDLER(PACA_EXGEN, EXC_HV, 0x502)
 	FTR_SECTION_ELSE
 		_MASKABLE_EXCEPTION_PSERIES(0x500, hardware_interrupt_common,
 					    EXC_STD, SOFTEN_TEST_PR)
-do_kvm_0x500:
-		KVM_HANDLER(PACA_EXGEN, EXC_STD, 0x500)
 	ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
 EXC_REAL_END(hardware_interrupt, 0x500, 0x600)
 
@@ -737,6 +733,8 @@ hardware_interrupt_relon_hv:
 	ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
 EXC_VIRT_END(hardware_interrupt, 0x4500, 0x4600)
 
+TRAMP_KVM(PACA_EXGEN, 0x500)
+TRAMP_KVM_HV(PACA_EXGEN, 0x500)
 EXC_COMMON_ASYNC(hardware_interrupt_common, 0x500, do_IRQ)
 
 
-- 
2.10.2


* [PATCH 3/3] KVM: PPC: Book3S: 64-bit CONFIG_RELOCATABLE support for interrupts
  2016-12-01  7:18 [PATCH 0/3] KVM: PPC: Book3S: 64-bit CONFIG_RELOCATABLE fixes Nicholas Piggin
  2016-12-01  7:18 ` [PATCH 1/3] KVM: PPC: Book3S: Change interrupt call to reduce scratch space use on HV Nicholas Piggin
  2016-12-01  7:18 ` [PATCH 2/3] KVM: PPC: Book3S: Move 64-bit KVM interrupt handler out from alt section Nicholas Piggin
@ 2016-12-01  7:18 ` Nicholas Piggin
  2 siblings, 0 replies; 8+ messages in thread
From: Nicholas Piggin @ 2016-12-01  7:18 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Nicholas Piggin, Alexander Graf, kvm-ppc, Michael Ellerman,
	linuxppc-dev

64-bit Book3S exception handlers must find the dynamic kernel base
to add to the target address when branching beyond __end_interrupts,
in order to support a kernel running at a non-zero physical address.

Support this in KVM by branching via CTR, as the regular interrupt
handlers do. The guest CTR is saved in HSTATE_SCRATCH2 and restored
after the branch.

Without this, the host kernel hangs or crashes randomly when it is
running at a non-zero address and a KVM guest is started.
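
Roughly, the branch target is formed as the runtime kernel base (taken
from the PACA) plus the handler's offset within the kernel image, and
the branch is taken through CTR. A C sketch of the address computation
(illustrative names, not the patch's code):

	/* paca_kbase:  runtime kernel base, as read from PACAKBASE(r13).
	 * handler_off: the handler's offset from the start of the kernel
	 *              image (what ABS_ADDR() evaluates to at build time).
	 */
	static inline unsigned long far_handler_target(unsigned long paca_kbase,
						       unsigned long handler_off)
	{
		return paca_kbase + handler_off;	/* then mtctr; bctr */
	}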

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/exception-64s.h | 39 ++++++++++++++++++++++++++++++--
 arch/powerpc/include/asm/head-64.h       |  2 +-
 arch/powerpc/kernel/exceptions-64s.S     |  4 ++--
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  |  6 +++++
 arch/powerpc/kvm/book3s_segment.S        |  5 ++++
 5 files changed, 51 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index bc8fc45..000b317 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -97,6 +97,11 @@
 	ld	reg,PACAKBASE(r13);					\
 	ori	reg,reg,(ABS_ADDR(label))@l;
 
+#define __LOAD_FAR_HANDLER(reg, label)					\
+	ld	reg,PACAKBASE(r13);					\
+	ori	reg,reg,(ABS_ADDR(label))@l;				\
+	addis	reg,reg,(ABS_ADDR(label))@h;
+
 /* Exception register prefixes */
 #define EXC_HV	H
 #define EXC_STD
@@ -227,12 +232,42 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 	mtctr	reg;							\
 	bctr
 
+/*
+ * KVM requires >64K branches when branching from unrelocated code.
+ */
+#define BRANCH_TO_KVM_EXIT(reg, label)					\
+	mfctr	reg;							\
+	std	reg,HSTATE_SCRATCH2(r13);				\
+	__LOAD_FAR_HANDLER(reg, label);					\
+	mtctr	reg;							\
+	bctr
+
+#define BRANCH_TO_KVM(reg, label)					\
+	__LOAD_FAR_HANDLER(reg, label);					\
+	mtctr	reg;							\
+	bctr
+
+#define BRANCH_LINK_TO_KVM(reg, label)					\
+	__LOAD_FAR_HANDLER(reg, label);					\
+	mtctr	reg;							\
+	bctrl
+
 #else
 #define BRANCH_TO_COMMON(reg, label)					\
 	b	label
 
+#define BRANCH_TO_KVM(reg, label)					\
+	b	label
+
+#define BRANCH_TO_KVM_EXIT(reg, label)					\
+	b	label
+
+#define BRANCH_LINK_TO_KVM(reg, label)					\
+	bl	label
+
 #endif
 
+
 #define __KVM_HANDLER(area, h, n)					\
 	BEGIN_FTR_SECTION_NESTED(947)					\
 	ld	r10,area+EX_CFAR(r13);					\
@@ -249,7 +284,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 	or	r12,r12,r9;						\
 	ld	r9,area+EX_R9(r13);					\
 	std	r9,HSTATE_SCRATCH1(r13);				\
-	b	kvmppc_interrupt
+	BRANCH_TO_KVM_EXIT(r9, kvmppc_interrupt)
 
 #define __KVM_HANDLER_SKIP(area, h, n)					\
 	cmpwi	r10,KVM_GUEST_MODE_SKIP;				\
@@ -265,7 +300,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 	or	r12,r12,r9;						\
 	ld	r9,area+EX_R9(r13);					\
 	std	r9,HSTATE_SCRATCH1(r13);				\
-	b	kvmppc_interrupt;					\
+	BRANCH_TO_KVM_EXIT(r9, kvmppc_interrupt);			\
 89:	mtocrf	0x80,r9;						\
 	ld	r9,area+EX_R9(r13);					\
 	ld	r10,area+EX_R10(r13);					\
diff --git a/arch/powerpc/include/asm/head-64.h b/arch/powerpc/include/asm/head-64.h
index f7131cf..a5cbc1c 100644
--- a/arch/powerpc/include/asm/head-64.h
+++ b/arch/powerpc/include/asm/head-64.h
@@ -228,7 +228,7 @@ end_##sname:
 
 #ifdef CONFIG_KVM_BOOK3S_64_HANDLER
 #define TRAMP_KVM_BEGIN(name)						\
-	TRAMP_REAL_BEGIN(name)
+	TRAMP_VIRT_BEGIN(name)
 #else
 #define TRAMP_KVM_BEGIN(name)
 #endif
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 5faff1c..955fc76 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -142,7 +142,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
 	lbz	r0,HSTATE_HWTHREAD_REQ(r13)
 	cmpwi	r0,0
 	beq	1f
-	b	kvm_start_guest
+	BRANCH_TO_KVM(r10, kvm_start_guest)
 1:
 #endif
 
@@ -977,7 +977,7 @@ TRAMP_REAL_BEGIN(hmi_exception_early)
 	EXCEPTION_PROLOG_COMMON_2(PACA_EXGEN)
 	EXCEPTION_PROLOG_COMMON_3(0xe60)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
-	bl	hmi_exception_realmode
+	BRANCH_LINK_TO_KVM(r4, hmi_exception_realmode)
 	/* Windup the stack. */
 	/* Move original HSRR0 and HSRR1 into the respective regs */
 	ld	r9,_MSR(r1)
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 0536c73..1d07cea 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1047,8 +1047,14 @@ kvmppc_interrupt_hv:
 	 * R13		= PACA
 	 * R9		= unused
 	 * guest R12, R9 saved in shadow VCPU SCRATCH0/1 respectively
+	 * guest CTR saved in shadow VCPU SCRATCH2 if RELOCATABLE
 	 * guest R13 saved in SPRN_SCRATCH0
 	 */
+#ifdef CONFIG_RELOCATABLE
+	ld	r9, HSTATE_SCRATCH2(r13)
+	mtctr	r9
+#endif
+
 	lbz	r9, HSTATE_IN_GUEST(r13)
 	cmpwi	r9, KVM_GUEST_MODE_HOST_HV
 	beq	kvmppc_bad_host_intr
diff --git a/arch/powerpc/kvm/book3s_segment.S b/arch/powerpc/kvm/book3s_segment.S
index 3b29f0f..4d25b7b 100644
--- a/arch/powerpc/kvm/book3s_segment.S
+++ b/arch/powerpc/kvm/book3s_segment.S
@@ -177,9 +177,14 @@ kvmppc_interrupt_pr:
 	 * R13             = PACA
 	 * HSTATE.SCRATCH0 = guest R12
 	 * HSTATE.SCRATCH1 = guest R9
+	 * HSTATE.SCRATCH2 = guest CTR if RELOCATABLE
 	 */
 #ifdef CONFIG_PPC64
 	/* Match 32-bit entry */
+#ifdef CONFIG_RELOCATABLE
+	ld	r9,HSTATE_SCRATCH2(r13)
+	mtctr	r9
+#endif
 	ld	r9,HSTATE_SCRATCH1(r13)
 	stw	r12,HSTATE_SCRATCH1(r13) /* CR is in the low half of r12 */
 	srdi	r12, r12, 32		 /* trap is in the high half of r12 */
-- 
2.10.2


* Re: [PATCH 1/3] KVM: PPC: Book3S: Change interrupt call to reduce scratch space use on HV
  2016-12-01  7:18 ` [PATCH 1/3] KVM: PPC: Book3S: Change interrupt call to reduce scratch space use on HV Nicholas Piggin
@ 2016-12-06  6:09   ` Paul Mackerras
  2016-12-06  8:31     ` Nicholas Piggin
  0 siblings, 1 reply; 8+ messages in thread
From: Paul Mackerras @ 2016-12-06  6:09 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: Alexander Graf, kvm-ppc, Michael Ellerman, linuxppc-dev

On Thu, Dec 01, 2016 at 06:18:10PM +1100, Nicholas Piggin wrote:
> Change the calling convention to put the trap number together with
> CR in two halves of r12, which frees up HSTATE_SCRATCH2 in the HV
> handler, and r9 free.

Cute idea!  Some comments below...

> The 64-bit PR handler entry translates the calling convention back
> to match the previous call convention (i.e., shared with 32-bit), for
> simplicity.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  arch/powerpc/include/asm/exception-64s.h | 28 +++++++++++++++-------------
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S  | 15 +++++++--------
>  arch/powerpc/kvm/book3s_segment.S        | 27 ++++++++++++++++++++-------
>  3 files changed, 42 insertions(+), 28 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> index 9a3eee6..bc8fc45 100644
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -233,7 +233,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
>  
>  #endif
>  
> -#define __KVM_HANDLER_PROLOG(area, n)					\
> +#define __KVM_HANDLER(area, h, n)					\
>  	BEGIN_FTR_SECTION_NESTED(947)					\
>  	ld	r10,area+EX_CFAR(r13);					\
>  	std	r10,HSTATE_CFAR(r13);					\
> @@ -243,30 +243,32 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
>  	std	r10,HSTATE_PPR(r13);					\
>  	END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948);	\
>  	ld	r10,area+EX_R10(r13);					\
> -	stw	r9,HSTATE_SCRATCH1(r13);				\
> -	ld	r9,area+EX_R9(r13);					\
>  	std	r12,HSTATE_SCRATCH0(r13);				\
> -
> -#define __KVM_HANDLER(area, h, n)					\
> -	__KVM_HANDLER_PROLOG(area, n)					\
> -	li	r12,n;							\
> +	li	r12,(n);						\
> +	sldi	r12,r12,32;						\
> +	or	r12,r12,r9;						\

Did you consider doing it the other way around, i.e. with r12
containing (cr << 32) | trap?  That would save 1 instruction in each
handler:

+	sldi	r12,r9,32;						\
+	ori	r12,r12,(n);						\

> +	ld	r9,area+EX_R9(r13);					\
> +	std	r9,HSTATE_SCRATCH1(r13);				\

Why not put this std in kvmppc_interrupt[_hv] rather than in each
handler?

>  	b	kvmppc_interrupt
>  
>  #define __KVM_HANDLER_SKIP(area, h, n)					\
>  	cmpwi	r10,KVM_GUEST_MODE_SKIP;				\
> -	ld	r10,area+EX_R10(r13);					\
>  	beq	89f;							\
> -	stw	r9,HSTATE_SCRATCH1(r13);				\
>  	BEGIN_FTR_SECTION_NESTED(948)					\
> -	ld	r9,area+EX_PPR(r13);					\
> -	std	r9,HSTATE_PPR(r13);					\
> +	ld	r10,area+EX_PPR(r13);					\
> +	std	r10,HSTATE_PPR(r13);					\
>  	END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948);	\
> -	ld	r9,area+EX_R9(r13);					\
> +	ld	r10,area+EX_R10(r13);					\
>  	std	r12,HSTATE_SCRATCH0(r13);				\
> -	li	r12,n;							\
> +	li	r12,(n);						\
> +	sldi	r12,r12,32;						\
> +	or	r12,r12,r9;						\
> +	ld	r9,area+EX_R9(r13);					\
> +	std	r9,HSTATE_SCRATCH1(r13);				\

Same comment again, of course.

>  	b	kvmppc_interrupt;					\
>  89:	mtocrf	0x80,r9;						\
>  	ld	r9,area+EX_R9(r13);					\
> +	ld	r10,area+EX_R10(r13);					\
>  	b	kvmppc_skip_##h##interrupt
>  
>  #ifdef CONFIG_KVM_BOOK3S_64_HANDLER
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index c3c1d1b..0536c73 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -1043,19 +1043,18 @@ hdec_soon:
>  kvmppc_interrupt_hv:
>  	/*
>  	 * Register contents:
> -	 * R12		= interrupt vector
> +	 * R12		= (interrupt vector << 32) | guest CR
>  	 * R13		= PACA
> -	 * guest CR, R12 saved in shadow VCPU SCRATCH1/0
> +	 * R9		= unused
> +	 * guest R12, R9 saved in shadow VCPU SCRATCH0/1 respectively
>  	 * guest R13 saved in SPRN_SCRATCH0
>  	 */
> -	std	r9, HSTATE_SCRATCH2(r13)
> -
>  	lbz	r9, HSTATE_IN_GUEST(r13)
>  	cmpwi	r9, KVM_GUEST_MODE_HOST_HV
>  	beq	kvmppc_bad_host_intr
>  #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
>  	cmpwi	r9, KVM_GUEST_MODE_GUEST
> -	ld	r9, HSTATE_SCRATCH2(r13)
> +	ld	r9, HSTATE_SCRATCH1(r13)
>  	beq	kvmppc_interrupt_pr
>  #endif
>  	/* We're now back in the host but in guest MMU context */
> @@ -1075,14 +1074,13 @@ kvmppc_interrupt_hv:
>  	std	r6, VCPU_GPR(R6)(r9)
>  	std	r7, VCPU_GPR(R7)(r9)
>  	std	r8, VCPU_GPR(R8)(r9)
> -	ld	r0, HSTATE_SCRATCH2(r13)
> +	ld	r0, HSTATE_SCRATCH1(r13)
>  	std	r0, VCPU_GPR(R9)(r9)
>  	std	r10, VCPU_GPR(R10)(r9)
>  	std	r11, VCPU_GPR(R11)(r9)
>  	ld	r3, HSTATE_SCRATCH0(r13)
> -	lwz	r4, HSTATE_SCRATCH1(r13)
>  	std	r3, VCPU_GPR(R12)(r9)
> -	stw	r4, VCPU_CR(r9)
> +	stw	r12, VCPU_CR(r9)	/* CR is in the low half of r12 */

This would then need to be srdi r4, r12, 32; stw r4, VCPU_CR(r9)

>  BEGIN_FTR_SECTION
>  	ld	r3, HSTATE_CFAR(r13)
>  	std	r3, VCPU_CFAR(r9)
> @@ -1100,6 +1098,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
>  	mfspr	r11, SPRN_SRR1
>  	std	r10, VCPU_SRR0(r9)
>  	std	r11, VCPU_SRR1(r9)
> +	srdi	r12, r12, 32		/* trap is in the high half of r12 */

and this would become clrldi r12,r12,32, though arguably that's not
strictly necessary since we always do cmpwi/stw/lwz on r12 (but I'd
feel safer with the clrldi in place).

Cheers,
Paul.


* Re: [PATCH 1/3] KVM: PPC: Book3S: Change interrupt call to reduce scratch space use on HV
  2016-12-06  6:09   ` Paul Mackerras
@ 2016-12-06  8:31     ` Nicholas Piggin
  0 siblings, 0 replies; 8+ messages in thread
From: Nicholas Piggin @ 2016-12-06  8:31 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: Alexander Graf, kvm-ppc, Michael Ellerman, linuxppc-dev

On Tue, 6 Dec 2016 17:09:07 +1100
Paul Mackerras <paulus@ozlabs.org> wrote:

> On Thu, Dec 01, 2016 at 06:18:10PM +1100, Nicholas Piggin wrote:
> > Change the calling convention to put the trap number together with
> > CR in two halves of r12, which frees up HSTATE_SCRATCH2 in the HV
> > handler, and r9 free.  
> 
> Cute idea!  Some comments below...
> 
> > The 64-bit PR handler entry translates the calling convention back
> > to match the previous call convention (i.e., shared with 32-bit), for
> > simplicity.
> > 
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > ---
> >  arch/powerpc/include/asm/exception-64s.h | 28 +++++++++++++++-------------
> >  arch/powerpc/kvm/book3s_hv_rmhandlers.S  | 15 +++++++--------
> >  arch/powerpc/kvm/book3s_segment.S        | 27 ++++++++++++++++++++-------
> >  3 files changed, 42 insertions(+), 28 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> > index 9a3eee6..bc8fc45 100644
> > --- a/arch/powerpc/include/asm/exception-64s.h
> > +++ b/arch/powerpc/include/asm/exception-64s.h
> > @@ -233,7 +233,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
> >  
> >  #endif
> >  
> > -#define __KVM_HANDLER_PROLOG(area, n)					\
> > +#define __KVM_HANDLER(area, h, n)					\
> >  	BEGIN_FTR_SECTION_NESTED(947)					\
> >  	ld	r10,area+EX_CFAR(r13);					\
> >  	std	r10,HSTATE_CFAR(r13);					\
> > @@ -243,30 +243,32 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
> >  	std	r10,HSTATE_PPR(r13);					\
> >  	END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948);	\
> >  	ld	r10,area+EX_R10(r13);					\
> > -	stw	r9,HSTATE_SCRATCH1(r13);				\
> > -	ld	r9,area+EX_R9(r13);					\
> >  	std	r12,HSTATE_SCRATCH0(r13);				\
> > -
> > -#define __KVM_HANDLER(area, h, n)					\
> > -	__KVM_HANDLER_PROLOG(area, n)					\
> > -	li	r12,n;							\
> > +	li	r12,(n);						\
> > +	sldi	r12,r12,32;						\
> > +	or	r12,r12,r9;						\  
> 
> Did you consider doing it the other way around, i.e. with r12
> containing (cr << 32) | trap?  That would save 1 instruction in each
> handler:

When I tinkered with it I thought it came out slightly nicer this way, but
your suggested versions seem to prove me wrong. I can change it if you'd
like.

> 
> +	sldi	r12,r9,32;						\
> +	ori	r12,r12,(n);						\
> 
> > +	ld	r9,area+EX_R9(r13);					\
> > +	std	r9,HSTATE_SCRATCH1(r13);				\  
> 
> Why not put this std in kvmppc_interrupt[_hv] rather than in each
> handler?

Patch 3/3 uses r9 to load the CTR when CONFIG_RELOCATABLE is turned on, so
this results in a smaller difference between the two cases. I agree it's
not ideal when CONFIG_RELOCATABLE is off.

[snip]

Thanks,
Nick


* [PATCH 1/3] KVM: PPC: Book3S: Change interrupt call to reduce scratch space use on HV
  2016-12-21 18:29 [PATCH v2 0/3] KVM: PPC: Book3S: 64-bit CONFIG_RELOCATABLE fixes Nicholas Piggin
@ 2016-12-21 18:29 ` Nicholas Piggin
  2017-01-27  2:21   ` Paul Mackerras
  0 siblings, 1 reply; 8+ messages in thread
From: Nicholas Piggin @ 2016-12-21 18:29 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Nicholas Piggin, Alexander Graf, kvm-ppc, Michael Ellerman,
	linuxppc-dev

Change the calling convention to put the trap number together with
CR in two halves of r12, which frees up HSTATE_SCRATCH2 in the HV
handler.

For simplicity, the 64-bit PR handler entry translates back to the
previous calling convention (i.e., the one shared with 32-bit).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/exception-64s.h | 24 +++++++++++-------------
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  | 16 +++++++++-------
 arch/powerpc/kvm/book3s_segment.S        | 25 ++++++++++++++++++-------
 3 files changed, 38 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 9a3eee661297..a02a268bde6b 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -233,7 +233,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 
 #endif
 
-#define __KVM_HANDLER_PROLOG(area, n)					\
+#define __KVM_HANDLER(area, h, n)					\
 	BEGIN_FTR_SECTION_NESTED(947)					\
 	ld	r10,area+EX_CFAR(r13);					\
 	std	r10,HSTATE_CFAR(r13);					\
@@ -243,30 +243,28 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 	std	r10,HSTATE_PPR(r13);					\
 	END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948);	\
 	ld	r10,area+EX_R10(r13);					\
-	stw	r9,HSTATE_SCRATCH1(r13);				\
-	ld	r9,area+EX_R9(r13);					\
 	std	r12,HSTATE_SCRATCH0(r13);				\
-
-#define __KVM_HANDLER(area, h, n)					\
-	__KVM_HANDLER_PROLOG(area, n)					\
-	li	r12,n;							\
+	sldi	r12,r9,32;						\
+	ori	r12,r12,(n);						\
+	ld	r9,area+EX_R9(r13);					\
 	b	kvmppc_interrupt
 
 #define __KVM_HANDLER_SKIP(area, h, n)					\
 	cmpwi	r10,KVM_GUEST_MODE_SKIP;				\
-	ld	r10,area+EX_R10(r13);					\
 	beq	89f;							\
-	stw	r9,HSTATE_SCRATCH1(r13);				\
 	BEGIN_FTR_SECTION_NESTED(948)					\
-	ld	r9,area+EX_PPR(r13);					\
-	std	r9,HSTATE_PPR(r13);					\
+	ld	r10,area+EX_PPR(r13);					\
+	std	r10,HSTATE_PPR(r13);					\
 	END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948);	\
-	ld	r9,area+EX_R9(r13);					\
+	ld	r10,area+EX_R10(r13);					\
 	std	r12,HSTATE_SCRATCH0(r13);				\
-	li	r12,n;							\
+	sldi	r12,r9,32;						\
+	ori	r12,r12,(n);						\
+	ld	r9,area+EX_R9(r13);					\
 	b	kvmppc_interrupt;					\
 89:	mtocrf	0x80,r9;						\
 	ld	r9,area+EX_R9(r13);					\
+	ld	r10,area+EX_R10(r13);					\
 	b	kvmppc_skip_##h##interrupt
 
 #ifdef CONFIG_KVM_BOOK3S_64_HANDLER
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 9338a818e05c..11882aac8216 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1057,19 +1057,18 @@ hdec_soon:
 kvmppc_interrupt_hv:
 	/*
 	 * Register contents:
-	 * R12		= interrupt vector
+	 * R12		= (guest CR << 32) | interrupt vector
 	 * R13		= PACA
-	 * guest CR, R12 saved in shadow VCPU SCRATCH1/0
+	 * guest R12 saved in shadow VCPU SCRATCH0
 	 * guest R13 saved in SPRN_SCRATCH0
 	 */
-	std	r9, HSTATE_SCRATCH2(r13)
-
+	std	r9, HSTATE_SCRATCH1(r13)
 	lbz	r9, HSTATE_IN_GUEST(r13)
 	cmpwi	r9, KVM_GUEST_MODE_HOST_HV
 	beq	kvmppc_bad_host_intr
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
 	cmpwi	r9, KVM_GUEST_MODE_GUEST
-	ld	r9, HSTATE_SCRATCH2(r13)
+	ld	r9, HSTATE_SCRATCH1(r13)
 	beq	kvmppc_interrupt_pr
 #endif
 	/* We're now back in the host but in guest MMU context */
@@ -1089,13 +1088,14 @@ kvmppc_interrupt_hv:
 	std	r6, VCPU_GPR(R6)(r9)
 	std	r7, VCPU_GPR(R7)(r9)
 	std	r8, VCPU_GPR(R8)(r9)
-	ld	r0, HSTATE_SCRATCH2(r13)
+	ld	r0, HSTATE_SCRATCH1(r13)
 	std	r0, VCPU_GPR(R9)(r9)
 	std	r10, VCPU_GPR(R10)(r9)
 	std	r11, VCPU_GPR(R11)(r9)
 	ld	r3, HSTATE_SCRATCH0(r13)
-	lwz	r4, HSTATE_SCRATCH1(r13)
 	std	r3, VCPU_GPR(R12)(r9)
+	/* CR is in the high half of r12 */
+	srdi	r4, r12, 32
 	stw	r4, VCPU_CR(r9)
 BEGIN_FTR_SECTION
 	ld	r3, HSTATE_CFAR(r13)
@@ -1114,6 +1114,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	mfspr	r11, SPRN_SRR1
 	std	r10, VCPU_SRR0(r9)
 	std	r11, VCPU_SRR1(r9)
+	/* trap is in the low half of r12, clear CR from the high half */
+	clrldi	r12, r12, 32
 	andi.	r0, r12, 2		/* need to read HSRR0/1? */
 	beq	1f
 	mfspr	r10, SPRN_HSRR0
diff --git a/arch/powerpc/kvm/book3s_segment.S b/arch/powerpc/kvm/book3s_segment.S
index ca8f174289bb..68e45080cf93 100644
--- a/arch/powerpc/kvm/book3s_segment.S
+++ b/arch/powerpc/kvm/book3s_segment.S
@@ -167,20 +167,31 @@ kvmppc_handler_trampoline_enter_end:
  *                                                                            *
  *****************************************************************************/
 
-.global kvmppc_handler_trampoline_exit
-kvmppc_handler_trampoline_exit:
-
 .global kvmppc_interrupt_pr
 kvmppc_interrupt_pr:
+	/* 64-bit entry. Register usage at this point:
+	 *
+	 * SPRG_SCRATCH0   = guest R13
+	 * R12             = (guest CR << 32) | exit handler id
+	 * R13             = PACA
+	 * HSTATE.SCRATCH0 = guest R12
+	 */
+#ifdef CONFIG_PPC64
+	/* Match 32-bit entry */
+	rotldi	r12, r12, 32		  /* Flip R12 halves for stw */
+	stw	r12, HSTATE_SCRATCH1(r13) /* CR is now in the low half */
+	srdi	r12, r12, 32		  /* shift trap into low half */
+#endif
 
+.global kvmppc_handler_trampoline_exit
+kvmppc_handler_trampoline_exit:
 	/* Register usage at this point:
 	 *
-	 * SPRG_SCRATCH0  = guest R13
-	 * R12            = exit handler id
-	 * R13            = shadow vcpu (32-bit) or PACA (64-bit)
+	 * SPRG_SCRATCH0   = guest R13
+	 * R12             = exit handler id
+	 * R13             = shadow vcpu (32-bit) or PACA (64-bit)
 	 * HSTATE.SCRATCH0 = guest R12
 	 * HSTATE.SCRATCH1 = guest CR
-	 *
 	 */
 
 	/* Save registers */
-- 
2.11.0


* Re: [PATCH 1/3] KVM: PPC: Book3S: Change interrupt call to reduce scratch space use on HV
  2016-12-21 18:29 ` [PATCH 1/3] KVM: PPC: Book3S: Change interrupt call to reduce scratch space use on HV Nicholas Piggin
@ 2017-01-27  2:21   ` Paul Mackerras
  0 siblings, 0 replies; 8+ messages in thread
From: Paul Mackerras @ 2017-01-27  2:21 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: Alexander Graf, kvm-ppc, Michael Ellerman, linuxppc-dev

On Thu, Dec 22, 2016 at 04:29:25AM +1000, Nicholas Piggin wrote:
> Change the calling convention to put the trap number together with
> CR in two halves of r12, which frees up HSTATE_SCRATCH2 in the HV
> handler.
> 
> The 64-bit PR handler entry translates the calling convention back
> to match the previous call convention (i.e., shared with 32-bit), for
> simplicity.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Acked-by: Paul Mackerras <paulus@ozlabs.org>

I notice that I forgot to add the code to save CFAR to the
__KVM_HANDLER_SKIP macro.  We should fix that.

Paul.
