linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH 1/3] powerpc/64s: restore_math remove TM test
@ 2020-06-23 23:41 Nicholas Piggin
  2020-06-23 23:41 ` [PATCH 2/3] powerpc/64s: Fix restore_math unnecessarily changing MSR Nicholas Piggin
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Nicholas Piggin @ 2020-06-23 23:41 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Anton Blanchard, Nicholas Piggin, Gustavo Romero

The TM test in restore_math added by commit dc16b553c949e ("powerpc:
Always restore FPU/VEC/VSX if hardware transactional memory in use") is
no longer necessary after commit a8318c13e79ba ("powerpc/tm: Fix
restoring FP/VMX facility incorrectly on interrupts"), which removed
the cases where restore_math has to restore if TM is active.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/process.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 7bb7faf84490..c6c1add91bf3 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -530,8 +530,7 @@ void notrace restore_math(struct pt_regs *regs)
 {
 	unsigned long msr;
 
-	if (!MSR_TM_ACTIVE(regs->msr) &&
-		!current->thread.load_fp && !loadvec(current->thread))
+	if (!current->thread.load_fp && !loadvec(current->thread))
 		return;
 
 	msr = regs->msr;
-- 
2.23.0



* [PATCH 2/3] powerpc/64s: Fix restore_math unnecessarily changing MSR
  2020-06-23 23:41 [PATCH 1/3] powerpc/64s: restore_math remove TM test Nicholas Piggin
@ 2020-06-23 23:41 ` Nicholas Piggin
  2020-06-23 23:41 ` [PATCH 3/3] powerpc: re-initialise lazy FPU/VEC counters on every fault Nicholas Piggin
  2020-07-16 12:56 ` [PATCH 1/3] powerpc/64s: restore_math remove TM test Michael Ellerman
  2 siblings, 0 replies; 4+ messages in thread
From: Nicholas Piggin @ 2020-06-23 23:41 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Anton Blanchard, Nicholas Piggin, Gustavo Romero

Before returning to user, if any FP/VEC/VSX bits are clear in the user
MSR, then those registers were given up and must be restored before use.
restore_math decides whether to restore them immediately, or to skip the
restore and let FP/VEC/VSX unavailable faults demand-load the registers.

Each time restore_math restores one of the FP/VSX or VEC register sets,
an 8-bit counter (load_fp or load_vec) is incremented. When a counter
wraps to zero, restore_math no longer restores that register set until
it is next demand-faulted.
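
As a minimal sketch of that counter behaviour (plain C standing in for
the thread_struct fields; the function name is illustrative, not the
kernel's):

	#include <stdbool.h>

	/* Models the 8-bit load_fp/load_vec fields in struct thread_struct */
	static unsigned char load_fp = 1;	/* set non-zero by a demand fault */

	static bool fp_counter_allows_restore(void)
	{
		if (load_fp) {
			load_fp++;	/* 8-bit: 255 + 1 wraps to 0 */
			return true;
		}
		/* once wrapped, no eager restore until the next fault */
		return false;
	}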

It is quite usual for those counters to hold different values. If one
wraps to zero, restore_math no longer restores that register set or sets
its user MSR bit; if at the same time the other counter is non-zero but
its registers do not need restoring (because the kernel is not
frequently using them), then restore_math is still called and does not
take the early exit. This causes msr_check_and_set to test and set the
MSR at every kernel exit despite having no work to do.
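
In outline, that pre-fix path (condensed from the old restore_math, as
removed in the diff below) is:

	void restore_math(struct pt_regs *regs)
	{
		if (!current->thread.load_fp && !loadvec(current->thread))
			return;		/* early exit: both counters wrapped */

		/* reached whenever either counter is non-zero... */
		msr_check_and_set(msr_all_available);	/* MSR test-and-set */
		/* ...even if restore_fp()/restore_altivec() then decline */
		msr_check_and_clear(msr_all_available);	/* MSR test-and-clear */
	}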

This can cause workloads (e.g., a NULL syscall microbenchmark) to run
fast for a time while both counters are non-zero, then slow down when
one of the counters reaches zero, then speed up again after the second
counter reaches zero. The cost is significant (about a 10% slowdown on a
NULL syscall benchmark), and the jittery behaviour is very undesirable.

Fix this by having restore_math test all conditions first, and only
update the MSR if registers will actually be loaded.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/process.c    | 100 ++++++++++++++++++-------------
 arch/powerpc/kernel/syscall_64.c |   8 +++
 2 files changed, 68 insertions(+), 40 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index c6c1add91bf3..4f72aa4ed717 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -471,49 +471,58 @@ EXPORT_SYMBOL(giveup_all);
 
 #ifdef CONFIG_PPC_BOOK3S_64
 #ifdef CONFIG_PPC_FPU
-static int restore_fp(struct task_struct *tsk)
+static bool should_restore_fp(void)
 {
-	if (tsk->thread.load_fp) {
-		load_fp_state(&current->thread.fp_state);
+	if (current->thread.load_fp) {
 		current->thread.load_fp++;
-		return 1;
+		return true;
 	}
-	return 0;
+	return false;
+}
+
+static void do_restore_fp(void)
+{
+	load_fp_state(&current->thread.fp_state);
 }
 #else
-static int restore_fp(struct task_struct *tsk) { return 0; }
+static bool should_restore_fp(void) { return false; }
+static void do_restore_fp(void) { }
 #endif /* CONFIG_PPC_FPU */
 
 #ifdef CONFIG_ALTIVEC
-#define loadvec(thr) ((thr).load_vec)
-static int restore_altivec(struct task_struct *tsk)
+static bool should_restore_altivec(void)
 {
-	if (cpu_has_feature(CPU_FTR_ALTIVEC) && (tsk->thread.load_vec)) {
-		load_vr_state(&tsk->thread.vr_state);
-		tsk->thread.used_vr = 1;
-		tsk->thread.load_vec++;
-
-		return 1;
+	if (cpu_has_feature(CPU_FTR_ALTIVEC) && (current->thread.load_vec)) {
+		current->thread.load_vec++;
+		return true;
 	}
-	return 0;
+	return false;
+}
+
+static void do_restore_altivec(void)
+{
+	load_vr_state(&current->thread.vr_state);
+	current->thread.used_vr = 1;
 }
 #else
-#define loadvec(thr) 0
-static inline int restore_altivec(struct task_struct *tsk) { return 0; }
+static bool should_restore_altivec(void) { return false; }
+static void do_restore_altivec(void) { }
 #endif /* CONFIG_ALTIVEC */
 
 #ifdef CONFIG_VSX
-static int restore_vsx(struct task_struct *tsk)
+static bool should_restore_vsx(void)
 {
-	if (cpu_has_feature(CPU_FTR_VSX)) {
-		tsk->thread.used_vsr = 1;
-		return 1;
-	}
-
-	return 0;
+	if (cpu_has_feature(CPU_FTR_VSX))
+		return true;
+	return false;
+}
+static void do_restore_vsx(void)
+{
+	current->thread.used_vsr = 1;
 }
 #else
-static inline int restore_vsx(struct task_struct *tsk) { return 0; }
+static bool should_restore_vsx(void) { return false; }
+static void do_restore_vsx(void) { }
 #endif /* CONFIG_VSX */
 
 /*
@@ -529,31 +538,42 @@ static inline int restore_vsx(struct task_struct *tsk) { return 0; }
 void notrace restore_math(struct pt_regs *regs)
 {
 	unsigned long msr;
-
-	if (!current->thread.load_fp && !loadvec(current->thread))
-		return;
+	unsigned long new_msr = 0;
 
 	msr = regs->msr;
-	msr_check_and_set(msr_all_available);
 
 	/*
-	 * Only reload if the bit is not set in the user MSR, the bit BEING set
-	 * indicates that the registers are hot
+	 * new_msr tracks the facilities that are to be restored. Only reload
+	 * if the bit is not set in the user MSR (if it is set, the registers
+	 * are live for the user thread).
 	 */
-	if ((!(msr & MSR_FP)) && restore_fp(current))
-		msr |= MSR_FP | current->thread.fpexc_mode;
+	if ((!(msr & MSR_FP)) && should_restore_fp())
+		new_msr |= MSR_FP | current->thread.fpexc_mode;
 
-	if ((!(msr & MSR_VEC)) && restore_altivec(current))
-		msr |= MSR_VEC;
+	if ((!(msr & MSR_VEC)) && should_restore_altivec())
+		new_msr |= MSR_VEC;
 
-	if ((msr & (MSR_FP | MSR_VEC)) == (MSR_FP | MSR_VEC) &&
-			restore_vsx(current)) {
-		msr |= MSR_VSX;
+	if ((!(msr & MSR_VSX)) && should_restore_vsx()) {
+		if (((msr | new_msr) & (MSR_FP | MSR_VEC)) == (MSR_FP | MSR_VEC))
+			new_msr |= MSR_VSX;
 	}
 
-	msr_check_and_clear(msr_all_available);
+	if (new_msr) {
+		msr_check_and_set(new_msr);
+
+		if (new_msr & MSR_FP)
+			do_restore_fp();
+
+		if (new_msr & MSR_VEC)
+			do_restore_altivec();
 
-	regs->msr = msr;
+		if (new_msr & MSR_VSX)
+			do_restore_vsx();
+
+		msr_check_and_clear(new_msr);
+
+		regs->msr |= new_msr;
+	}
 }
 #endif
 
diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
index 79edba3ab312..5126f1d3184a 100644
--- a/arch/powerpc/kernel/syscall_64.c
+++ b/arch/powerpc/kernel/syscall_64.c
@@ -206,6 +206,13 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
 			else if (cpu_has_feature(CPU_FTR_ALTIVEC))
 				mathflags |= MSR_VEC;
 
+			/*
+			 * If userspace MSR has all available FP bits set,
+			 * then they are live and no need to restore. If not,
+			 * it means the regs were given up and restore_math
+			 * may decide to restore them (to avoid taking an FP
+			 * fault).
+			 */
 			if ((regs->msr & mathflags) != mathflags)
 				restore_math(regs);
 		}
@@ -277,6 +284,7 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs, unsigned
 			else if (cpu_has_feature(CPU_FTR_ALTIVEC))
 				mathflags |= MSR_VEC;
 
+			/* See above restore_math comment */
 			if ((regs->msr & mathflags) != mathflags)
 				restore_math(regs);
 		}
-- 
2.23.0



* [PATCH 3/3] powerpc: re-initialise lazy FPU/VEC counters on every fault
  2020-06-23 23:41 [PATCH 1/3] powerpc/64s: restore_math remove TM test Nicholas Piggin
  2020-06-23 23:41 ` [PATCH 2/3] powerpc/64s: Fix restore_math unnecessarily changing MSR Nicholas Piggin
@ 2020-06-23 23:41 ` Nicholas Piggin
  2020-07-16 12:56 ` [PATCH 1/3] powerpc/64s: restore_math remove TM test Michael Ellerman
  2 siblings, 0 replies; 4+ messages in thread
From: Nicholas Piggin @ 2020-06-23 23:41 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Anton Blanchard, Nicholas Piggin, Gustavo Romero

When an FP/VEC/VSX unavailable fault loads registers and enables the
facility in the MSR, reset the lazy restore counters to 1 rather than
incrementing them, so every fault gets the same number of restores
before the next fault.

This probably should not be a practical change: if a lazy counter was
non-zero, the register set should already have been restored and would
not fault when userspace accesses it. However, the code and comment
imply otherwise, which is misleading and unnecessary.
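
In C terms, the effect of the assembly change below is roughly (a sketch
only; load_fp shown, load_vec is analogous):

	/* before: bump the 8-bit counter on every unavailable fault */
	current->thread.load_fp++;	/* lbz; addi; stb */

	/* after: every fault restarts the lazy-restore window at 1 */
	current->thread.load_fp = 1;	/* li 1; stb */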

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/fpu.S    | 4 +---
 arch/powerpc/kernel/vector.S | 4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/fpu.S b/arch/powerpc/kernel/fpu.S
index cac22cb97a8c..4ae39db70044 100644
--- a/arch/powerpc/kernel/fpu.S
+++ b/arch/powerpc/kernel/fpu.S
@@ -107,9 +107,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_VSX)
 	or	r12,r12,r4
 	std	r12,_MSR(r1)
 #endif
-	/* Don't care if r4 overflows, this is desired behaviour */
-	lbz	r4,THREAD_LOAD_FP(r5)
-	addi	r4,r4,1
+	li	r4,1
 	stb	r4,THREAD_LOAD_FP(r5)
 	addi	r10,r5,THREAD_FPSTATE
 	lfd	fr0,FPSTATE_FPSCR(r10)
diff --git a/arch/powerpc/kernel/vector.S b/arch/powerpc/kernel/vector.S
index efc5b52f95d2..801dc28fdcca 100644
--- a/arch/powerpc/kernel/vector.S
+++ b/arch/powerpc/kernel/vector.S
@@ -76,9 +76,7 @@ _GLOBAL(load_up_altivec)
 	oris	r12,r12,MSR_VEC@h
 	std	r12,_MSR(r1)
 #endif
-	/* Don't care if r4 overflows, this is desired behaviour */
-	lbz	r4,THREAD_LOAD_VEC(r5)
-	addi	r4,r4,1
+	li	r4,1
 	stb	r4,THREAD_LOAD_VEC(r5)
 	addi	r6,r5,THREAD_VRSTATE
 	li	r4,1
-- 
2.23.0



* Re: [PATCH 1/3] powerpc/64s: restore_math remove TM test
  2020-06-23 23:41 [PATCH 1/3] powerpc/64s: restore_math remove TM test Nicholas Piggin
  2020-06-23 23:41 ` [PATCH 2/3] powerpc/64s: Fix restore_math unnecessarily changing MSR Nicholas Piggin
  2020-06-23 23:41 ` [PATCH 3/3] powerpc: re-initialise lazy FPU/VEC counters on every fault Nicholas Piggin
@ 2020-07-16 12:56 ` Michael Ellerman
  2 siblings, 0 replies; 4+ messages in thread
From: Michael Ellerman @ 2020-07-16 12:56 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Anton Blanchard, Gustavo Romero

On Wed, 24 Jun 2020 09:41:37 +1000, Nicholas Piggin wrote:
> The TM test in restore_math added by commit dc16b553c949e ("powerpc:
> Always restore FPU/VEC/VSX if hardware transactional memory in use") is
> no longer necessary after commit a8318c13e79ba ("powerpc/tm: Fix
> restoring FP/VMX facility incorrectly on interrupts"), which removed
> the cases where restore_math has to restore if TM is active.

Applied to powerpc/next.

[1/3] powerpc/64s: restore_math remove TM test
      https://git.kernel.org/powerpc/c/891b4fe8fe3d09f20948b391f24c9fc5b7580a2b
[2/3] powerpc/64s: Fix restore_math unnecessarily changing MSR
      https://git.kernel.org/powerpc/c/01eb01877f3386d4bd5de75909abdd0af45a5fa2
[3/3] powerpc: re-initialise lazy FPU/VEC counters on every fault
      https://git.kernel.org/powerpc/c/b2b46304e9360f3dda49c9d8ba4a1478b9eecf1d

cheers


