* [PATCH resend 1/2] arm64: defer reloading a task's FPSIMD state to userland resume
@ 2014-01-31 10:13 Ard Biesheuvel
2014-01-31 10:13 ` [PATCH resend 2/2] arm64: add support for kernel mode NEON in interrupt context Ard Biesheuvel
2014-02-03 16:36 ` [PATCH resend 1/2] arm64: defer reloading a task's FPSIMD state to userland resume Will Deacon
0 siblings, 2 replies; 6+ messages in thread
From: Ard Biesheuvel @ 2014-01-31 10:13 UTC (permalink / raw)
To: linux-arm-kernel
If a task gets scheduled out and back in again and nothing has touched
its FPSIMD state in the mean time, there is really no reason to reload
it from memory. Similarly, repeated calls to kernel_neon_begin() and
kernel_neon_end() will preserve and restore the FPSIMD state every time.
This patch defers the FPSIMD state restore to the last possible moment,
i.e., right before the task re-enters userland. If a task does not enter
userland at all (for any reason), the existing FPSIMD state is preserved
and may be reused by the owning task if it gets scheduled in again on the
same CPU.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/include/asm/fpsimd.h | 3 ++
arch/arm64/include/asm/thread_info.h | 4 +-
arch/arm64/kernel/entry.S | 2 +-
arch/arm64/kernel/fpsimd.c | 79 +++++++++++++++++++++++++++++++-----
arch/arm64/kernel/process.c | 3 +-
arch/arm64/kernel/signal.c | 3 ++
6 files changed, 81 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index c43b4ac13008..609bc44ceb8d 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -37,6 +37,8 @@ struct fpsimd_state {
u32 fpcr;
};
};
+ /* the id of the last cpu to have restored this state */
+ unsigned int last_cpu;
};
#if defined(__KERNEL__) && defined(CONFIG_COMPAT)
@@ -57,6 +59,7 @@ extern void fpsimd_load_state(struct fpsimd_state *state);
extern void fpsimd_thread_switch(struct task_struct *next);
extern void fpsimd_flush_thread(void);
+extern void fpsimd_reload_fpstate(void);
#endif
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 720e70b66ffd..4a1ca1cfb2f8 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -100,6 +100,7 @@ static inline struct thread_info *current_thread_info(void)
#define TIF_SIGPENDING 0
#define TIF_NEED_RESCHED 1
#define TIF_NOTIFY_RESUME 2 /* callback before returning to user */
+#define TIF_FOREIGN_FPSTATE 3 /* CPU's FP state is not current's */
#define TIF_SYSCALL_TRACE 8
#define TIF_POLLING_NRFLAG 16
#define TIF_MEMDIE 18 /* is terminating due to OOM killer */
@@ -112,10 +113,11 @@ static inline struct thread_info *current_thread_info(void)
#define _TIF_SIGPENDING (1 << TIF_SIGPENDING)
#define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
#define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
+#define _TIF_FOREIGN_FPSTATE (1 << TIF_FOREIGN_FPSTATE)
#define _TIF_32BIT (1 << TIF_32BIT)
#define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
- _TIF_NOTIFY_RESUME)
+ _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE)
#endif /* __KERNEL__ */
#endif /* __ASM_THREAD_INFO_H */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 39ac630d83de..80464e2fb1a5 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -576,7 +576,7 @@ fast_work_pending:
str x0, [sp, #S_X0] // returned x0
work_pending:
tbnz x1, #TIF_NEED_RESCHED, work_resched
- /* TIF_SIGPENDING or TIF_NOTIFY_RESUME case */
+ /* TIF_SIGPENDING, TIF_NOTIFY_RESUME or TIF_FOREIGN_FPSTATE case */
ldr x2, [sp, #S_PSTATE]
mov x0, sp // 'regs'
tst x2, #PSR_MODE_MASK // user mode regs?
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 4aef42a04bdc..226a495e019c 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -35,6 +35,23 @@
#define FPEXC_IDF (1 << 7)
/*
+ * In order to reduce the number of times the fpsimd state is needlessly saved
+ * and restored, keep track here of which task's userland owns the current state
+ * of the FPSIMD register file.
+ *
+ * This percpu variable points to the fpsimd_state.last_cpu field of the task
+ * whose FPSIMD state was most recently loaded onto this cpu. The last_cpu field
+ * itself contains the id of the cpu onto which the task's FPSIMD state was
+ * loaded most recently. So, to decide whether we can skip reloading the FPSIMD
+ * state, we need to check
+ * (a) whether this task was the last one to have its FPSIMD state loaded onto
+ * this cpu
+ * (b) whether this task may have manipulated its FPSIMD state on another cpu in
+ * the meantime
+ */
+static DEFINE_PER_CPU(unsigned int *, fpsimd_last_task);
+
+/*
* Trapped FP/ASIMD access.
*/
void do_fpsimd_acc(unsigned int esr, struct pt_regs *regs)
@@ -72,18 +89,56 @@ void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs)
void fpsimd_thread_switch(struct task_struct *next)
{
- /* check if not kernel threads */
- if (current->mm)
+ /*
+ * The thread flag TIF_FOREIGN_FPSTATE conveys that the userland FPSIMD
+ * state belonging to the current task is not present in the registers
+ * but has (already) been saved to memory in order for the kernel to be
+ * able to go off and use the registers for something else. Therefore,
+ * we must not (re)save the register contents if this flag is set.
+ */
+ if (current->mm && !test_thread_flag(TIF_FOREIGN_FPSTATE))
fpsimd_save_state(¤t->thread.fpsimd_state);
- if (next->mm)
- fpsimd_load_state(&next->thread.fpsimd_state);
+
+ if (next->mm) {
+ /*
+ * If we are switching to a task whose most recent userland NEON
+ * contents are already in the registers of *this* cpu, we can
+ * skip loading the state from memory. Otherwise, set the
+ * TIF_FOREIGN_FPSTATE flag so the state will be loaded upon the
+ * next entry of userland.
+ */
+ struct fpsimd_state *st = &next->thread.fpsimd_state;
+
+ if (__get_cpu_var(fpsimd_last_task) == &st->last_cpu
+ && st->last_cpu == smp_processor_id())
+ clear_ti_thread_flag(task_thread_info(next),
+ TIF_FOREIGN_FPSTATE);
+ else
+ set_ti_thread_flag(task_thread_info(next),
+ TIF_FOREIGN_FPSTATE);
+ }
}
void fpsimd_flush_thread(void)
{
- preempt_disable();
memset(¤t->thread.fpsimd_state, 0, sizeof(struct fpsimd_state));
- fpsimd_load_state(¤t->thread.fpsimd_state);
+ set_thread_flag(TIF_FOREIGN_FPSTATE);
+}
+
+void fpsimd_reload_fpstate(void)
+{
+ preempt_disable();
+ if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
+ /*
+ * We are entering userland and the userland context is not yet
+ * present in the registers.
+ */
+ struct fpsimd_state *st = ¤t->thread.fpsimd_state;
+
+ fpsimd_load_state(st);
+ __get_cpu_var(fpsimd_last_task) = &st->last_cpu;
+ st->last_cpu = smp_processor_id();
+ }
preempt_enable();
}
@@ -98,16 +153,20 @@ void kernel_neon_begin(void)
BUG_ON(in_interrupt());
preempt_disable();
- if (current->mm)
+ /*
+ * Save the userland FPSIMD state if we have one and if we haven't done
+ * so already. Clear fpsimd_last_task to indicate that there is no
+ * longer userland context in the registers.
+ */
+ if (current->mm && !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
fpsimd_save_state(¤t->thread.fpsimd_state);
+ __get_cpu_var(fpsimd_last_task) = NULL;
+
}
EXPORT_SYMBOL(kernel_neon_begin);
void kernel_neon_end(void)
{
- if (current->mm)
- fpsimd_load_state(¤t->thread.fpsimd_state);
-
preempt_enable();
}
EXPORT_SYMBOL(kernel_neon_end);
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 248a15db37f2..274316df860f 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -205,7 +205,8 @@ void release_thread(struct task_struct *dead_task)
int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
{
- fpsimd_save_state(¤t->thread.fpsimd_state);
+ if (!test_thread_flag(TIF_FOREIGN_FPSTATE))
+ fpsimd_save_state(¤t->thread.fpsimd_state);
*dst = *src;
return 0;
}
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 890a591f75dd..0a9eccf4fc0f 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -416,4 +416,7 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
clear_thread_flag(TIF_NOTIFY_RESUME);
tracehook_notify_resume(regs);
}
+
+ if (thread_flags & _TIF_FOREIGN_FPSTATE)
+ fpsimd_reload_fpstate();
}
--
1.8.3.2
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH resend 2/2] arm64: add support for kernel mode NEON in interrupt context
2014-01-31 10:13 [PATCH resend 1/2] arm64: defer reloading a task's FPSIMD state to userland resume Ard Biesheuvel
@ 2014-01-31 10:13 ` Ard Biesheuvel
2014-02-03 16:36 ` [PATCH resend 1/2] arm64: defer reloading a task's FPSIMD state to userland resume Will Deacon
1 sibling, 0 replies; 6+ messages in thread
From: Ard Biesheuvel @ 2014-01-31 10:13 UTC (permalink / raw)
To: linux-arm-kernel
This patch modifies kernel_neon_begin() and kernel_neon_end(), so
they may be called from any context. To address the case where only
a couple of registers are needed, kernel_neon_begin_partial(u32) is
introduced which takes as a parameter the number of bottom 'n' NEON
q-registers required. To mark the end of such a partial section, the
regular kernel_neon_end() should be used.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/include/asm/fpsimd.h | 17 ++++++++++++++
arch/arm64/include/asm/fpsimdmacros.h | 37 +++++++++++++++++++++++++++++
arch/arm64/include/asm/neon.h | 6 ++++-
arch/arm64/kernel/entry-fpsimd.S | 24 +++++++++++++++++++
arch/arm64/kernel/fpsimd.c | 44 +++++++++++++++++++++++------------
5 files changed, 112 insertions(+), 16 deletions(-)
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 609bc44ceb8d..dc9ef741c648 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -41,6 +41,19 @@ struct fpsimd_state {
unsigned int last_cpu;
};
+/*
+ * Struct for stacking the bottom 'n' FP/SIMD registers.
+ * Mainly intended for kernel use of v8 Crypto Extensions which only
+ * needs a few registers and may need to execute in atomic context.
+ */
+struct fpsimd_partial_state {
+ u32 num_regs;
+ u32 fpsr;
+ u32 fpcr;
+ __uint128_t vregs[32] __aligned(16);
+} __aligned(16);
+
+
#if defined(__KERNEL__) && defined(CONFIG_COMPAT)
/* Masks for extracting the FPSR and FPCR from the FPSCR */
#define VFP_FPSCR_STAT_MASK 0xf800009f
@@ -57,6 +70,10 @@ struct task_struct;
extern void fpsimd_save_state(struct fpsimd_state *state);
extern void fpsimd_load_state(struct fpsimd_state *state);
+extern void fpsimd_save_partial_state(struct fpsimd_partial_state *state,
+ u32 num_regs);
+extern void fpsimd_load_partial_state(struct fpsimd_partial_state *state);
+
extern void fpsimd_thread_switch(struct task_struct *next);
extern void fpsimd_flush_thread(void);
extern void fpsimd_reload_fpstate(void);
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index bbec599c96bd..42990a82c671 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -62,3 +62,40 @@
ldr w\tmpnr, [\state, #16 * 2 + 4]
msr fpcr, x\tmpnr
.endm
+
+.altmacro
+.macro q2op, op, q1, q2, state
+ \op q\q1, q\q2, [\state, # -16 * \q1 - 16]
+.endm
+
+.macro fpsimd_save_partial state, numnr, tmpnr1, tmpnr2
+ mrs x\tmpnr1, fpsr
+ str w\numnr, [\state]
+ mrs x\tmpnr2, fpcr
+ stp w\tmpnr1, w\tmpnr2, [\state, #4]
+ adr x\tmpnr1, 0f
+ add \state, \state, x\numnr, lsl #4
+ sub x\tmpnr1, x\tmpnr1, x\numnr, lsl #1
+ br x\tmpnr1
+ .irp qa, 30, 28, 26, 24, 22, 20, 18, 16, 14, 12, 10, 8, 6, 4, 2, 0
+ qb = \qa + 1
+ q2op stp, \qa, %qb, \state
+ .endr
+0:
+.endm
+
+.macro fpsimd_restore_partial state, tmpnr1, tmpnr2
+ ldp w\tmpnr1, w\tmpnr2, [\state, #4]
+ msr fpsr, x\tmpnr1
+ msr fpcr, x\tmpnr2
+ adr x\tmpnr1, 0f
+ ldr w\tmpnr2, [\state]
+ add \state, \state, x\tmpnr2, lsl #4
+ sub x\tmpnr1, x\tmpnr1, x\tmpnr2, lsl #1
+ br x\tmpnr1
+ .irp qa, 30, 28, 26, 24, 22, 20, 18, 16, 14, 12, 10, 8, 6, 4, 2, 0
+ qb = \qa + 1
+ q2op ldp, \qa, %qb, \state
+ .endr
+0:
+.endm
diff --git a/arch/arm64/include/asm/neon.h b/arch/arm64/include/asm/neon.h
index b0cc58a97780..13ce4cc18e26 100644
--- a/arch/arm64/include/asm/neon.h
+++ b/arch/arm64/include/asm/neon.h
@@ -8,7 +8,11 @@
* published by the Free Software Foundation.
*/
+#include <linux/types.h>
+
#define cpu_has_neon() (1)
-void kernel_neon_begin(void);
+#define kernel_neon_begin() kernel_neon_begin_partial(32)
+
+void kernel_neon_begin_partial(u32 num_regs);
void kernel_neon_end(void);
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 6a27cd6dbfa6..d358ccacfc00 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -41,3 +41,27 @@ ENTRY(fpsimd_load_state)
fpsimd_restore x0, 8
ret
ENDPROC(fpsimd_load_state)
+
+#ifdef CONFIG_KERNEL_MODE_NEON
+
+/*
+ * Save the bottom n FP registers.
+ *
+ * x0 - pointer to struct fpsimd_partial_state
+ */
+ENTRY(fpsimd_save_partial_state)
+ fpsimd_save_partial x0, 1, 8, 9
+ ret
+ENDPROC(fpsimd_load_partial_state)
+
+/*
+ * Load the bottom n FP registers.
+ *
+ * x0 - pointer to struct fpsimd_partial_state
+ */
+ENTRY(fpsimd_load_partial_state)
+ fpsimd_restore_partial x0, 8, 9
+ ret
+ENDPROC(fpsimd_load_partial_state)
+
+#endif
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 226a495e019c..970c2fa86530 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -144,30 +144,44 @@ void fpsimd_reload_fpstate(void)
#ifdef CONFIG_KERNEL_MODE_NEON
+static DEFINE_PER_CPU(struct fpsimd_partial_state, hardirq_fpsimdstate);
+static DEFINE_PER_CPU(struct fpsimd_partial_state, softirq_fpsimdstate);
+
/*
* Kernel-side NEON support functions
*/
-void kernel_neon_begin(void)
+void kernel_neon_begin_partial(u32 num_regs)
{
- /* Avoid using the NEON in interrupt context */
- BUG_ON(in_interrupt());
- preempt_disable();
-
- /*
- * Save the userland FPSIMD state if we have one and if we haven't done
- * so already. Clear fpsimd_last_task to indicate that there is no
- * longer userland context in the registers.
- */
- if (current->mm && !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
- fpsimd_save_state(¤t->thread.fpsimd_state);
- __get_cpu_var(fpsimd_last_task) = NULL;
+ if (in_interrupt()) {
+ struct fpsimd_partial_state *s = this_cpu_ptr(
+ in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
+ BUG_ON(num_regs > 32);
+ fpsimd_save_partial_state(s, roundup(num_regs, 2));
+ } else {
+ /*
+ * Save the userland FPSIMD state if we have one and if we
+ * haven't done so already. Clear fpsimd_last_task to indicate
+ * that there is no longer userland context in the registers.
+ */
+ preempt_disable();
+ if (current->mm &&
+ !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
+ fpsimd_save_state(¤t->thread.fpsimd_state);
+ __get_cpu_var(fpsimd_last_task) = NULL;
+ }
}
-EXPORT_SYMBOL(kernel_neon_begin);
+EXPORT_SYMBOL(kernel_neon_begin_partial);
void kernel_neon_end(void)
{
- preempt_enable();
+ if (in_interrupt()) {
+ struct fpsimd_partial_state *s = this_cpu_ptr(
+ in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
+ fpsimd_load_partial_state(s);
+ } else {
+ preempt_enable();
+ }
}
EXPORT_SYMBOL(kernel_neon_end);
--
1.8.3.2
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH resend 1/2] arm64: defer reloading a task's FPSIMD state to userland resume
2014-01-31 10:13 [PATCH resend 1/2] arm64: defer reloading a task's FPSIMD state to userland resume Ard Biesheuvel
2014-01-31 10:13 ` [PATCH resend 2/2] arm64: add support for kernel mode NEON in interrupt context Ard Biesheuvel
@ 2014-02-03 16:36 ` Will Deacon
2014-02-04 14:49 ` Ard Biesheuvel
1 sibling, 1 reply; 6+ messages in thread
From: Will Deacon @ 2014-02-03 16:36 UTC (permalink / raw)
To: linux-arm-kernel
Hi Ard,
On Fri, Jan 31, 2014 at 10:13:15AM +0000, Ard Biesheuvel wrote:
> If a task gets scheduled out and back in again and nothing has touched
> its FPSIMD state in the mean time, there is really no reason to reload
> it from memory. Similarly, repeated calls to kernel_neon_begin() and
> kernel_neon_end() will preserve and restore the FPSIMD state every time.
>
> This patch defers the FPSIMD state restore to the last possible moment,
> i.e., right before the task re-enters userland. If a task does not enter
> userland at all (for any reason), the existing FPSIMD state is preserved
> and may be reused by the owning task if it gets scheduled in again on the
> same CPU.
The one situation I'm unsure of here is how you deal with the saved fpsimd
state potentially being updated by a signal handler or a debugger. In this
case, we probably need to set _TIF_FOREIGN_FPSTATE to force a reload, or are
you handling this some other way?
Will
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH resend 1/2] arm64: defer reloading a task's FPSIMD state to userland resume
2014-02-03 16:36 ` [PATCH resend 1/2] arm64: defer reloading a task's FPSIMD state to userland resume Will Deacon
@ 2014-02-04 14:49 ` Ard Biesheuvel
2014-02-04 17:27 ` Will Deacon
0 siblings, 1 reply; 6+ messages in thread
From: Ard Biesheuvel @ 2014-02-04 14:49 UTC (permalink / raw)
To: linux-arm-kernel
On 3 February 2014 17:36, Will Deacon <will.deacon@arm.com> wrote:
> Hi Ard,
>
> On Fri, Jan 31, 2014 at 10:13:15AM +0000, Ard Biesheuvel wrote:
>> If a task gets scheduled out and back in again and nothing has touched
>> its FPSIMD state in the mean time, there is really no reason to reload
>> it from memory. Similarly, repeated calls to kernel_neon_begin() and
>> kernel_neon_end() will preserve and restore the FPSIMD state every time.
>>
>> This patch defers the FPSIMD state restore to the last possible moment,
>> i.e., right before the task re-enters userland. If a task does not enter
>> userland at all (for any reason), the existing FPSIMD state is preserved
>> and may be reused by the owning task if it gets scheduled in again on the
>> same CPU.
>
> The one situation I'm unsure of here is how you deal with the saved fpsimd
> state potentially being updated by a signal handler or a debugger. In this
> case, we probably need to set _TIF_FOREIGN_FPSTATE to force a reload, or are
> you handling this some other way?
>
If I am reading the code correctly, the signal handler is entered
using the normal userland resume path, so I don't think it requires
special treatment.
For the ptrace() case, it should suffice to set the 'last_cpu' field
to (u32)-1 to indicate that the FPSIMD context should be reloaded from
memory regardless of which CPU the debuggee is restarted on.
Regards,
Ard.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH resend 1/2] arm64: defer reloading a task's FPSIMD state to userland resume
2014-02-04 14:49 ` Ard Biesheuvel
@ 2014-02-04 17:27 ` Will Deacon
2014-02-04 18:48 ` Ard Biesheuvel
0 siblings, 1 reply; 6+ messages in thread
From: Will Deacon @ 2014-02-04 17:27 UTC (permalink / raw)
To: linux-arm-kernel
Hello,
On Tue, Feb 04, 2014 at 02:49:14PM +0000, Ard Biesheuvel wrote:
> On 3 February 2014 17:36, Will Deacon <will.deacon@arm.com> wrote:
> > On Fri, Jan 31, 2014 at 10:13:15AM +0000, Ard Biesheuvel wrote:
> >> If a task gets scheduled out and back in again and nothing has touched
> >> its FPSIMD state in the mean time, there is really no reason to reload
> >> it from memory. Similarly, repeated calls to kernel_neon_begin() and
> >> kernel_neon_end() will preserve and restore the FPSIMD state every time.
> >>
> >> This patch defers the FPSIMD state restore to the last possible moment,
> >> i.e., right before the task re-enters userland. If a task does not enter
> >> userland at all (for any reason), the existing FPSIMD state is preserved
> >> and may be reused by the owning task if it gets scheduled in again on the
> >> same CPU.
> >
> > The one situation I'm unsure of here is how you deal with the saved fpsimd
> > state potentially being updated by a signal handler or a debugger. In this
> > case, we probably need to set _TIF_FOREIGN_FPSTATE to force a reload, or are
> > you handling this some other way?
> >
>
> If I am reading the code correctly, the signal handler is entered
> using the normal userland resume path, so I don't think it requires
> special treatment.
It was the exiting of the signal handler that I was worried about, where it
may have modified the interrupted programs fpsimd state on the stack.
> For the ptrace() case, it should suffice to set the 'last_cpu' field
> to (u32)-1 to indicate that the FPSIMD context should be reloaded from
> memory regardless of which CPU the debuggee is restarted on.
Something like that sounds right, but it needs adding/testing.
Will
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH resend 1/2] arm64: defer reloading a task's FPSIMD state to userland resume
2014-02-04 17:27 ` Will Deacon
@ 2014-02-04 18:48 ` Ard Biesheuvel
0 siblings, 0 replies; 6+ messages in thread
From: Ard Biesheuvel @ 2014-02-04 18:48 UTC (permalink / raw)
To: linux-arm-kernel
On 4 February 2014 18:27, Will Deacon <will.deacon@arm.com> wrote:
> Hello,
>
> On Tue, Feb 04, 2014 at 02:49:14PM +0000, Ard Biesheuvel wrote:
>> On 3 February 2014 17:36, Will Deacon <will.deacon@arm.com> wrote:
>> > On Fri, Jan 31, 2014 at 10:13:15AM +0000, Ard Biesheuvel wrote:
>> >> If a task gets scheduled out and back in again and nothing has touched
>> >> its FPSIMD state in the mean time, there is really no reason to reload
>> >> it from memory. Similarly, repeated calls to kernel_neon_begin() and
>> >> kernel_neon_end() will preserve and restore the FPSIMD state every time.
>> >>
>> >> This patch defers the FPSIMD state restore to the last possible moment,
>> >> i.e., right before the task re-enters userland. If a task does not enter
>> >> userland at all (for any reason), the existing FPSIMD state is preserved
>> >> and may be reused by the owning task if it gets scheduled in again on the
>> >> same CPU.
>> >
>> > The one situation I'm unsure of here is how you deal with the saved fpsimd
>> > state potentially being updated by a signal handler or a debugger. In this
>> > case, we probably need to set _TIF_FOREIGN_FPSTATE to force a reload, or are
>> > you handling this some other way?
>> >
>>
>> If I am reading the code correctly, the signal handler is entered
>> using the normal userland resume path, so I don't think it requires
>> special treatment.
>
> It was the exiting of the signal handler that I was worried about, where it
> may have modified the interrupted programs fpsimd state on the stack.
>
Ah, ok, I see what you mean.
I will update the patch so
(a) it only saves the state if _TIF_FOREIGN_FPSTATE is cleared (so we
don't overwrite the task's saved state inadvertently), and
(b) it sets _TIF_FOREIGN_FPSTATE instead of performing the restore
upon return from the signal handler.
>> For the ptrace() case, it should suffice to set the 'last_cpu' field
>> to (u32)-1 to indicate that the FPSIMD context should be reloaded from
>> memory regardless of which CPU the debuggee is restarted on.
>
> Something like that sounds right, but it needs adding/testing.
>
OK, I will add the above and do some more testing.
Cheers,
Ard.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2014-02-04 18:48 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-01-31 10:13 [PATCH resend 1/2] arm64: defer reloading a task's FPSIMD state to userland resume Ard Biesheuvel
2014-01-31 10:13 ` [PATCH resend 2/2] arm64: add support for kernel mode NEON in interrupt context Ard Biesheuvel
2014-02-03 16:36 ` [PATCH resend 1/2] arm64: defer reloading a task's FPSIMD state to userland resume Will Deacon
2014-02-04 14:49 ` Ard Biesheuvel
2014-02-04 17:27 ` Will Deacon
2014-02-04 18:48 ` Ard Biesheuvel
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).