From: ard.biesheuvel@linaro.org (Ard Biesheuvel)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2] arm64: fpsimd: improve stacking logic in non-interruptible context
Date: Wed, 7 Dec 2016 10:14:08 +0000 [thread overview]
Message-ID: <1481105648-19775-1-git-send-email-ard.biesheuvel@linaro.org> (raw)
Currently, we allow kernel mode NEON in softirq or hardirq context by
stacking and unstacking a slice of the NEON register file for each call
to kernel_neon_begin() and kernel_neon_end(), respectively.
Given that
a) a CPU typically spends most of its time in userland, during which time
no kernel mode NEON in process context is in progress,
b) a CPU spends most of its time in the kernel doing other things than
kernel mode NEON when it gets interrupted to perform kernel mode NEON
in softirq context
the stacking and subsequent unstacking is only necessary if we are
interrupting a thread while it is performing kernel mode NEON in process
context, which means that in all other cases, we can simply preserve the
userland FPSIMD state once, and only restore it upon return to userland,
even if we are being invoked from softirq or hardirq context.
So instead of checking whether we are running in interrupt context, keep
track of the level of nested kernel mode NEON calls in progress, and only
perform the eager stack/unstack if the level exceeds 1.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
v2:
- BUG() on unexpected values of the nesting level
- relax the BUG() on num_regs>32 to a WARN, given that nothing actually
breaks in that case
arch/arm64/kernel/fpsimd.c | 48 ++++++++++++++------
1 file changed, 34 insertions(+), 14 deletions(-)
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 394c61db5566..ae2c1bf569db 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -220,20 +220,35 @@ void fpsimd_flush_task_state(struct task_struct *t)
#ifdef CONFIG_KERNEL_MODE_NEON
-static DEFINE_PER_CPU(struct fpsimd_partial_state, hardirq_fpsimdstate);
-static DEFINE_PER_CPU(struct fpsimd_partial_state, softirq_fpsimdstate);
+/*
+ * Although unlikely, it is possible for three kernel mode NEON contexts to
+ * be live at the same time: process context, softirq context and hardirq
+ * context. So while the userland context is stashed in the thread's fpsimd
+ * state structure, we need two additional levels of storage.
+ */
+static DEFINE_PER_CPU(struct fpsimd_partial_state, nested_fpsimdstate[2]);
+static DEFINE_PER_CPU(int, kernel_neon_nesting_level);
/*
* Kernel-side NEON support functions
*/
void kernel_neon_begin_partial(u32 num_regs)
{
- if (in_interrupt()) {
- struct fpsimd_partial_state *s = this_cpu_ptr(
- in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
+ struct fpsimd_partial_state *s;
+ int level;
+
+ preempt_disable();
+
+ level = this_cpu_read(kernel_neon_nesting_level);
+ BUG_ON(level > 2);
+
+ if (level > 0) {
+ s = this_cpu_ptr(nested_fpsimdstate);
- BUG_ON(num_regs > 32);
- fpsimd_save_partial_state(s, roundup(num_regs, 2));
+ WARN_ON_ONCE(num_regs > 32);
+ num_regs = min(roundup(num_regs, 2), 32U);
+
+ fpsimd_save_partial_state(&s[level - 1], num_regs);
} else {
/*
* Save the userland FPSIMD state if we have one and if we
@@ -241,24 +256,29 @@ void kernel_neon_begin_partial(u32 num_regs)
* that there is no longer userland FPSIMD state in the
* registers.
*/
- preempt_disable();
if (current->mm &&
!test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
fpsimd_save_state(¤t->thread.fpsimd_state);
this_cpu_write(fpsimd_last_state, NULL);
}
+ this_cpu_write(kernel_neon_nesting_level, level + 1);
}
EXPORT_SYMBOL(kernel_neon_begin_partial);
void kernel_neon_end(void)
{
- if (in_interrupt()) {
- struct fpsimd_partial_state *s = this_cpu_ptr(
- in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
- fpsimd_load_partial_state(s);
- } else {
- preempt_enable();
+ struct fpsimd_partial_state *s;
+ int level;
+
+ level = this_cpu_read(kernel_neon_nesting_level) - 1;
+ BUG_ON(level < 0);
+
+ if (level > 0) {
+ s = this_cpu_ptr(nested_fpsimdstate);
+ fpsimd_load_partial_state(&s[level - 1]);
}
+ this_cpu_write(kernel_neon_nesting_level, level);
+ preempt_enable();
}
EXPORT_SYMBOL(kernel_neon_end);
--
2.7.4
next reply other threads:[~2016-12-07 10:14 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-12-07 10:14 Ard Biesheuvel [this message]
2016-12-08 15:50 ` [PATCH v2] arm64: fpsimd: improve stacking logic in non-interruptible context Catalin Marinas
2016-12-08 15:53 ` Ard Biesheuvel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1481105648-19775-1-git-send-email-ard.biesheuvel@linaro.org \
--to=ard.biesheuvel@linaro.org \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).