From: catalin.marinas@arm.com (Catalin Marinas)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2 2/4] arm64: defer reloading a task's FPSIMD state to userland resume
Date: Fri, 21 Feb 2014 17:48:55 +0000 [thread overview]
Message-ID: <20140221174855.GD25351@arm.com> (raw)
In-Reply-To: <1391620418-3999-3-git-send-email-ard.biesheuvel@linaro.org>
On Wed, Feb 05, 2014 at 05:13:36PM +0000, Ard Biesheuvel wrote:
> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
> index 7807974b49ee..f7e70f3f1eb7 100644
> --- a/arch/arm64/include/asm/fpsimd.h
> +++ b/arch/arm64/include/asm/fpsimd.h
> @@ -37,6 +37,8 @@ struct fpsimd_state {
> u32 fpcr;
> };
> };
> + /* the id of the last cpu to have restored this state */
> + unsigned int last_cpu;
Just "cpu" is enough.
> };
>
> #if defined(__KERNEL__) && defined(CONFIG_COMPAT)
> @@ -61,6 +63,8 @@ void fpsimd_set_task_state(struct fpsimd_state *state);
> struct user_fpsimd_state *fpsimd_get_user_state(struct task_struct *t);
> void fpsimd_set_user_state(struct task_struct *t, struct user_fpsimd_state *st);
>
> +void fpsimd_reload_fpstate(void);
> +
> #endif
>
> #endif
> diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
> index 720e70b66ffd..4a1ca1cfb2f8 100644
> --- a/arch/arm64/include/asm/thread_info.h
> +++ b/arch/arm64/include/asm/thread_info.h
> @@ -100,6 +100,7 @@ static inline struct thread_info *current_thread_info(void)
> #define TIF_SIGPENDING 0
> #define TIF_NEED_RESCHED 1
> #define TIF_NOTIFY_RESUME 2 /* callback before returning to user */
> +#define TIF_FOREIGN_FPSTATE 3 /* CPU's FP state is not current's */
> #define TIF_SYSCALL_TRACE 8
> #define TIF_POLLING_NRFLAG 16
> #define TIF_MEMDIE 18 /* is terminating due to OOM killer */
> @@ -112,10 +113,11 @@ static inline struct thread_info *current_thread_info(void)
> #define _TIF_SIGPENDING (1 << TIF_SIGPENDING)
> #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
> #define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
> +#define _TIF_FOREIGN_FPSTATE (1 << TIF_FOREIGN_FPSTATE)
> #define _TIF_32BIT (1 << TIF_32BIT)
>
> #define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
> - _TIF_NOTIFY_RESUME)
> + _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE)
>
> #endif /* __KERNEL__ */
> #endif /* __ASM_THREAD_INFO_H */
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 39ac630d83de..80464e2fb1a5 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -576,7 +576,7 @@ fast_work_pending:
> str x0, [sp, #S_X0] // returned x0
> work_pending:
> tbnz x1, #TIF_NEED_RESCHED, work_resched
> - /* TIF_SIGPENDING or TIF_NOTIFY_RESUME case */
> + /* TIF_SIGPENDING, TIF_NOTIFY_RESUME or TIF_FOREIGN_FPSTATE case */
> ldr x2, [sp, #S_PSTATE]
> mov x0, sp // 'regs'
> tst x2, #PSR_MODE_MASK // user mode regs?
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index eeb003f54ad0..239c8162473f 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -39,6 +39,23 @@ void fpsimd_save_state(struct fpsimd_state *state);
> void fpsimd_load_state(struct fpsimd_state *state);
>
> /*
> + * In order to reduce the number of times the fpsimd state is needlessly saved
> + * and restored, keep track here of which task's userland owns the current state
> + * of the FPSIMD register file.
> + *
> + * This percpu variable points to the fpsimd_state.last_cpu field of the task
> + * whose FPSIMD state was most recently loaded onto this cpu. The last_cpu field
> + * itself contains the id of the cpu onto which the task's FPSIMD state was
> + * loaded most recently. So, to decide whether we can skip reloading the FPSIMD
> + * state, we need to check
> + * (a) whether this task was the last one to have its FPSIMD state loaded onto
> + * this cpu
> + * (b) whether this task may have manipulated its FPSIMD state on another cpu in
> + * the meantime
> + */
> +static DEFINE_PER_CPU(unsigned int *, fpsimd_last_task);
This can simply be struct fpsimd_state * to avoid &st->last_cpu. Also
rename to fpsimd_state_last.
> +
> +/*
> * Trapped FP/ASIMD access.
> */
> void do_fpsimd_acc(unsigned int esr, struct pt_regs *regs)
> @@ -76,36 +93,84 @@ void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs)
>
> void fpsimd_thread_switch(struct task_struct *next)
> {
> - /* check if not kernel threads */
> - if (current->mm)
> + /*
> + * The thread flag TIF_FOREIGN_FPSTATE conveys that the userland FPSIMD
> + * state belonging to the current task is not present in the registers
> + * but has (already) been saved to memory in order for the kernel to be
> + * able to go off and use the registers for something else. Therefore,
> + * we must not (re)save the register contents if this flag is set.
> + */
> + if (current->mm && !test_thread_flag(TIF_FOREIGN_FPSTATE))
> fpsimd_save_state(¤t->thread.fpsimd_state);
> - if (next->mm)
> - fpsimd_load_state(&next->thread.fpsimd_state);
> +
> + if (next->mm) {
> + /*
> + * If we are switching to a task whose most recent userland
> + * FPSIMD contents are already in the registers of *this* cpu,
> + * we can skip loading the state from memory. Otherwise, set
> + * the TIF_FOREIGN_FPSTATE flag so the state will be loaded
> + * upon the next entry of userland.
I would say "return" instead of "entry" (we enter the kernel and return
to user space ;)).
> + */
> + struct fpsimd_state *st = &next->thread.fpsimd_state;
> +
> + if (__get_cpu_var(fpsimd_last_task) == &st->last_cpu
> + && st->last_cpu == smp_processor_id())
> + clear_ti_thread_flag(task_thread_info(next),
> + TIF_FOREIGN_FPSTATE);
> + else
> + set_ti_thread_flag(task_thread_info(next),
> + TIF_FOREIGN_FPSTATE);
> + }
> }
I'm still trying to get my head around why we have 3 different type of
checks for this (fpsimd_last_task, last_cpu and TIF). The code seems
correct but I wonder whether we can reduce this to 2 checks?
--
Catalin
next prev parent reply other threads:[~2014-02-21 17:48 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-05 17:13 [PATCH v2 0/4] kernel mode NEON optimizations Ard Biesheuvel
2014-02-05 17:13 ` [PATCH v2 1/4] arm64: add abstractions for FPSIMD state manipulation Ard Biesheuvel
2014-02-21 14:25 ` Catalin Marinas
2014-02-21 14:52 ` Ard Biesheuvel
2014-02-05 17:13 ` [PATCH v2 2/4] arm64: defer reloading a task's FPSIMD state to userland resume Ard Biesheuvel
2014-02-21 17:48 ` Catalin Marinas [this message]
2014-02-21 18:33 ` Ard Biesheuvel
2014-02-24 10:14 ` Catalin Marinas
2014-02-05 17:13 ` [PATCH v2 3/4] arm64: add support for kernel mode NEON in interrupt context Ard Biesheuvel
2014-02-05 17:13 ` [PATCH v2 4/4] arm64: add Crypto Extensions based synchronous core AES cipher Ard Biesheuvel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140221174855.GD25351@arm.com \
--to=catalin.marinas@arm.com \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.