* [PATCH resend v2 0/2] preparatory arm64 asm patches for yielding the NEON
@ 2018-03-29 13:13 Ard Biesheuvel
From: Ard Biesheuvel @ 2018-03-29 13:13 UTC
  To: linux-arm-kernel

The RT people reported that the arm64 crypto NEON code behaves poorly in RT
context, because it disables preemption (to avoid having to context switch
the NEON registers) and usually processes the entire input in one go. When we
introduced this code, this was not unreasonable given the overhead of eager
preserve/restore, but today, there isn't that much overhead anymore, and so
we can consider approaches that have much better worst-case scheduling latency.

Simply refactoring the code to only call into the core NEON transform one
block at a time results in a non-negligible performance impact, especially
on low end cores such as Cortex-A53 where memory accesses are relatively
costly. So instead, let's introduce some infrastructure to allow assembler
routines to do a conditional yield, i.e., check the TIF_NEED_RESCHED flag
after processing each block of input, and yield if it is set, in which case
some context may need to be preserved and restored, and/or constant tables
reloaded.
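
As a rough userspace illustration of the conditional yield idea (not the
actual assembler macros from this series), the C sketch below processes the
whole input in one call but checks a flag after every block. All names in it
(need_resched_flag, yield_neon(), transform_block(), the toy XOR "transform")
are hypothetical stand-ins invented for the example; the real code tests
TIF_NEED_RESCHED and has to give up and reclaim the NEON around the yield.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 16

/* Hypothetical stand-in for the TIF_NEED_RESCHED thread flag. */
static volatile bool need_resched_flag;
static unsigned int yield_count;

/*
 * Hypothetical stand-in for "release the NEON, reschedule, reclaim the
 * NEON". Here it only clears the flag and counts the call.
 */
static void yield_neon(void)
{
	need_resched_flag = false;
	yield_count++;
}

/* Toy transform: XOR every byte of one block into a running checksum. */
static uint8_t transform_block(uint8_t state, const uint8_t *block)
{
	for (size_t i = 0; i < BLOCK_SIZE; i++)
		state ^= block[i];
	return state;
}

/* Handle all blocks in a single call, checking the flag after each one. */
static uint8_t transform_cond_yield(const uint8_t *in, size_t len)
{
	uint8_t state = 0;

	assert(len % BLOCK_SIZE == 0);
	for (size_t off = 0; off < len; off += BLOCK_SIZE) {
		state = transform_block(state, in + off);
		if (need_resched_flag)
			yield_neon();
	}
	return state;
}
```

In the common case where the flag is clear, the only cost over the original
loop is the per-block flag test, which is the point of doing the check inside
the routine rather than calling it once per block.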

Changes since v1:
- incorporate Dave's review feedback and add his Reviewed-bys
  . enhance non-nesting check in frame_push/_pop (#1)
  . describe cond_yield_neon convenience macro (#2)
  . discard yield sequence if CONFIG_PREEMPT=n (#2)
  . add missing include of linux/preempt.h (#2)

Patch #1 adds helper macros to create standard AAPCS stack frames. This is
needed because the assembler code will be modified to call into schedule()
[essentially], and so a stack frame is needed to preserve state.

Patch #2 adds helper macros to create the yielding code: check whether a
yield should be done, and preserve/restore the algorithm specific pieces
that will not be preserved across the yield in the NEON registers.
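
To make the preserve/restore part concrete, here is a minimal C model of it,
under the assumption that anything held in the NEON registers is lost across
a yield. The "registers" are plain globals that the hypothetical
sketch_yield() deliberately clobbers; struct sketch_ctx, round_constants and
the toy arithmetic are likewise made up for illustration and do not
correspond to any particular crypto routine or to the macro names in patch #2.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of NEON registers: contents do not survive a yield. */
static uint32_t vreg_state;    /* running algorithm state */
static uint32_t vreg_table[4]; /* constant table kept in registers */

/* Hypothetical constant table that lives in memory. */
static const uint32_t round_constants[4] = { 0x11, 0x22, 0x33, 0x44 };

/* Hypothetical per-request context in memory. */
struct sketch_ctx {
	uint32_t state;
};

static volatile bool resched_requested;
static unsigned int yields_taken;

/* Stand-in for the actual yield: trashes the modelled registers. */
static void sketch_yield(void)
{
	vreg_state = 0xdeadbeef;
	memset(vreg_table, 0xff, sizeof(vreg_table));
	resched_requested = false;
	yields_taken++;
}

static void load_table(void)
{
	memcpy(vreg_table, round_constants, sizeof(vreg_table));
}

static uint32_t sketch_transform(struct sketch_ctx *ctx,
				 const uint32_t *in, size_t nwords)
{
	load_table();
	vreg_state = ctx->state;

	for (size_t i = 0; i < nwords; i++) {
		vreg_state = (vreg_state + in[i]) ^ vreg_table[i % 4];

		if (resched_requested) {
			ctx->state = vreg_state; /* before: spill state */
			sketch_yield();
			vreg_state = ctx->state; /* after: reload state */
			load_table();            /* and the constants */
		}
	}

	ctx->state = vreg_state;
	return ctx->state;
}
```

The spill and reload only run on the slow path where a yield is actually
taken, so a run that is never asked to reschedule and a run that yields
midway should produce the same result.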

These patches have been broken out from the arm64/crypto series and resent
since they require careful review from the arm64 maintainers, rather than
being pulled silently via the crypto tree (which already happened by accident
and got reverted).

Ard Biesheuvel (2):
  arm64: assembler: add utility macros to push/pop stack frames
  arm64: assembler: add macros to conditionally yield the NEON under
    PREEMPT

 arch/arm64/include/asm/assembler.h | 136 ++++++++++++++++++++
 arch/arm64/kernel/asm-offsets.c    |   3 +
 2 files changed, 139 insertions(+)

-- 
2.11.0
