From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-kernel@vger.kernel.org
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>,
Dave Martin <Dave.Martin@arm.com>,
Russell King - ARM Linux <linux@armlinux.org.uk>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Mark Rutland <mark.rutland@arm.com>,
linux-rt-users@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will.deacon@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Thomas Gleixner <tglx@linutronix.de>
Subject: [PATCH v4 19/20] crypto: arm64/crct10dif-ce - yield NEON after every block of input
Date: Tue, 26 Dec 2017 10:29:39 +0000
Message-ID: <20171226102940.26908-20-ard.biesheuvel@linaro.org>
In-Reply-To: <20171226102940.26908-1-ard.biesheuvel@linaro.org>

Avoid excessive scheduling delays under a preemptible kernel by
yielding the NEON after every block of input.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
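Editorial note, not part of the commit: the control flow this patch adds
can be modelled in plain C roughly as below. This is only a sketch under
stated assumptions. Every name in it (fold_state, need_yield, yield_unit,
load_constants, fold_block, process, BLOCK_SIZE) is hypothetical and does
not exist in the kernel, and the arithmetic in fold_block is a placeholder
rather than the CRC-T10DIF folding. What it is meant to show is the shape
of the change: the yield check only happens when more input remains, the
live accumulators (q0-q7 in the patch, 8 x 16 bytes = the 128 bytes
reserved by frame_push) are spilled before yielding and reloaded after,
and constants that are not preserved (q10 from rk3, and vzr) are set up
again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 64 /* hypothetical block size for this model */

/* Hypothetical stand-in for the live vector registers. */
struct fold_state {
	uint64_t acc[8];   /* models q0-q7, which carry folded data */
	uint64_t constant; /* models a constant register such as q10 */
};

static bool resched_pending; /* models "a reschedule is pending" */
static unsigned yield_count; /* counts how often we yielded */

static bool need_yield(void)
{
	return resched_pending;
}

/* Models giving up the unit: register contents are lost afterwards. */
static void yield_unit(struct fold_state *live)
{
	for (size_t i = 0; i < 8; i++)
		live->acc[i] = 0xdeadbeefu;
	live->constant = 0;
	resched_pending = false;
	yield_count++;
}

static void load_constants(struct fold_state *s)
{
	s->constant = 0x9e3779b97f4a7c15ull; /* arbitrary placeholder */
}

/* Placeholder mixing step, NOT the real CRC folding. */
static void fold_block(struct fold_state *s, const uint8_t *p)
{
	for (size_t i = 0; i < BLOCK_SIZE; i++)
		s->acc[i % 8] = s->acc[i % 8] * s->constant + p[i];
}

static uint64_t process(const uint8_t *buf, size_t len)
{
	struct fold_state live = { { 0 }, 0 };
	uint64_t result = 0;

	assert(len % BLOCK_SIZE == 0); /* the model handles whole blocks only */
	load_constants(&live);

	while (len >= BLOCK_SIZE) {
		fold_block(&live, buf);
		buf += BLOCK_SIZE;
		len -= BLOCK_SIZE;

		/* Only consider yielding if another block follows. */
		if (len >= BLOCK_SIZE && need_yield()) {
			uint64_t saved[8];

			memcpy(saved, live.acc, sizeof(saved)); /* spill */
			yield_unit(&live);                      /* yield */
			memcpy(live.acc, saved, sizeof(saved)); /* reload */
			load_constants(&live);                  /* re-init constants */
		}
	}

	for (size_t i = 0; i < 8; i++)
		result ^= live.acc[i];
	return result;
}
```

In this model a run that yields in the middle of the buffer produces the
same result as a run that never yields, which is the property the
save/restore sequence in the patch has to guarantee.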
arch/arm64/crypto/crct10dif-ce-core.S | 32 +++++++++++++++++---
1 file changed, 28 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/crypto/crct10dif-ce-core.S b/arch/arm64/crypto/crct10dif-ce-core.S
index d5b5a8c038c8..5bce7833ca5f 100644
--- a/arch/arm64/crypto/crct10dif-ce-core.S
+++ b/arch/arm64/crypto/crct10dif-ce-core.S
@@ -74,13 +74,19 @@
.text
.cpu generic+crypto
- arg1_low32 .req w0
- arg2 .req x1
- arg3 .req x2
+ arg1_low32 .req w19
+ arg2 .req x20
+ arg3 .req x21
vzr .req v13
ENTRY(crc_t10dif_pmull)
+ frame_push 3, 128
+
+ mov arg1_low32, w0
+ mov arg2, x1
+ mov arg3, x2
+
movi vzr.16b, #0 // init zero register
// adjust the 16-bit initial_crc value, scale it to 32 bits
@@ -175,8 +181,25 @@ CPU_LE( ext v12.16b, v12.16b, v12.16b, #8 )
subs arg3, arg3, #128
// check if there is another 64B in the buffer to be able to fold
- b.ge _fold_64_B_loop
+ b.lt _fold_64_B_end
+
+ if_will_cond_yield_neon
+ stp q0, q1, [sp, #.Lframe_local_offset]
+ stp q2, q3, [sp, #.Lframe_local_offset + 32]
+ stp q4, q5, [sp, #.Lframe_local_offset + 64]
+ stp q6, q7, [sp, #.Lframe_local_offset + 96]
+ do_cond_yield_neon
+ ldp q0, q1, [sp, #.Lframe_local_offset]
+ ldp q2, q3, [sp, #.Lframe_local_offset + 32]
+ ldp q4, q5, [sp, #.Lframe_local_offset + 64]
+ ldp q6, q7, [sp, #.Lframe_local_offset + 96]
+ ldr q10, rk3
+ movi vzr.16b, #0 // init zero register
+ endif_yield_neon
+
+ b _fold_64_B_loop
+_fold_64_B_end:
// at this point, the buffer pointer is pointing at the last y Bytes
// of the buffer the 64B of folded data is in 4 of the vector
// registers: v0, v1, v2, v3
@@ -304,6 +327,7 @@ _barrett:
_cleanup:
// scale the result back to 16 bits
lsr x0, x0, #16
+ frame_pop
ret
_less_than_128:
--
2.11.0

Thread overview (23+ messages):
2017-12-26 10:29 [PATCH v4 00/20] crypto: arm64 - play nice with CONFIG_PREEMPT Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 01/20] crypto: testmgr - add a new test case for CRC-T10DIF Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 02/20] crypto: arm64/aes-ce-ccm - move kernel mode neon en/disable into loop Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 03/20] crypto: arm64/aes-blk " Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 04/20] crypto: arm64/aes-bs " Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 05/20] crypto: arm64/chacha20 " Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 06/20] crypto: arm64/aes-blk - remove configurable interleave Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 07/20] crypto: arm64/aes-blk - add 4 way interleave to CBC encrypt path Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 08/20] crypto: arm64/aes-blk - add 4 way interleave to CBC-MAC " Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 09/20] crypto: arm64/sha256-neon - play nice with CONFIG_PREEMPT kernels Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 10/20] arm64: assembler: add utility macros to push/pop stack frames Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 11/20] arm64: assembler: add macros to conditionally yield the NEON under PREEMPT Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 12/20] crypto: arm64/sha1-ce - yield NEON after every block of input Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 13/20] crypto: arm64/sha2-ce " Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 14/20] crypto: arm64/aes-ccm " Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 15/20] crypto: arm64/aes-blk " Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 16/20] crypto: arm64/aes-bs " Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 17/20] crypto: arm64/aes-ghash " Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 18/20] crypto: arm64/crc32-ce " Ard Biesheuvel
2017-12-26 10:29 ` [PATCH v4 19/20] crypto: arm64/crct10dif-ce " Ard Biesheuvel [this message]
2017-12-26 10:29 ` [PATCH v4 20/20] DO NOT MERGE Ard Biesheuvel
2018-02-09 18:02 ` [PATCH v4 00/20] crypto: arm64 - play nice with CONFIG_PREEMPT Sebastian Andrzej Siewior
2018-02-09 19:28 ` Ard Biesheuvel