From: Eric Biggers <ebiggers@kernel.org>
To: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel <ardb@kernel.org>,
"Jason A . Donenfeld" <Jason@zx2c4.com>,
x86@kernel.org, linux-arm-kernel@lists.infradead.org,
Eric Biggers <ebiggers@kernel.org>
Subject: [PATCH 07/12] lib/crypto: x86/blake2s: Reduce size of BLAKE2S_SIGMA2
Date: Wed, 27 Aug 2025 08:11:26 -0700 [thread overview]
Message-ID: <20250827151131.27733-8-ebiggers@kernel.org> (raw)
In-Reply-To: <20250827151131.27733-1-ebiggers@kernel.org>
Save 480 bytes of .rodata by replacing the .long constants with .bytes,
and using the vpmovzxbd instruction to expand them.
Also update the code to do the loads before incrementing %rax rather
than after. This avoids the need for the first load to use an offset.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
lib/crypto/x86/blake2s-core.S | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/lib/crypto/x86/blake2s-core.S b/lib/crypto/x86/blake2s-core.S
index ac1c845445a4d..ef8e9f427aab3 100644
--- a/lib/crypto/x86/blake2s-core.S
+++ b/lib/crypto/x86/blake2s-core.S
@@ -27,23 +27,23 @@ SIGMA:
.byte 2, 6, 0, 8, 12, 10, 11, 3, 1, 4, 7, 15, 9, 13, 5, 14
.byte 12, 1, 14, 4, 5, 15, 13, 10, 8, 0, 6, 9, 11, 7, 3, 2
.byte 13, 7, 12, 3, 11, 14, 1, 9, 2, 5, 15, 8, 10, 0, 4, 6
.byte 6, 14, 11, 0, 15, 9, 3, 8, 10, 12, 13, 1, 5, 2, 7, 4
.byte 10, 8, 7, 1, 2, 4, 6, 5, 13, 15, 9, 3, 0, 11, 14, 12
-.section .rodata.cst64.BLAKE2S_SIGMA2, "aM", @progbits, 640
+.section .rodata.cst64.BLAKE2S_SIGMA2, "aM", @progbits, 160
.align 64
SIGMA2:
-.long 0, 2, 4, 6, 1, 3, 5, 7, 14, 8, 10, 12, 15, 9, 11, 13
-.long 8, 2, 13, 15, 10, 9, 12, 3, 6, 4, 0, 14, 5, 11, 1, 7
-.long 11, 13, 8, 6, 5, 10, 14, 3, 2, 4, 12, 15, 1, 0, 7, 9
-.long 11, 10, 7, 0, 8, 15, 1, 13, 3, 6, 2, 12, 4, 14, 9, 5
-.long 4, 10, 9, 14, 15, 0, 11, 8, 1, 7, 3, 13, 2, 5, 6, 12
-.long 2, 11, 4, 15, 14, 3, 10, 8, 13, 6, 5, 7, 0, 12, 1, 9
-.long 4, 8, 15, 9, 14, 11, 13, 5, 3, 2, 1, 12, 6, 10, 7, 0
-.long 6, 13, 0, 14, 12, 2, 1, 11, 15, 4, 5, 8, 7, 9, 3, 10
-.long 15, 5, 4, 13, 10, 7, 3, 11, 12, 2, 0, 6, 9, 8, 1, 14
-.long 8, 7, 14, 11, 13, 15, 0, 12, 10, 4, 5, 6, 3, 2, 1, 9
+.byte 0, 2, 4, 6, 1, 3, 5, 7, 14, 8, 10, 12, 15, 9, 11, 13
+.byte 8, 2, 13, 15, 10, 9, 12, 3, 6, 4, 0, 14, 5, 11, 1, 7
+.byte 11, 13, 8, 6, 5, 10, 14, 3, 2, 4, 12, 15, 1, 0, 7, 9
+.byte 11, 10, 7, 0, 8, 15, 1, 13, 3, 6, 2, 12, 4, 14, 9, 5
+.byte 4, 10, 9, 14, 15, 0, 11, 8, 1, 7, 3, 13, 2, 5, 6, 12
+.byte 2, 11, 4, 15, 14, 3, 10, 8, 13, 6, 5, 7, 0, 12, 1, 9
+.byte 4, 8, 15, 9, 14, 11, 13, 5, 3, 2, 1, 12, 6, 10, 7, 0
+.byte 6, 13, 0, 14, 12, 2, 1, 11, 15, 4, 5, 8, 7, 9, 3, 10
+.byte 15, 5, 4, 13, 10, 7, 3, 11, 12, 2, 0, 6, 9, 8, 1, 14
+.byte 8, 7, 14, 11, 13, 15, 0, 12, 10, 4, 5, 6, 3, 2, 1, 9
.text
SYM_FUNC_START(blake2s_compress_ssse3)
testq %rdx,%rdx
je .Lendofloop
@@ -191,13 +191,13 @@ SYM_FUNC_START(blake2s_compress_avx512)
vmovdqu 0x20(%rsi),%ymm7
addq $0x40,%rsi
leaq SIGMA2(%rip),%rax
movb $0xa,%cl
.Lblake2s_compress_avx512_roundloop:
- addq $0x40,%rax
- vmovdqa -0x40(%rax),%ymm8
- vmovdqa -0x20(%rax),%ymm9
+ vpmovzxbd (%rax),%ymm8
+ vpmovzxbd 0x8(%rax),%ymm9
+ addq $0x10,%rax
vpermi2d %ymm7,%ymm6,%ymm8
vpermi2d %ymm7,%ymm6,%ymm9
vmovdqa %ymm8,%ymm6
vmovdqa %ymm9,%ymm7
vpaddd %xmm8,%xmm0,%xmm0
--
2.50.1
next prev parent reply other threads:[~2025-08-27 15:14 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-27 15:11 [PATCH 00/12] ChaCha and BLAKE2s cleanups Eric Biggers
2025-08-27 15:11 ` [PATCH 01/12] arm: configs: Remove obsolete assignments to CRYPTO_CHACHA20_NEON Eric Biggers
2025-08-27 15:11 ` [PATCH 02/12] crypto: chacha - register only "-lib" drivers Eric Biggers
2025-08-27 15:11 ` [PATCH 03/12] lib/crypto: chacha: Remove unused function chacha_is_arch_optimized() Eric Biggers
2025-08-27 15:11 ` [PATCH 04/12] lib/crypto: chacha: Rename chacha.c to chacha-block-generic.c Eric Biggers
2025-08-27 15:11 ` [PATCH 05/12] lib/crypto: chacha: Rename libchacha.c to chacha.c Eric Biggers
2025-08-27 15:11 ` [PATCH 06/12] lib/crypto: chacha: Consolidate into single module Eric Biggers
2025-08-27 15:11 ` Eric Biggers [this message]
2025-08-27 15:11 ` [PATCH 08/12] lib/crypto: blake2s: Remove obsolete self-test Eric Biggers
2025-08-27 15:11 ` [PATCH 09/12] lib/crypto: blake2s: Always enable arch-optimized BLAKE2s code Eric Biggers
2025-08-29 13:08 ` Honza Fikar
2025-08-29 15:29 ` Eric Biggers
2025-08-29 16:05 ` Ard Biesheuvel
2025-08-29 16:10 ` Eric Biggers
2025-08-27 15:11 ` [PATCH 10/12] lib/crypto: blake2s: Move generic code into blake2s.c Eric Biggers
2025-08-27 15:11 ` [PATCH 11/12] lib/crypto: blake2s: Consolidate into single C translation unit Eric Biggers
2025-08-27 15:11 ` [PATCH 12/12] lib/crypto: tests: Add KUnit tests for BLAKE2s Eric Biggers
2025-08-29 16:37 ` [PATCH 00/12] ChaCha and BLAKE2s cleanups Ard Biesheuvel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250827151131.27733-8-ebiggers@kernel.org \
--to=ebiggers@kernel.org \
--cc=Jason@zx2c4.com \
--cc=ardb@kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-crypto@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).