From: Eric Biggers <ebiggers@kernel.org>
To: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
Ard Biesheuvel <ardb@kernel.org>,
"Jason A . Donenfeld" <Jason@zx2c4.com>,
Eric Biggers <ebiggers@kernel.org>
Subject: [PATCH 7/8] crypto: x86/aes-gcm - optimize AVX512 precomputation of H^2 from H^1
Date: Wed, 1 Oct 2025 19:31:16 -0700
Message-ID: <20251002023117.37504-8-ebiggers@kernel.org>
In-Reply-To: <20251002023117.37504-1-ebiggers@kernel.org>

Squaring in GF(2^128) requires fewer instructions than a generic
multiplication in GF(2^128). Take advantage of this when computing H^2
from H^1 in aes_gcm_precompute_vaes_avx512().

Note that aes_gcm_precompute_vaes_avx2() already uses this optimization.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
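For illustration only, here is a minimal Python sketch of why squaring
is cheaper than a generic multiplication. It models only the unreduced
256-bit carryless product of a 128-bit value split into 64-bit halves
a_L and a_H. It does not model the reduction step or GHASH's
bit-reflected convention, and the helper names clmul, square_generic
and square_fast are hypothetical, not taken from the kernel source. In
GF(2)[x], addition is XOR, so the middle term a_L*a_H + a_H*a_L cancels
to zero and two of the four 64x64 multiplications can be skipped.

```python
MASK64 = (1 << 64) - 1


def clmul(a: int, b: int) -> int:
    """Carryless (polynomial over GF(2)) multiplication of two integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # addition in GF(2)[x] is XOR
        a <<= 1
        b >>= 1
    return result


def square_generic(a: int) -> int:
    """Unreduced square using the generic schoolbook split (4 multiplies)."""
    a_lo, a_hi = a & MASK64, a >> 64
    lo = clmul(a_lo, a_lo)
    mi = clmul(a_lo, a_hi) ^ clmul(a_hi, a_lo)  # always 0: x ^ x == 0
    hi = clmul(a_hi, a_hi)
    return lo ^ (mi << 64) ^ (hi << 128)


def square_fast(a: int) -> int:
    """Unreduced square that skips the middle term (2 multiplies)."""
    a_lo, a_hi = a & MASK64, a >> 64
    return clmul(a_lo, a_lo) ^ (clmul(a_hi, a_hi) << 128)
```

Both functions return the same 256-bit value for any 128-bit input. The
_ghash_square macro in this patch additionally folds LO into HI via two
reduction steps, which this sketch leaves out.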
arch/x86/crypto/aes-gcm-vaes-avx512.S | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/arch/x86/crypto/aes-gcm-vaes-avx512.S b/arch/x86/crypto/aes-gcm-vaes-avx512.S
index 3cf0945a25170..5c8301d275c66 100644
--- a/arch/x86/crypto/aes-gcm-vaes-avx512.S
+++ b/arch/x86/crypto/aes-gcm-vaes-avx512.S
@@ -258,10 +258,23 @@
vpclmulqdq $0x01, \mi, \gfpoly, \t0
vpshufd $0x4e, \mi, \mi
vpternlogd $0x96, \t0, \mi, \hi
.endm
+// This is a specialized version of _ghash_mul that computes \a * \a, i.e. it
+// squares \a. It skips computing MI = (a_L * a_H) + (a_H * a_L) = 0.
+.macro _ghash_square a, dst, gfpoly, t0, t1
+ vpclmulqdq $0x00, \a, \a, \t0 // LO = a_L * a_L
+ vpclmulqdq $0x11, \a, \a, \dst // HI = a_H * a_H
+ vpclmulqdq $0x01, \t0, \gfpoly, \t1 // LO_L*(x^63 + x^62 + x^57)
+ vpshufd $0x4e, \t0, \t0 // Swap halves of LO
+ vpxord \t0, \t1, \t1 // Fold LO into MI
+ vpclmulqdq $0x01, \t1, \gfpoly, \t0 // MI_L*(x^63 + x^62 + x^57)
+ vpshufd $0x4e, \t1, \t1 // Swap halves of MI
+ vpternlogd $0x96, \t0, \t1, \dst // Fold MI into HI
+.endm
+
// void aes_gcm_precompute_vaes_avx512(struct aes_gcm_key_vaes_avx512 *key);
//
// Given the expanded AES key |key->base.aes_key|, derive the GHASH subkey and
// initialize |key->h_powers| and |key->padding|.
SYM_FUNC_START(aes_gcm_precompute_vaes_avx512)
@@ -335,12 +348,11 @@ SYM_FUNC_START(aes_gcm_precompute_vaes_avx512)
// Note that as with H^1, all higher key powers also need an extra
// factor of x^-1 (or x using the natural interpretation). Nothing
// special needs to be done to make this happen, though: H^1 * H^1 would
// end up with two factors of x^-1, but the multiplication consumes one.
// So the product H^2 ends up with the desired one factor of x^-1.
- _ghash_mul H_CUR_XMM, H_CUR_XMM, H_INC_XMM, GFPOLY_XMM, \
- %xmm0, %xmm1, %xmm2
+ _ghash_square H_CUR_XMM, H_INC_XMM, GFPOLY_XMM, %xmm0, %xmm1
// Create H_CUR_YMM = [H^2, H^1] and H_INC_YMM = [H^2, H^2].
vinserti128 $1, H_CUR_XMM, H_INC_YMM, H_CUR_YMM
vinserti128 $1, H_INC_XMM, H_INC_YMM, H_INC_YMM
--
2.51.0