public inbox for linux-crypto@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/3] lib/crypto: x86/sha: Add PHE Extensions support
@ 2026-01-16  7:15 AlanSong-oc
  2026-01-16  7:15 ` [PATCH v3 1/3] crypto: padlock-sha - Disable for Zhaoxin processor AlanSong-oc
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: AlanSong-oc @ 2026-01-16  7:15 UTC (permalink / raw)
  To: herbert, davem, ebiggers, Jason, ardb, linux-crypto, linux-kernel,
	x86
  Cc: CobeChen, TonyWWang-oc, YunShen, GeorgeXue, LeoLiu-oc, HansHu,
	AlanSong-oc

This series adds PHE Extensions optimized SHA-1 and SHA-256 transform
functions for Zhaoxin processors to lib/crypto, and disables the
padlock-sha driver on Zhaoxin platforms due to self-test failures.

The table below shows the benchmark results before and after this patch
series, measured with CRYPTO_LIB_BENCHMARK on a Zhaoxin KX-7000
platform; the achieved speedups are shown in parentheses.

+---------+-------------------------+--------------------------+
|         |         SHA1            |          SHA256          |
+---------+--------+----------------+--------+-----------------+
|   Len   | Before |      After     | Before |      After      |
+---------+--------+----------------+--------+-----------------+
|      1* |    3** |    8 (2.67x)   |    2   |    7 (3.50x)    |
|     16  |   52   |  125 (2.40x)   |   35   |  119 (3.40x)    |
|     64  |  114   |  318 (2.79x)   |   74   |  280 (3.78x)    |
|    127  |  154   |  440 (2.86x)   |   99   |  387 (3.91x)    |
|    128  |  160   |  492 (3.08x)   |  103   |  427 (4.15x)    |
|    200  |  189   |  605 (3.20x)   |  123   |  537 (4.37x)    |
|    256  |  199   |  676 (3.40x)   |  128   |  582 (4.55x)    |
|    511  |  223   |  794 (3.56x)   |  144   |  679 (4.72x)    |
|    512  |  225   |  833 (3.70x)   |  146   |  714 (4.89x)    |
|   1024  |  243   |  941 (3.87x)   |  157   |  796 (5.07x)    |
|   3173  |  259   | 1044 (4.03x)   |  167   |  883 (5.28x)    |
|   4096  |  257   | 1044 (4.06x)   |  166   |  876 (5.28x)    |
|  16384  |  261   | 1073 (4.11x)   |  169   |  899 (5.32x)    |
+---------+--------+----------------+--------+-----------------+
*: Length in bytes of each data block processed by one complete SHA
   sequence.
**: Throughput of processing the data blocks, in MB/s.
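These numbers were collected with the lib/crypto benchmark enabled in
the kernel config. A minimal fragment might look like this (the KUnit
suite option names are assumptions; only CRYPTO_LIB_BENCHMARK is named
above):

    CONFIG_CRYPTO_LIB_BENCHMARK=y
    CONFIG_CRYPTO_LIB_SHA1_KUNIT_TEST=y
    CONFIG_CRYPTO_LIB_SHA256_KUNIT_TEST=y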

After applying this patch series, the KUnit test suites for SHA1 and
SHA256 pass successfully on Zhaoxin platforms. The following shows the
detailed test logs:

[    5.993700]     # Subtest: sha1
[    5.996813]     # module: sha1_kunit
[    5.996814]     1..11
[    6.003399]     ok 1 test_hash_test_vectors
[    6.012489]     ok 2 test_hash_all_lens_up_to_4096
[    6.028511]     ok 3 test_hash_incremental_updates
[    6.035766]     ok 4 test_hash_buffer_overruns
[    6.043445]     ok 5 test_hash_overlaps
[    6.050315]     ok 6 test_hash_alignment_consistency
[    6.054994]     ok 7 test_hash_ctx_zeroization
[    6.127778]     ok 8 test_hash_interrupt_context_1
[    6.774847]     ok 9 test_hash_interrupt_context_2
[    6.810745]     ok 10 test_hmac
[    6.835169]     # benchmark_hash: len=1: 8 MB/s
[    6.847167]     # benchmark_hash: len=16: 125 MB/s
[    6.862114]     # benchmark_hash: len=64: 318 MB/s
[    6.878173]     # benchmark_hash: len=127: 440 MB/s
[    6.893081]     # benchmark_hash: len=128: 492 MB/s
[    6.907976]     # benchmark_hash: len=200: 605 MB/s
[    6.922658]     # benchmark_hash: len=256: 676 MB/s
[    6.937558]     # benchmark_hash: len=511: 794 MB/s
[    6.951994]     # benchmark_hash: len=512: 833 MB/s
[    6.966262]     # benchmark_hash: len=1024: 941 MB/s
[    6.980295]     # benchmark_hash: len=3173: 1044 MB/s
[    6.994494]     # benchmark_hash: len=4096: 1044 MB/s
[    7.008728]     # benchmark_hash: len=16384: 1073 MB/s
[    7.014515]     ok 11 benchmark_hash
[    7.019628] # sha1: pass:11 fail:0 skip:0 total:11
[    7.023170] # Totals: pass:11 fail:0 skip:0 total:11
[    7.027916] ok 5 sha1

[    7.767257]     # Subtest: sha256
[    7.770542]     # module: sha256_kunit
[    7.770544]     1..15
[    7.777383]     ok 1 test_hash_test_vectors
[    7.788563]     ok 2 test_hash_all_lens_up_to_4096
[    7.806090]     ok 3 test_hash_incremental_updates
[    7.813553]     ok 4 test_hash_buffer_overruns
[    7.822384]     ok 5 test_hash_overlaps
[    7.829388]     ok 6 test_hash_alignment_consistency
[    7.833843]     ok 7 test_hash_ctx_zeroization
[    7.915191]     ok 8 test_hash_interrupt_context_1
[    8.362312]     ok 9 test_hash_interrupt_context_2
[    8.401607]     ok 10 test_hmac
[    8.415458]     ok 11 test_sha256_finup_2x
[    8.419397]     ok 12 test_sha256_finup_2x_defaultctx
[    8.424107]     ok 13 test_sha256_finup_2x_hugelen
[    8.451289]     # benchmark_hash: len=1: 7 MB/s
[    8.465372]     # benchmark_hash: len=16: 119 MB/s
[    8.481760]     # benchmark_hash: len=64: 280 MB/s
[    8.499344]     # benchmark_hash: len=127: 387 MB/s
[    8.515800]     # benchmark_hash: len=128: 427 MB/s
[    8.531970]     # benchmark_hash: len=200: 537 MB/s
[    8.548241]     # benchmark_hash: len=256: 582 MB/s
[    8.564838]     # benchmark_hash: len=511: 679 MB/s
[    8.580872]     # benchmark_hash: len=512: 714 MB/s
[    8.596858]     # benchmark_hash: len=1024: 796 MB/s
[    8.612567]     # benchmark_hash: len=3173: 883 MB/s
[    8.628546]     # benchmark_hash: len=4096: 876 MB/s
[    8.644482]     # benchmark_hash: len=16384: 899 MB/s
[    8.649773]     ok 14 benchmark_hash
[    8.655505]     ok 15 benchmark_sha256_finup_2x # SKIP not relevant
[    8.659065] # sha256: pass:14 fail:0 skip:1 total:15
[    8.665276] # Totals: pass:14 fail:0 skip:1 total:15
[    8.670195] ok 7 sha256

Changes in v3:
- Implement PHE Extensions optimized SHA1 and SHA256 transform functions
  using inline assembly instead of separate assembly files
- Eliminate unnecessary casts
- Add CONFIG_CPU_SUP_ZHAOXIN check to compile out the code when disabled
- Use 'boot_cpu_data.x86' to identify the CPU family instead of
  'cpu_data(0).x86'
- Only check X86_FEATURE_PHE_EN for CPU support, consistent with other
  CPU feature checks.
- Disable the padlock-sha driver on Zhaoxin processors with CPU family
  0x07 and newer.

Changes in v2:
- Add Zhaoxin support to lib/crypto instead of extending the existing
  padlock-sha driver

AlanSong-oc (3):
  crypto: padlock-sha - Disable for Zhaoxin processor
  lib/crypto: x86/sha1: PHE Extensions optimized SHA1 transform function
  lib/crypto: x86/sha256: PHE Extensions optimized SHA256 transform
    function

 drivers/crypto/padlock-sha.c |  7 +++++++
 lib/crypto/x86/sha1.h        | 25 +++++++++++++++++++++++++
 lib/crypto/x86/sha256.h      | 25 +++++++++++++++++++++++++
 3 files changed, 57 insertions(+)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3 1/3] crypto: padlock-sha - Disable for Zhaoxin processor
  2026-01-16  7:15 [PATCH v3 0/3] lib/crypto: x86/sha: Add PHE Extensions support AlanSong-oc
@ 2026-01-16  7:15 ` AlanSong-oc
  2026-01-18  0:09   ` Eric Biggers
  2026-01-16  7:15 ` [PATCH v3 2/3] lib/crypto: x86/sha1: PHE Extensions optimized SHA1 transform function AlanSong-oc
  2026-01-16  7:15 ` [PATCH v3 3/3] lib/crypto: x86/sha256: PHE Extensions optimized SHA256 " AlanSong-oc
  2 siblings, 1 reply; 13+ messages in thread
From: AlanSong-oc @ 2026-01-16  7:15 UTC (permalink / raw)
  To: herbert, davem, ebiggers, Jason, ardb, linux-crypto, linux-kernel,
	x86
  Cc: CobeChen, TonyWWang-oc, YunShen, GeorgeXue, LeoLiu-oc, HansHu,
	AlanSong-oc

On Zhaoxin processors, the XSHA1 instruction requires that the buffer
pointed to by the %rdi register be 32 bytes in size, and both the XSHA1
and XSHA256 instructions perform no operation when %ecx is zero.

The current padlock-sha driver does not satisfy these requirements, so
it does not work correctly on Zhaoxin processors: it fails the
self-tests and is therefore never activated there. This issue has been
reported in Debian [1]. The self-tests fail with the following
messages [2]:

alg: shash: sha1-padlock-nano test failed (wrong result) on test vector 0, cfg="init+update+final aligned buffer"
alg: self-tests for sha1 using sha1-padlock-nano failed (rc=-22)
------------[ cut here ]------------

alg: shash: sha256-padlock-nano test failed (wrong result) on test vector 0, cfg="init+update+final aligned buffer"
alg: self-tests for sha256 using sha256-padlock-nano failed (rc=-22)
------------[ cut here ]------------

Disable the padlock-sha driver on Zhaoxin processors with CPU family
0x07 and newer. Add PHE Extensions support for SHA-1 and SHA-256 to
lib/crypto, following the suggestion in [3].

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1103397
[2] https://linux-hardware.org/?probe=271fabb7a4&log=dmesg
[3] https://lore.kernel.org/linux-crypto/aUI4CGp6kK7mxgEr@gondor.apana.org.au/

Signed-off-by: AlanSong-oc <AlanSong-oc@zhaoxin.com>
---
 drivers/crypto/padlock-sha.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/crypto/padlock-sha.c b/drivers/crypto/padlock-sha.c
index 329f60ad4..9214bbfc8 100644
--- a/drivers/crypto/padlock-sha.c
+++ b/drivers/crypto/padlock-sha.c
@@ -332,6 +332,13 @@ static int __init padlock_init(void)
 	if (!x86_match_cpu(padlock_sha_ids) || !boot_cpu_has(X86_FEATURE_PHE_EN))
 		return -ENODEV;
 
+	/*
+	 * Skip family 0x07 and newer used by Zhaoxin processors,
+	 * as the driver's self-tests fail on these CPUs.
+	 */
+	if (c->x86 >= 0x07)
+		return -ENODEV;
+
 	/* Register the newly added algorithm module if on *
 	* VIA Nano processor, or else just do as before */
 	if (c->x86_model < 0x0f) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 2/3] lib/crypto: x86/sha1: PHE Extensions optimized SHA1 transform function
  2026-01-16  7:15 [PATCH v3 0/3] lib/crypto: x86/sha: Add PHE Extensions support AlanSong-oc
  2026-01-16  7:15 ` [PATCH v3 1/3] crypto: padlock-sha - Disable for Zhaoxin processor AlanSong-oc
@ 2026-01-16  7:15 ` AlanSong-oc
  2026-01-18  0:31   ` Eric Biggers
  2026-01-16  7:15 ` [PATCH v3 3/3] lib/crypto: x86/sha256: PHE Extensions optimized SHA256 " AlanSong-oc
  2 siblings, 1 reply; 13+ messages in thread
From: AlanSong-oc @ 2026-01-16  7:15 UTC (permalink / raw)
  To: herbert, davem, ebiggers, Jason, ardb, linux-crypto, linux-kernel,
	x86
  Cc: CobeChen, TonyWWang-oc, YunShen, GeorgeXue, LeoLiu-oc, HansHu,
	AlanSong-oc

Zhaoxin CPUs implement the SHA (Secure Hash Algorithm) in hardware
through the PHE (Padlock Hash Engine) Extensions, which provide the
XSHA1, XSHA256, XSHA384 and XSHA512 instructions.

Implementing SHA in hardware rather than software lets applications
achieve higher performance without giving up security or flexibility.

Add an XSHA1-based implementation of the SHA-1 transform function.

Signed-off-by: AlanSong-oc <AlanSong-oc@zhaoxin.com>
---
 lib/crypto/x86/sha1.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/lib/crypto/x86/sha1.h b/lib/crypto/x86/sha1.h
index c48a0131f..d4946270a 100644
--- a/lib/crypto/x86/sha1.h
+++ b/lib/crypto/x86/sha1.h
@@ -48,6 +48,26 @@ static void sha1_blocks_avx2(struct sha1_block_state *state,
 	}
 }
 
+#if IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN)
+#define PHE_ALIGNMENT 16
+static void sha1_blocks_phe(struct sha1_block_state *state,
+			     const u8 *data, size_t nblocks)
+{
+	/*
+	 * XSHA1 requires %edi to point to a 32-byte, 16-byte-aligned
+	 * buffer on Zhaoxin processors.
+	 */
+	u8 buf[32 + PHE_ALIGNMENT - 1];
+	u8 *dst = PTR_ALIGN(&buf[0], PHE_ALIGNMENT);
+
+	memcpy(dst, state, SHA1_DIGEST_SIZE);
+	asm volatile(".byte 0xf3,0x0f,0xa6,0xc8"
+		     : "+S"(data), "+D"(dst)
+		     : "a"((long)-1), "c"(nblocks));
+	memcpy(state, dst, SHA1_DIGEST_SIZE);
+}
+#endif /* CONFIG_CPU_SUP_ZHAOXIN */
+
 static void sha1_blocks(struct sha1_block_state *state,
 			const u8 *data, size_t nblocks)
 {
@@ -59,6 +79,11 @@ static void sha1_mod_init_arch(void)
 {
 	if (boot_cpu_has(X86_FEATURE_SHA_NI)) {
 		static_call_update(sha1_blocks_x86, sha1_blocks_ni);
+#if IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN)
+	} else if (boot_cpu_has(X86_FEATURE_PHE_EN)) {
+		if (boot_cpu_data.x86 >= 0x07)
+			static_call_update(sha1_blocks_x86, sha1_blocks_phe);
+#endif
 	} else if (cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
 				     NULL) &&
 		   boot_cpu_has(X86_FEATURE_AVX)) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 3/3] lib/crypto: x86/sha256: PHE Extensions optimized SHA256 transform function
  2026-01-16  7:15 [PATCH v3 0/3] lib/crypto: x86/sha: Add PHE Extensions support AlanSong-oc
  2026-01-16  7:15 ` [PATCH v3 1/3] crypto: padlock-sha - Disable for Zhaoxin processor AlanSong-oc
  2026-01-16  7:15 ` [PATCH v3 2/3] lib/crypto: x86/sha1: PHE Extensions optimized SHA1 transform function AlanSong-oc
@ 2026-01-16  7:15 ` AlanSong-oc
  2 siblings, 0 replies; 13+ messages in thread
From: AlanSong-oc @ 2026-01-16  7:15 UTC (permalink / raw)
  To: herbert, davem, ebiggers, Jason, ardb, linux-crypto, linux-kernel,
	x86
  Cc: CobeChen, TonyWWang-oc, YunShen, GeorgeXue, LeoLiu-oc, HansHu,
	AlanSong-oc

Zhaoxin CPUs implement the SHA (Secure Hash Algorithm) in hardware
through the PHE (Padlock Hash Engine) Extensions, which provide the
XSHA1, XSHA256, XSHA384 and XSHA512 instructions.

Implementing SHA in hardware rather than software lets applications
achieve higher performance without giving up security or flexibility.

Add an XSHA256-based implementation of the SHA-256 transform function.

Signed-off-by: AlanSong-oc <AlanSong-oc@zhaoxin.com>
---
 lib/crypto/x86/sha256.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/lib/crypto/x86/sha256.h b/lib/crypto/x86/sha256.h
index 38e33b22a..70b68bad0 100644
--- a/lib/crypto/x86/sha256.h
+++ b/lib/crypto/x86/sha256.h
@@ -31,6 +31,26 @@ DEFINE_X86_SHA256_FN(sha256_blocks_avx, sha256_transform_avx);
 DEFINE_X86_SHA256_FN(sha256_blocks_avx2, sha256_transform_rorx);
 DEFINE_X86_SHA256_FN(sha256_blocks_ni, sha256_ni_transform);
 
+#if IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN)
+#define PHE_ALIGNMENT 16
+static void sha256_blocks_phe(struct sha256_block_state *state,
+			     const u8 *data, size_t nblocks)
+{
+	/*
+	 * XSHA256 requires %edi to point to a 32-byte, 16-byte-aligned
+	 * buffer on Zhaoxin processors.
+	 */
+	u8 buf[32 + PHE_ALIGNMENT - 1];
+	u8 *dst = PTR_ALIGN(&buf[0], PHE_ALIGNMENT);
+
+	memcpy(dst, state, SHA256_DIGEST_SIZE);
+	asm volatile(".byte 0xf3,0x0f,0xa6,0xd0"
+			: "+S"(data), "+D"(dst)
+			: "a"((long)-1), "c"(nblocks));
+	memcpy(state, dst, SHA256_DIGEST_SIZE);
+}
+#endif /* CONFIG_CPU_SUP_ZHAOXIN */
+
 static void sha256_blocks(struct sha256_block_state *state,
 			  const u8 *data, size_t nblocks)
 {
@@ -79,6 +99,11 @@ static void sha256_mod_init_arch(void)
 	if (boot_cpu_has(X86_FEATURE_SHA_NI)) {
 		static_call_update(sha256_blocks_x86, sha256_blocks_ni);
 		static_branch_enable(&have_sha_ni);
+#if IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN)
+	} else if (boot_cpu_has(X86_FEATURE_PHE_EN)) {
+		if (boot_cpu_data.x86 >= 0x07)
+			static_call_update(sha256_blocks_x86, sha256_blocks_phe);
+#endif
 	} else if (cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
 				     NULL) &&
 		   boot_cpu_has(X86_FEATURE_AVX)) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/3] crypto: padlock-sha - Disable for Zhaoxin processor
  2026-01-16  7:15 ` [PATCH v3 1/3] crypto: padlock-sha - Disable for Zhaoxin processor AlanSong-oc
@ 2026-01-18  0:09   ` Eric Biggers
  2026-03-05  1:36     ` AlanSong-oc
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Biggers @ 2026-01-18  0:09 UTC (permalink / raw)
  To: AlanSong-oc
  Cc: herbert, davem, Jason, ardb, linux-crypto, linux-kernel, x86,
	CobeChen, TonyWWang-oc, YunShen, GeorgeXue, LeoLiu-oc, HansHu

On Fri, Jan 16, 2026 at 03:15:11PM +0800, AlanSong-oc wrote:
> 
> Signed-off-by: AlanSong-oc <AlanSong-oc@zhaoxin.com>

Since this is a bug fix, please add Fixes and Cc stable tags.

- Eric

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 2/3] lib/crypto: x86/sha1: PHE Extensions optimized SHA1 transform function
  2026-01-16  7:15 ` [PATCH v3 2/3] lib/crypto: x86/sha1: PHE Extensions optimized SHA1 transform function AlanSong-oc
@ 2026-01-18  0:31   ` Eric Biggers
  2026-03-05  1:37     ` AlanSong-oc
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Biggers @ 2026-01-18  0:31 UTC (permalink / raw)
  To: AlanSong-oc
  Cc: herbert, davem, Jason, ardb, linux-crypto, linux-kernel, x86,
	CobeChen, TonyWWang-oc, YunShen, GeorgeXue, LeoLiu-oc, HansHu

On Fri, Jan 16, 2026 at 03:15:12PM +0800, AlanSong-oc wrote:
> Zhaoxin CPUs implement the SHA (Secure Hash Algorithm) in hardware
> through the PHE (Padlock Hash Engine) Extensions, which provide the
> XSHA1, XSHA256, XSHA384 and XSHA512 instructions.
> 
> Implementing SHA in hardware rather than software lets applications
> achieve higher performance without giving up security or flexibility.
> 
> Add an XSHA1-based implementation of the SHA-1 transform function.
> 
> Signed-off-by: AlanSong-oc <AlanSong-oc@zhaoxin.com>

Please include the information I've asked for (benchmark results, test
results, and link to the specification) directly in the commit message.

> +#if IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN)
> +#define PHE_ALIGNMENT 16
> +static void sha1_blocks_phe(struct sha1_block_state *state,
> +			     const u8 *data, size_t nblocks)

The IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN) should go in the CPU feature
check, so that the code will be parsed regardless of the setting.  That
reduces the chance that future changes will cause compilation errors.

> +	/*
> +	 * XSHA1 requires %edi to point to a 32-byte, 16-byte-aligned
> +	 * buffer on Zhaoxin processors.
> +	 */

This seems implausible.  In 64-bit mode a pointer can't fit in %edi.  I
thought you mentioned that this instruction is 64-bit compatible?  You
may have meant %rdi.

Interestingly, the spec you provided specifically says the registers
operated on are %eax, %ecx, %esi, and %edi.

So assuming the code works, perhaps both the spec and your code comment
are incorrect?

These errors don't really inspire confidence in this instruction.

> +	memcpy(dst, state, SHA1_DIGEST_SIZE);
> +	asm volatile(".byte 0xf3,0x0f,0xa6,0xc8"
> +		     : "+S"(data), "+D"(dst)
> +		     : "a"((long)-1), "c"(nblocks));
> +	memcpy(state, dst, SHA1_DIGEST_SIZE);

Is the reason for using '.byte' that the GNU and Clang assemblers don't
implement the mnemonic for this Zhaoxin-specific instruction?  The spec
implies that the intended mnemonic is "rep sha1".

If that's correct, could you add a comment like /* rep sha1 */ so that
it's clear what the intended instruction is?

Also, the spec describes all four registers as both input and output
registers.  Yet your inline asm marks %rax and %rcx as inputs only.

> @@ -59,6 +79,11 @@ static void sha1_mod_init_arch(void)
>  {
>  	if (boot_cpu_has(X86_FEATURE_SHA_NI)) {
>  		static_call_update(sha1_blocks_x86, sha1_blocks_ni);
> +#if IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN)
> +	} else if (boot_cpu_has(X86_FEATURE_PHE_EN)) {
> +		if (boot_cpu_data.x86 >= 0x07)
> +			static_call_update(sha1_blocks_x86, sha1_blocks_phe);
> +#endif

I think it should be:

	} else if (IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN) &&
		   boot_cpu_has(X86_FEATURE_PHE_EN) &&
		   boot_cpu_data.x86 >= 0x07) {
			static_call_update(sha1_blocks_x86, sha1_blocks_phe);

... so (a) the code will be parsed even when !CONFIG_CPU_SUP_ZHAOXIN,
and (b) functions won't be unnecessarily disabled when
boot_cpu_has(X86_FEATURE_PHE_EN) && boot_cpu_data.x86 < 0x07).

As before, all these comments apply to the SHA-256 patch too.

- Eric

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/3] crypto: padlock-sha - Disable for Zhaoxin processor
  2026-01-18  0:09   ` Eric Biggers
@ 2026-03-05  1:36     ` AlanSong-oc
  0 siblings, 0 replies; 13+ messages in thread
From: AlanSong-oc @ 2026-03-05  1:36 UTC (permalink / raw)
  To: Eric Biggers
  Cc: herbert, davem, Jason, ardb, linux-crypto, linux-kernel, x86,
	CobeChen, TonyWWang-oc, YunShen, GeorgeXue, LeoLiu-oc, HansHu

On 1/18/2026 8:09 AM, Eric Biggers wrote:
> 
> On Fri, Jan 16, 2026 at 03:15:11PM +0800, AlanSong-oc wrote:
>>
>> Signed-off-by: AlanSong-oc <AlanSong-oc@zhaoxin.com>
> 
> Since this is a bug fix, please add Fixes and Cc stable tags.
> 

I will add Fixes and CC stable tags in the next version of the patch.

Please accept my apologies for the delayed response due to
administrative procedures and the recent holidays. Thank you for your
review and valuable suggestions.

Best Regards
AlanSong-oc

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 2/3] lib/crypto: x86/sha1: PHE Extensions optimized SHA1 transform function
  2026-01-18  0:31   ` Eric Biggers
@ 2026-03-05  1:37     ` AlanSong-oc
  2026-03-05 19:18       ` Eric Biggers
  0 siblings, 1 reply; 13+ messages in thread
From: AlanSong-oc @ 2026-03-05  1:37 UTC (permalink / raw)
  To: Eric Biggers
  Cc: herbert, davem, Jason, ardb, linux-crypto, linux-kernel, x86,
	CobeChen, TonyWWang-oc, YunShen, GeorgeXue, LeoLiu-oc, HansHu

On 1/18/2026 8:31 AM, Eric Biggers wrote:
> 
> On Fri, Jan 16, 2026 at 03:15:12PM +0800, AlanSong-oc wrote:
>> Zhaoxin CPUs implement the SHA (Secure Hash Algorithm) in hardware
>> through the PHE (Padlock Hash Engine) Extensions, which provide the
>> XSHA1, XSHA256, XSHA384 and XSHA512 instructions.
>>
>> Implementing SHA in hardware rather than software lets applications
>> achieve higher performance without giving up security or flexibility.
>>
>> Add an XSHA1-based implementation of the SHA-1 transform function.
>>
>> Signed-off-by: AlanSong-oc <AlanSong-oc@zhaoxin.com>
> 
> Please include the information I've asked for (benchmark results, test
> results, and link to the specification) directly in the commit message.

Sorry for missing the link to the specification in the commit message.
I will include the benchmark results, test results, and the link to the
specification directly in the commit message in the next version of the
patch, rather than in the cover letter.

>> +#if IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN)
>> +#define PHE_ALIGNMENT 16
>> +static void sha1_blocks_phe(struct sha1_block_state *state,
>> +                          const u8 *data, size_t nblocks)
> 
> The IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN) should go in the CPU feature
> check, so that the code will be parsed regardless of the setting.  That
> reduces the chance that future changes will cause compilation errors.

I will address this problem using the approach you described below in
the next version of the patch.

> 
>> +     /*
>> +      * XSHA1 requires %edi to point to a 32-byte, 16-byte-aligned
>> +      * buffer on Zhaoxin processors.
>> +      */
> 
> This seems implausible.  In 64-bit mode a pointer can't fit in %edi.  I
> thought you mentioned that this instruction is 64-bit compatible?  You
> may have meant %rdi.
> 
> Interestingly, the spec you provided specifically says the registers
> operated on are %eax, %ecx, %esi, and %edi.
> 
> So assuming the code works, perhaps both the spec and your code comment
> are incorrect?
> 
> These errors don't really confidence in this instruction.

Sorry for the misleading comment. I will correct it in the next version
of the patch. The specification provided earlier uses the 32-bit
register names only as an example; it does not mean the instruction
supports only 32-bit mode. The updated specification clarifies this
point explicitly and is available at the following link:
(https://gitee.com/openzhaoxin/zhaoxin_specifications/blob/20260227/ZX_Padlock_Reference.pdf)
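Independent of the register naming question, the buffer requirement
itself is met by over-allocating and rounding the pointer up. The
pattern can be illustrated in portable C (phe_align and phe_buffer_ok
are illustrative names only; the kernel code uses PTR_ALIGN for the
rounding):

```c
#include <stdint.h>

#define PHE_ALIGNMENT 16

/* Round a pointer up to the next PHE_ALIGNMENT boundary, as the
 * kernel's PTR_ALIGN() does. */
static unsigned char *phe_align(unsigned char *p)
{
	return (unsigned char *)(((uintptr_t)p + PHE_ALIGNMENT - 1) &
				 ~((uintptr_t)PHE_ALIGNMENT - 1));
}

/* Over-allocate by PHE_ALIGNMENT - 1 bytes so that a 32-byte region
 * starting at the rounded-up pointer is guaranteed to fit inside the
 * allocation. Returns 1 when both requirements hold. */
int phe_buffer_ok(void)
{
	unsigned char buf[32 + PHE_ALIGNMENT - 1];
	unsigned char *dst = phe_align(buf);

	return ((uintptr_t)dst % PHE_ALIGNMENT) == 0 &&
	       dst + 32 <= buf + sizeof(buf);
}
```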

> 
>> +     memcpy(dst, state, SHA1_DIGEST_SIZE);
>> +     asm volatile(".byte 0xf3,0x0f,0xa6,0xc8"
>> +                  : "+S"(data), "+D"(dst)
>> +                  : "a"((long)-1), "c"(nblocks));
>> +     memcpy(state, dst, SHA1_DIGEST_SIZE);
> 
> Is the reason for using '.byte' that the GNU and clang assemblers don't
> implement the mnemonic this Zhaoxin-specific instruction?  The spec
> implies that the intended mnemonic is "rep sha1".
> 
> If that's correct, could you add a comment like /* rep sha1 */ so that
> it's clear what the intended instruction is?

The '.byte' directive is used to ensure the correct machine code is
emitted regardless of assembler support, particularly for assemblers
that lack the corresponding mnemonic. I will add an appropriate comment
in the next version of the patch to clarify the intended instruction.

> Also, the spec describes all four registers as both input and output
> registers.  Yet your inline asm marks %rax and %rcx as inputs only.

Thank you for pointing this question out.

On the one hand, when the '+' constraint modifier is applied to an
operand, it is treated as both an input and an output operand.
Therefore, %rsi and %rdi are considered input operands as well.

On the other hand, after the instruction executes, the values in %rax,
%rsi, and %rcx are modified. These registers should therefore use the
'+' constraint modifier to inform the compiler that their values are
updated by the assembly code. As the GCC manual advises, we cannot rely
on clobbers to indicate that the values of input operands are modified.
However, since %rax is initialized with a constant value, it does not
need the '+' constraint modifier and can simply be specified as an
input operand.

In addition, although %rdi itself is not modified by the instruction,
the memory it references may be updated, so a "memory" clobber should
be added to notify the compiler about possible memory side effects.

The corrected inline assembly should be written as follows:

    asm volatile(".byte 0xf3,0x0f,0xa6,0xc8" /* REP XSHA1 */
                : "+S"(data), "+c"(nblocks)
                : "a"((long)-1), "D"(dst)
                : "memory");

I will adopt it in the next version of the patch.

>> @@ -59,6 +79,11 @@ static void sha1_mod_init_arch(void)
>>  {
>>       if (boot_cpu_has(X86_FEATURE_SHA_NI)) {
>>               static_call_update(sha1_blocks_x86, sha1_blocks_ni);
>> +#if IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN)
>> +     } else if (boot_cpu_has(X86_FEATURE_PHE_EN)) {
>> +             if (boot_cpu_data.x86 >= 0x07)
>> +                     static_call_update(sha1_blocks_x86, sha1_blocks_phe);
>> +#endif
> 
> I think it should be:
> 
>         } else if (IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN) &&
>                    boot_cpu_has(X86_FEATURE_PHE_EN) &&
>                    boot_cpu_data.x86 >= 0x07) {
>                         static_call_update(sha1_blocks_x86, sha1_blocks_phe);
> 
> ... so (a) the code will be parsed even when !CONFIG_CPU_SUP_ZHAOXIN,
> and (b) functions won't be unnecessarily disabled when
> boot_cpu_has(X86_FEATURE_PHE_EN) && boot_cpu_data.x86 < 0x07).

Thanks for the suggestion, that makes more sense.

Using #if and #endif for conditional compilation is a poor choice, as
it prevents the code from being parsed when the option is disabled. It
is more robust to include the IS_ENABLED() and CPU family checks
directly in the condition.

> 
> As before, all these comments apply to the SHA-256 patch too.

Surely, I will also apply all of the suggestions mentioned above to the
SHA-256 patch.

Please accept my apologies for the delayed response due to
administrative procedures and the recent holidays. Thank you for your
review and valuable suggestions.

Best Regards
AlanSong-oc



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 2/3] lib/crypto: x86/sha1: PHE Extensions optimized SHA1 transform function
  2026-03-05  1:37     ` AlanSong-oc
@ 2026-03-05 19:18       ` Eric Biggers
  2026-03-11 11:37         ` AlanSong-oc
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Biggers @ 2026-03-05 19:18 UTC (permalink / raw)
  To: AlanSong-oc
  Cc: herbert, davem, Jason, ardb, linux-crypto, linux-kernel, x86,
	CobeChen, TonyWWang-oc, YunShen, GeorgeXue, LeoLiu-oc, HansHu

On Thu, Mar 05, 2026 at 09:37:01AM +0800, AlanSong-oc wrote:
> > Also, the spec describes all four registers as both input and output
> > registers.  Yet your inline asm marks %rax and %rcx as inputs only.
> 
> Thank you for pointing this question out.
> 
> On the one hand, when the '+' constraint modifier is applied to an
> operand, it is treated as both an input and an output operand.
> Therefore, %rsi and %rdi are considered input operands as well.
> 
> On the other hand, after the instruction executes, the values in %rax,
> %rsi, and %rcx are modified. These registers should therefore use the
> '+' constraint modifier to inform the compiler that their values are
> updated by the assembly code. We cannot rely on clobbers to indicate
> that the values of input operands are modified following the suggestion
> by gcc manual. However, since %rax is initialized with a constant value,
> it does not need the '+' constraint modifier. It should can simply be
> specified as an input operand.
> 
> In addition, although %rdi itself is not modified by the instruction but
> the memory it references may be updated, a "memory" clobber should be
> added to notify the compiler about possible memory side effects.
> 
> The corrected inline assembly should be written as follows:
> 
>     asm volatile(".byte 0xf3,0x0f,0xa6,0xc8" /* REP XSHA1 */
>                 : "+S"(data), "+c"(nblocks)
>                 : "a"((long)-1), "D"(dst)
>                 : "memory");

If the instruction both reads and writes %rax, then the constraint needs
to be "+a", even if the C code doesn't use the updated value.  Otherwise
the compiler can assume that the value stored in %rax is unchanged and
optimize the code accordingly, for example by not reinitializing %rax if
the constant -1 is needed again later on.

Yes, this means you'll need to move the constant -1 to a local variable.

> > As before, all these comments apply to the SHA-256 patch too.
> 
> Surely, I will also apply all of the suggestions mentioned above to the
> SHA-256 patch.

I also have to ask: are you sure you need SHA-1 to be optimized at all?
SHA-1 has been deprecated for a long time.  Most users have moved to
SHA-256 and other stronger algorithms, and those that haven't need to
move very soon.  There's little value in adding new optimized code for
SHA-1.

How about simplifying your patch to just SHA-256?  Then we can focus on
the one that's actually important and not on the deprecated SHA-1.

- Eric

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 2/3] lib/crypto: x86/sha1: PHE Extensions optimized SHA1 transform function
  2026-03-05 19:18       ` Eric Biggers
@ 2026-03-11 11:37         ` AlanSong-oc
  2026-03-12  4:03           ` Eric Biggers
  0 siblings, 1 reply; 13+ messages in thread
From: AlanSong-oc @ 2026-03-11 11:37 UTC (permalink / raw)
  To: Eric Biggers
  Cc: herbert, davem, Jason, ardb, linux-crypto, linux-kernel, x86,
	CobeChen, TonyWWang-oc, YunShen, GeorgeXue, LeoLiu, HansHu

On 3/6/26 03:18, Eric Biggers wrote:
> On Thu, Mar 05, 2026 at 09:37:01AM +0800, AlanSong-oc wrote:
>>> Also, the spec describes all four registers as both input and output
>>> registers.  Yet your inline asm marks %rax and %rcx as inputs only.
>>
>> Thank you for pointing this question out.
>>
>> On the one hand, when the '+' constraint modifier is applied to an
>> operand, it is treated as both an input and an output operand.
>> Therefore, %rsi and %rdi are considered input operands as well.
>>
>> On the other hand, after the instruction executes, the values in %rax,
>> %rsi, and %rcx are modified. These registers should therefore use the
>> '+' constraint modifier to inform the compiler that their values are
>> updated by the assembly code. Per the GCC manual, clobbers cannot be
>> used to indicate that the values of input operands are modified.
>> However, since %rax is initialized with a constant value, it does not
>> need the '+' constraint modifier; it can simply be specified as an
>> input operand.
>>
>> In addition, although %rdi itself is not modified by the instruction,
>> the memory it references may be updated, so a "memory" clobber should
>> be added to notify the compiler about possible memory side effects.
>>
>> The corrected inline assembly should be written as follows:
>>
>>     asm volatile(".byte 0xf3,0x0f,0xa6,0xc8" /* REP XSHA1 */
>>                 : "+S"(data), "+c"(nblocks)
>>                 : "a"((long)-1), "D"(dst)
>>                 : "memory");
> 
> If the instruction both reads and writes %rax, then the constraint needs
> to be "+a", even if the C code doesn't use the updated value.  Otherwise
> the compiler can assume that the value stored in %rax is unchanged and
> optimize the code accordingly, for example by not reinitializing %rax if
> the constant -1 is needed again later on.
> 
> Yes, this means you'll need to move the constant -1 to a local variable.

Indeed, to ensure that the compiler generates correct code at all
optimization levels, the inline assembly should accurately describe the
instruction's behavior. In the next version of the patch I will use a
local variable initialized to -1, since the instruction both reads and
writes the %rax register.
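For reference, a minimal sketch of what such a wrapper might look like
with the "+a" constraint and the -1 moved into a local variable. This is
illustrative only, not the actual patch: the function and parameter
names (sha1_transform_phe, state, data, nblocks) are hypothetical, and
it can only execute on hardware implementing the PHE extensions:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: process nblocks 64-byte blocks with the Zhaoxin
 * PHE "REP XSHA1" instruction.  %rax is loaded from a local variable
 * holding -1 and marked "+a", since the instruction both reads and
 * writes it; %rsi and %rcx are likewise read-write; %rdi is input-only
 * but the state it points to is updated, hence the "memory" clobber.
 */
static void sha1_transform_phe(uint32_t state[5], const uint8_t *data,
			       size_t nblocks)
{
	long rax = -1;	/* constant -1 required in %rax, per the thread */

	asm volatile(".byte 0xf3,0x0f,0xa6,0xc8" /* REP XSHA1 */
		     : "+a"(rax), "+S"(data), "+c"(nblocks)
		     : "D"(state)
		     : "memory");
}
```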

>>> As before, all these comments apply to the SHA-256 patch too.
>>
>> Surely, I will also apply all of the suggestions mentioned above to the
>> SHA-256 patch.
> 
> I also have to ask: are you sure you need SHA-1 to be optimized at all?
> SHA-1 has been deprecated for a long time.  Most users have moved to
> SHA-256 and other stronger algorithms, and those that haven't need to
> move very soon.  There's little value in adding new optimized code for
> SHA-1.
> 
> How about simplifying your patch to just SHA-256?  Then we can focus on
> the one that's actually important and not on the deprecated SHA-1.

It is true that SHA-1 is rarely used by most users today. However, it
may still be needed in certain scenarios. For those cases, we would like
to add support for the XSHA1 instruction to accelerate SHA-1.

Does the crypto community have any plans to remove SHA-1 support in
recent kernel versions?

Best Regards
AlanSong-oc



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 2/3] lib/crypto: x86/sha1: PHE Extensions optimized SHA1 transform function
  2026-03-11 11:37         ` AlanSong-oc
@ 2026-03-12  4:03           ` Eric Biggers
  2026-03-13  2:58             ` AlanSong-oc
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Biggers @ 2026-03-12  4:03 UTC (permalink / raw)
  To: AlanSong-oc
  Cc: herbert, davem, Jason, ardb, linux-crypto, linux-kernel, x86,
	CobeChen, TonyWWang-oc, YunShen, GeorgeXue, LeoLiu, HansHu

On Wed, Mar 11, 2026 at 07:37:39PM +0800, AlanSong-oc wrote:
> > I also have to ask: are you sure you need SHA-1 to be optimized at all?
> > SHA-1 has been deprecated for a long time.  Most users have moved to
> > SHA-256 and other stronger algorithms, and those that haven't need to
> > move very soon.  There's little value in adding new optimized code for
> > SHA-1.
> > 
> > How about simplifying your patch to just SHA-256?  Then we can focus on
> > the one that's actually important and not on the deprecated SHA-1.
> 
> It is true that SHA-1 is rarely used by most users today. However, it
> may still be needed in certain scenarios. For those cases, we would like
> to add support for the XSHA1 instruction to accelerate SHA-1.
> 
> Does the crypto community have any plans to remove SHA-1 support in
> recent kernel versions?

It's already possible to build a kernel without SHA-1 support.  SHA-1
has been cryptographically broken and is considered obsolete.
Performance-critical hashing in the kernel already tends to use SHA-256.

These patches already feel marginal, as they are being pushed without
QEMU support, so the community will be unable to test them.  The only
reason I would consider accepting them without QEMU support is because
there was already code in drivers/crypto/ that used these instructions.

It also helps that they are just single instructions.  Though, even with
that I still found a bug in the proposed code as well as errors in the
CPU documentation, as mentioned.  And the drivers/crypto/ implementation
that uses these instructions is broken too, as you're aware.

Overall, it's clear that platform-specific routines like this are very
risky to maintain without adequate testing.  Yet, correctness is the
first priority in cryptographic code.

So I would suggest that to reduce the risk, we focus on just one
algorithm, SHA-256.  Note that this makes your job easier, as well.

- Eric

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 2/3] lib/crypto: x86/sha1: PHE Extensions optimized SHA1 transform function
  2026-03-12  4:03           ` Eric Biggers
@ 2026-03-13  2:58             ` AlanSong-oc
  2026-03-13  3:28               ` Eric Biggers
  0 siblings, 1 reply; 13+ messages in thread
From: AlanSong-oc @ 2026-03-13  2:58 UTC (permalink / raw)
  To: Eric Biggers
  Cc: herbert, davem, Jason, ardb, linux-crypto, linux-kernel, x86,
	CobeChen, TonyWWang-oc, YunShen, GeorgeXue, LeoLiu, HansHu

On 3/12/26 12:03, Eric Biggers wrote:
> On Wed, Mar 11, 2026 at 07:37:39PM +0800, AlanSong-oc wrote:
>>> I also have to ask: are you sure you need SHA-1 to be optimized at all?
>>> SHA-1 has been deprecated for a long time.  Most users have moved to
>>> SHA-256 and other stronger algorithms, and those that haven't need to
>>> move very soon.  There's little value in adding new optimized code for
>>> SHA-1.
>>>
>>> How about simplifying your patch to just SHA-256?  Then we can focus on
>>> the one that's actually important and not on the deprecated SHA-1.
>>
>> It is true that SHA-1 is rarely used by most users today. However, it
>> may still be needed in certain scenarios. For those cases, we would like
>> to add support for the XSHA1 instruction to accelerate SHA-1.
>>
>> Does the crypto community have any plans to remove SHA-1 support in
>> recent kernel versions?
> 
> It's already possible to build a kernel without SHA-1 support.  SHA-1
> has been cryptographically broken and is considered obsolete.
> Performance-critical hashing in the kernel already tends to use SHA-256.
> 
> These patches already feel marginal, as they are being pushed without
> QEMU support, so the community will be unable to test them.  The only
> reason I would consider accepting them without QEMU support is because
> there was already code in drivers/crypto/ that used these instructions.

Sorry for the inconvenience caused by the inability to test the provided
patches; QEMU currently does not support emulation of the XSHA1 and
XSHA256 instructions.

In addition, since the previous patch adding XSHA384 and XSHA512 instruction
support was not accepted, I would like to ask whether adding emulation
support for XSHA384 and XSHA512 instructions in QEMU would help the
crypto community evaluate and accept the corresponding kernel patches.

> It also helps that they are just single instructions.  Though, even with
> that I still found a bug in the proposed code as well as errors in the
> CPU documentation, as mentioned.  And the drivers/crypto/ implementation
> that uses these instructions is broken too, as you're aware of.
> 
> Overall, it's clear that platform-specific routines like this are very
> risky to maintain without adequate testing.  Yet, correctness is the
> first priority in cryptographic code.
> 
> So I would suggest that to reduce the risk, we focus on just one
> algorithm, SHA-256.  Note that this makes your job easier, as well.

Thanks for your suggestions. I will only add XSHA256 instruction support
for the SHA-256 algorithm in the next version of the patch.

Best Regards
AlanSong-oc

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 2/3] lib/crypto: x86/sha1: PHE Extensions optimized SHA1 transform function
  2026-03-13  2:58             ` AlanSong-oc
@ 2026-03-13  3:28               ` Eric Biggers
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Biggers @ 2026-03-13  3:28 UTC (permalink / raw)
  To: AlanSong-oc
  Cc: herbert, davem, Jason, ardb, linux-crypto, linux-kernel, x86,
	CobeChen, TonyWWang-oc, YunShen, GeorgeXue, LeoLiu, HansHu

On Fri, Mar 13, 2026 at 10:58:07AM +0800, AlanSong-oc wrote:
> On 3/12/26 12:03, Eric Biggers wrote:
> > On Wed, Mar 11, 2026 at 07:37:39PM +0800, AlanSong-oc wrote:
> >>> I also have to ask: are you sure you need SHA-1 to be optimized at all?
> >>> SHA-1 has been deprecated for a long time.  Most users have moved to
> >>> SHA-256 and other stronger algorithms, and those that haven't need to
> >>> move very soon.  There's little value in adding new optimized code for
> >>> SHA-1.
> >>>
> >>> How about simplifying your patch to just SHA-256?  Then we can focus on
> >>> the one that's actually important and not on the deprecated SHA-1.
> >>
> >> It is true that SHA-1 is rarely used by most users today. However, it
> >> may still be needed in certain scenarios. For those cases, we would like
> >> to add support for the XSHA1 instruction to accelerate SHA-1.
> >>
> >> Does the crypto community have any plans to remove SHA-1 support in
> >> recent kernel versions?
> > 
> > It's already possible to build a kernel without SHA-1 support.  SHA-1
> > has been cryptographically broken and is considered obsolete.
> > Performance-critical hashing in the kernel already tends to use SHA-256.
> > 
> > These patches already feel marginal, as they are being pushed without
> > QEMU support, so the community will be unable to test them.  The only
> > reason I would consider accepting them without QEMU support is because
> > there was already code in drivers/crypto/ that used these instructions.
> 
> Sorry for the inconvenience caused by the inability to test provided
> patches, as QEMU currently does not support emulation of the XSHA1 and
> XSHA256 instructions.
> 
> Besides, since the previous patch adding XSHA384 and XSHA512 instruction
> support was not accepted, I would like to ask whether adding emulation
> support for XSHA384 and XSHA512 instructions in QEMU would help the
> crypto community evaluate and accept the corresponding kernel patches.
> 
> > It also helps that they are just single instructions.  Though, even with
> > that I still found a bug in the proposed code as well as errors in the
> > CPU documentation, as mentioned.  And the drivers/crypto/ implementation
> > that uses these instructions is broken too, as you're aware.
> > 
> > Overall, it's clear that platform-specific routines like this are very
> > risky to maintain without adequate testing.  Yet, correctness is the
> > first priority in cryptographic code.
> > 
> > So I would suggest that to reduce the risk, we focus on just one
> > algorithm, SHA-256.  Note that this makes your job easier, as well.
> 
> Thanks for your suggestions. I will only add XSHA256 instruction support
> for the SHA-256 algorithm in the next version of the patch.

Thanks.  Yes, adding QEMU support for these instructions would be very
helpful. 

- Eric

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2026-03-13  3:28 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-01-16  7:15 [PATCH v3 0/3] lib/crypto: x86/sha: Add PHE Extensions support AlanSong-oc
2026-01-16  7:15 ` [PATCH v3 1/3] crypto: padlock-sha - Disable for Zhaoxin processor AlanSong-oc
2026-01-18  0:09   ` Eric Biggers
2026-03-05  1:36     ` AlanSong-oc
2026-01-16  7:15 ` [PATCH v3 2/3] lib/crypto: x86/sha1: PHE Extensions optimized SHA1 transform function AlanSong-oc
2026-01-18  0:31   ` Eric Biggers
2026-03-05  1:37     ` AlanSong-oc
2026-03-05 19:18       ` Eric Biggers
2026-03-11 11:37         ` AlanSong-oc
2026-03-12  4:03           ` Eric Biggers
2026-03-13  2:58             ` AlanSong-oc
2026-03-13  3:28               ` Eric Biggers
2026-01-16  7:15 ` [PATCH v3 3/3] lib/crypto: x86/sha256: PHE Extensions optimized SHA256 " AlanSong-oc

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox