All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Biggers <ebiggers@kernel.org>
To: AlanSong-oc <AlanSong-oc@zhaoxin.com>
Cc: herbert@gondor.apana.org.au, davem@davemloft.net,
	Jason@zx2c4.com, ardb@kernel.org, linux-crypto@vger.kernel.org,
	linux-kernel@vger.kernel.org, x86@kernel.org,
	CobeChen@zhaoxin.com, TonyWWang-oc@zhaoxin.com,
	YunShen@zhaoxin.com, GeorgeXue@zhaoxin.com,
	LeoLiu-oc@zhaoxin.com, HansHu@zhaoxin.com
Subject: Re: [PATCH v3 2/3] lib/crypto: x86/sha1: PHE Extensions optimized SHA1 transform function
Date: Sat, 17 Jan 2026 16:31:20 -0800	[thread overview]
Message-ID: <20260118003120.GF74518@quark> (raw)
In-Reply-To: <20260116071513.12134-3-AlanSong-oc@zhaoxin.com>

On Fri, Jan 16, 2026 at 03:15:12PM +0800, AlanSong-oc wrote:
> Zhaoxin CPUs have implemented the SHA(Secure Hash Algorithm) as its CPU
> instructions by PHE(Padlock Hash Engine) Extensions, including XSHA1,
> XSHA256, XSHA384 and XSHA512 instructions.
> 
> With the help of implementation of SHA in hardware instead of software,
> can develop applications with higher performance, more security and more
> flexibility.
> 
> This patch includes the XSHA1 instruction optimized implementation of
> SHA-1 transform function.
> 
> Signed-off-by: AlanSong-oc <AlanSong-oc@zhaoxin.com>

Please include the information I've asked for (benchmark results, test
results, and link to the specification) directly in the commit message.

> +#if IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN)
> +#define PHE_ALIGNMENT 16
> +static void sha1_blocks_phe(struct sha1_block_state *state,
> +			     const u8 *data, size_t nblocks)

The IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN) should go in the CPU feature
check, so that the code will be parsed regardless of the setting.  That
reduces the chance that future changes will cause compilation errors.

> +	/*
> +	 * XSHA1 requires %edi to point to a 32-byte, 16-byte-aligned
> +	 * buffer on Zhaoxin processors.
> +	 */

This seems implausible.  In 64-bit mode a pointer can't fit in %edi.  I
thought you mentioned that this instruction is 64-bit compatible?  You
may have meant %rdi.

Interestingly, the spec you provided specifically says the registers
operated on are %eax, %ecx, %esi, and %edi.

So assuming the code works, perhaps both the spec and your code comment
are incorrect?

These errors don't really confidence in this instruction.

> +	memcpy(dst, state, SHA1_DIGEST_SIZE);
> +	asm volatile(".byte 0xf3,0x0f,0xa6,0xc8"
> +		     : "+S"(data), "+D"(dst)
> +		     : "a"((long)-1), "c"(nblocks));
> +	memcpy(state, dst, SHA1_DIGEST_SIZE);

Is the reason for using '.byte' that the GNU and clang assemblers don't
implement the mnemonic this Zhaoxin-specific instruction?  The spec
implies that the intended mnemonic is "rep sha1".

If that's correct, could you add a comment like /* rep sha1 */ so that
it's clear what the intended instruction is?

Also, the spec describes all four registers as both input and output
registers.  Yet your inline asm marks %rax and %rcx as inputs only.

> @@ -59,6 +79,11 @@ static void sha1_mod_init_arch(void)
>  {
>  	if (boot_cpu_has(X86_FEATURE_SHA_NI)) {
>  		static_call_update(sha1_blocks_x86, sha1_blocks_ni);
> +#if IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN)
> +	} else if (boot_cpu_has(X86_FEATURE_PHE_EN)) {
> +		if (boot_cpu_data.x86 >= 0x07)
> +			static_call_update(sha1_blocks_x86, sha1_blocks_phe);
> +#endif

I think it should be:

	} else if (IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN) &&
		   boot_cpu_has(X86_FEATURE_PHE_EN) &&
		   boot_cpu_data.x86 >= 0x07) {
			static_call_update(sha1_blocks_x86, sha1_blocks_phe);

... so (a) the code will be parsed even when !CONFIG_CPU_SUP_ZHAOXIN,
and (b) functions won't be unnecessarily disabled when
boot_cpu_has(X86_FEATURE_PHE_EN) && boot_cpu_data.x86 < 0x07).

As before, all these comments apply to the SHA-256 patch too.

- Eric

  reply	other threads:[~2026-01-18  0:31 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-16  7:15 [PATCH v3 0/3] lib/crypto: x86/sha: Add PHE Extensions support AlanSong-oc
2026-01-16  7:15 ` [PATCH v3 1/3] crypto: padlock-sha - Disable for Zhaoxin processor AlanSong-oc
2026-01-18  0:09   ` Eric Biggers
2026-03-05  1:36     ` AlanSong-oc
2026-01-16  7:15 ` [PATCH v3 2/3] lib/crypto: x86/sha1: PHE Extensions optimized SHA1 transform function AlanSong-oc
2026-01-18  0:31   ` Eric Biggers [this message]
2026-03-05  1:37     ` AlanSong-oc
2026-03-05 19:18       ` Eric Biggers
2026-03-11 11:37         ` AlanSong-oc
2026-03-12  4:03           ` Eric Biggers
2026-03-13  2:58             ` AlanSong-oc
2026-03-13  3:28               ` Eric Biggers
2026-01-16  7:15 ` [PATCH v3 3/3] lib/crypto: x86/sha256: PHE Extensions optimized SHA256 " AlanSong-oc

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260118003120.GF74518@quark \
    --to=ebiggers@kernel.org \
    --cc=AlanSong-oc@zhaoxin.com \
    --cc=CobeChen@zhaoxin.com \
    --cc=GeorgeXue@zhaoxin.com \
    --cc=HansHu@zhaoxin.com \
    --cc=Jason@zx2c4.com \
    --cc=LeoLiu-oc@zhaoxin.com \
    --cc=TonyWWang-oc@zhaoxin.com \
    --cc=YunShen@zhaoxin.com \
    --cc=ardb@kernel.org \
    --cc=davem@davemloft.net \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.