public inbox for linux-crypto@vger.kernel.org
 help / color / mirror / Atom feed
From: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
To: Ard Biesheuvel <ardb@kernel.org>, Mark Brown <broonie@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
	"David S. Miller" <davem@davemloft.net>,
	Jussi Kivilinna <jussi.kivilinna@iki.fi>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Maxime Coquelin <mcoquelin.stm32@gmail.com>,
	Alexandre Torgue <alexandre.torgue@foss.st.com>,
	Eric Biggers <ebiggers@kernel.org>,
	linux-crypto@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org,
	linux-stm32@st-md-mailman.stormreply.com
Subject: Re: [PATCH 16/16] crypto: arm64/sm4 - add ARMv9 SVE cryptography acceleration implementation
Date: Tue, 27 Sep 2022 12:26:29 +0800	[thread overview]
Message-ID: <cb061cbb-5f28-9dde-270e-6d7ccb6d4433@linux.alibaba.com> (raw)
In-Reply-To: <CAMj1kXF8Fi9cG4p6udRYT4LbCAj0UBXQL12nmQBFEWvZsVX7Wg@mail.gmail.com>

Hi Ard,

On 9/26/22 6:02 PM, Ard Biesheuvel wrote:
> (cc Mark Brown)
> 
> Hello Tianjia,
> 
> On Mon, 26 Sept 2022 at 11:37, Tianjia Zhang
> <tianjia.zhang@linux.alibaba.com> wrote:
>>
>> Scalable Vector Extension (SVE) is the next-generation SIMD extension for
>> arm64. SVE allows flexible vector length implementations with a range of
>> possible values in CPU implementations. The vector length can vary from a
>> minimum of 128 bits up to a maximum of 2048 bits, at 128-bit increments.
>> The SVE design guarantees that the same application can run on different
>> implementations that support SVE, without the need to recompile the code.
>>
>> SVE was originally introduced by ARMv8, and ARMv9 introduced SVE2 to
>> expand and improve it. Similar to the Crypto Extension supported by the
>> NEON instruction set for the algorithm, SVE also supports the similar
>> instructions, called cryptography acceleration instructions, but this is
>> also optional instruction set.
>>
>> This patch uses SM4 cryptography acceleration instructions and SVE2
>> instructions to optimize the SM4 algorithm for ECB/CBC/CFB/CTR modes.
>> Since the encryption of CBC/CFB cannot be parallelized, the Crypto
>> Extension instruction is used.
>>
> 
> Given that we currently do not support the use of SVE in kernel mode,
> this patch cannot be accepted at this time (but the rest of the series
> looks reasonable to me, although I have only skimmed over the patches)
> 
> In view of the disappointing benchmark results below, I don't think
> this is worth the hassle at the moment. If we can find a case where
> using SVE in kernel mode truly makes a [favorable] difference, we can
> revisit this, but not without a thorough analysis of the impact it
> will have to support SVE in the kernel. Also, the fact that SVE may
> also cover cryptographic extensions does not necessarily imply that a
> micro-architecture will perform those crypto transformations in
> parallel and so the performance may be the same even if VL > 128.
> 
> In summary, please drop this patch for now, and once there are more
> encouraging performance numbers, please resubmit it as part of a
> series that explicitly enables SVE in kernel mode on arm64, and
> documents the requirements and constraints.
> 
> I have cc'ed Mark who has been working on the SVE support., who might
> have something to add here as well.
> 
> Thanks,
> Ard.
> 
> 

Thanks for your reply, the current performance of SVE is really
unsatisfactory. One reason is that the optimization of SVE needs to deal
with more and more complex data shifting operations, such as in CBC/CFB
mode, but also in CTR mode. needing more instruction to complete the
128-bit count increment, and the use of CE optimization does not have
these complications.

In addition, I naively thought that when the VL is 256-bit, the
performance will simply double compared to 128-bit. At present, this is
not the case. Maybe it is worth using SVE until there are significantly
improved performance data. I'll follow your advice and drop this
patch.

Best regards,
Tianjia


      parent reply	other threads:[~2022-09-27  4:26 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-26  9:36 [PATCH 00/16] Optimizing SM3 and SM4 algorithms using NEON/CE/SVE instructions Tianjia Zhang
2022-09-26  9:36 ` [PATCH 01/16] crypto: arm64/sm3 - raise the priority of the CE implementation Tianjia Zhang
2022-09-26  9:36 ` [PATCH 02/16] crypto: arm64/sm3 - add NEON assembly implementation Tianjia Zhang
2022-09-26  9:36 ` [PATCH 03/16] crypto: arm64/sm4 - refactor and simplify NEON implementation Tianjia Zhang
2022-09-26  9:36 ` [PATCH 04/16] crypto: testmgr - add SM4 cts-cbc/essiv/xts/xcbc test vectors Tianjia Zhang
2022-09-26  9:36 ` [PATCH 05/16] crypto: tcrypt - add SM4 cts-cbc/essiv/xts/xcbc test Tianjia Zhang
2022-09-26  9:36 ` [PATCH 06/16] crypto: arm64/sm4 - refactor and simplify CE implementation Tianjia Zhang
2022-09-26  9:36 ` [PATCH 07/16] crypto: arm64/sm4 - simplify sm4_ce_expand_key() of " Tianjia Zhang
2022-09-26  9:36 ` [PATCH 08/16] crypto: arm64/sm4 - export reusable CE acceleration functions Tianjia Zhang
2022-09-26  9:36 ` [PATCH 09/16] crypto: arm64/sm4 - add CE implementation for CTS-CBC mode Tianjia Zhang
2022-09-26  9:36 ` [PATCH 10/16] crypto: arm64/sm4 - add CE implementation for XTS mode Tianjia Zhang
2022-09-26  9:36 ` [PATCH 11/16] crypto: essiv - allow digestsize to be greater than keysize Tianjia Zhang
2022-09-26  9:36 ` [PATCH 12/16] crypto: arm64/sm4 - add CE implementation for ESSIV mode Tianjia Zhang
2022-09-26  9:36 ` [PATCH 13/16] crypto: arm64/sm4 - add CE implementation for cmac/xcbc/cbcmac Tianjia Zhang
2022-09-26  9:36 ` [PATCH 14/16] crypto: arm64/sm4 - add CE implementation for CCM mode Tianjia Zhang
2022-09-26  9:36 ` [PATCH 15/16] crypto: arm64/sm4 - add CE implementation for GCM mode Tianjia Zhang
2022-09-26  9:36 ` [PATCH 16/16] crypto: arm64/sm4 - add ARMv9 SVE cryptography acceleration implementation Tianjia Zhang
2022-09-26 10:02   ` Ard Biesheuvel
2022-09-26 17:14     ` Mark Brown
2022-09-27  4:30       ` Tianjia Zhang
2022-09-27  4:26     ` Tianjia Zhang [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cb061cbb-5f28-9dde-270e-6d7ccb6d4433@linux.alibaba.com \
    --to=tianjia.zhang@linux.alibaba.com \
    --cc=alexandre.torgue@foss.st.com \
    --cc=ardb@kernel.org \
    --cc=broonie@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=davem@davemloft.net \
    --cc=ebiggers@kernel.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=jussi.kivilinna@iki.fi \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-stm32@st-md-mailman.stormreply.com \
    --cc=mcoquelin.stm32@gmail.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox