From: Dave Hansen <dave.hansen@intel.com>
To: Dongsoo Lee <letrhee@nsr.re.kr>, linux-crypto@vger.kernel.org
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
"David S. Miller" <davem@davemloft.net>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
linux-kernel@vger.kernel.org,
"David S. Miller" <abc@test.nsr.re.kr>,
Dongsoo Lee <letrhee@gmail.com>
Subject: Re: [PATCH 3/3] crypto: LEA block cipher AVX2 optimization
Date: Fri, 28 Apr 2023 08:54:51 -0700
Message-ID: <f0c77850-f8ca-3e21-2722-1deaae424130@intel.com>
In-Reply-To: <20230428110058.1516119-4-letrhee@nsr.re.kr>
> +config CRYPTO_LEA_AVX2
> + tristate "Ciphers: LEA with modes: ECB, CBC, CTR, XTS (SSE2/MOVBE/AVX2)"
> + select CRYPTO_LEA
> + imply CRYPTO_XTS
> + imply CRYPTO_CTR
> + help
> + LEA cipher algorithm (KS X 3246, ISO/IEC 29192-2:2019)
> +
> + LEA is one of the standard cryptographic alorithms of
> + the Republic of Korea. It consists of four 32bit word.
The "four 32bit word" thing is probably not a detail end users care
about enough to see in Kconfig text.
> + See:
> + https://seed.kisa.or.kr/kisa/algorithm/EgovLeaInfo.do
> +
> + Architecture: x86_64 using:
> + - SSE2 (Streaming SIMD Extensions 2)
> + - MOVBE (Move Data After Swapping Bytes)
> + - AVX2 (Advanced Vector Extensions)
What about i386? If this is truly 64-bit-only for some reason, it's not
reflected anywhere that I can see, like having a:

	depends on X86_64
I'm also a _bit_ confused why this has one config option called "_AVX2"
when it also includes the SSE2 implementation.
> + Processes 4(SSE2), 8(AVX2) blocks in parallel.
> + In CTR mode, the MOVBE instruction is utilized for improved performance.
> +
> config CRYPTO_CHACHA20_X86_64
> tristate "Ciphers: ChaCha20, XChaCha20, XChaCha12 (SSSE3/AVX2/AVX-512VL)"
> depends on X86 && 64BIT
> diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
> index 9aa46093c91b..de23293b88df 100644
> --- a/arch/x86/crypto/Makefile
> +++ b/arch/x86/crypto/Makefile
> @@ -109,6 +109,9 @@ aria-aesni-avx2-x86_64-y := aria-aesni-avx2-asm_64.o aria_aesni_avx2_glue.o
> obj-$(CONFIG_CRYPTO_ARIA_GFNI_AVX512_X86_64) += aria-gfni-avx512-x86_64.o
> aria-gfni-avx512-x86_64-y := aria-gfni-avx512-asm_64.o aria_gfni_avx512_glue.o
>
> +obj-$(CONFIG_CRYPTO_LEA_AVX2) += lea-avx2-x86_64.o
> +lea-avx2-x86_64-y := lea_avx2_x86_64-asm.o lea_avx2_glue.o
> +
> quiet_cmd_perlasm = PERLASM $@
> cmd_perlasm = $(PERL) $< > $@
> $(obj)/%.S: $(src)/%.pl FORCE
> diff --git a/arch/x86/crypto/lea_avx2_glue.c b/arch/x86/crypto/lea_avx2_glue.c
> new file mode 100644
> index 000000000000..532958d3caa5
> --- /dev/null
> +++ b/arch/x86/crypto/lea_avx2_glue.c
> @@ -0,0 +1,1112 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Glue Code for the SSE2/MOVBE/AVX2 assembler instructions for the LEA Cipher
> + *
> + * Copyright (c) 2023 National Security Research.
> + * Author: Dongsoo Lee <letrhee@nsr.re.kr>
> + */
> +
> +#include <asm/simd.h>
> +#include <asm/unaligned.h>
> +#include <crypto/algapi.h>
> +#include <crypto/ctr.h>
> +#include <crypto/internal/simd.h>
> +#include <crypto/scatterwalk.h>
> +#include <crypto/skcipher.h>
> +#include <crypto/internal/skcipher.h>
> +#include <linux/err.h>
> +#include <linux/module.h>
> +#include <linux/types.h>
> +
> +#include <crypto/lea.h>
> +#include <crypto/xts.h>
> +#include "ecb_cbc_helpers.h"
> +
> +#define SIMD_KEY_ALIGN 16
> +#define SIMD_ALIGN_ATTR __aligned(SIMD_KEY_ALIGN)
> +
> +struct lea_xts_ctx {
> + u8 raw_crypt_ctx[sizeof(struct crypto_lea_ctx)] SIMD_ALIGN_ATTR;
> + u8 raw_tweak_ctx[sizeof(struct crypto_lea_ctx)] SIMD_ALIGN_ATTR;
> +};
The typing here is a bit goofy. What's wrong with:

	struct lea_xts_ctx {
		struct crypto_lea_ctx crypt_ctx SIMD_ALIGN_ATTR;
		struct crypto_lea_ctx tweak_ctx SIMD_ALIGN_ATTR;
	};

? You end up with the same-sized structure but you don't have to cast
it as much.
> +struct _lea_u128 {
> + u64 v0, v1;
> +};
> +
> +static inline void xor_1blk(u8 *out, const u8 *in1, const u8 *in2)
> +{
> + const struct _lea_u128 *_in1 = (const struct _lea_u128 *)in1;
> + const struct _lea_u128 *_in2 = (const struct _lea_u128 *)in2;
> + struct _lea_u128 *_out = (struct _lea_u128 *)out;
> +
> + _out->v0 = _in1->v0 ^ _in2->v0;
> + _out->v1 = _in1->v1 ^ _in2->v1;
> +}
> +
> +static inline void xts_next_tweak(u8 *out, const u8 *in)
> +{
> + const u64 *_in = (const u64 *)in;
> + u64 *_out = (u64 *)out;
> + u64 v0 = _in[0];
> + u64 v1 = _in[1];
> + u64 carry = (u64)(((s64)v1) >> 63);
> +
> + v1 = (v1 << 1) ^ (v0 >> 63);
> + v0 = (v0 << 1) ^ ((u64)carry & 0x87);
> +
> + _out[0] = v0;
> + _out[1] = v1;
> +}
I don't really care either way, but it's interesting that these two
adjacent functions both deal with a pair of adjacent 64-bit values: one
defines a structure with two u64s and the other treats them as an array.
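For consistency's sake, the second one could just as easily reuse the
struct. Something like this (untested, but it should behave identically):

	static inline void xts_next_tweak(u8 *out, const u8 *in)
	{
		const struct _lea_u128 *_in = (const struct _lea_u128 *)in;
		struct _lea_u128 *_out = (struct _lea_u128 *)out;
		u64 v0 = _in->v0;
		u64 v1 = _in->v1;
		/* arithmetic shift smears the top bit into a 0/~0 mask */
		u64 carry = (u64)(((s64)v1) >> 63);

		_out->v1 = (v1 << 1) ^ (v0 >> 63);
		_out->v0 = (v0 << 1) ^ (carry & 0x87);
	}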
> +static int xts_encrypt_8way(struct skcipher_request *req)
> +{
...
It's kinda a shame that there isn't more code shared here between, for
instance, the 4way and 8way functions. But I guess this crypto code
tends to be merged and then very rarely fixed up after.
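The ecb_cbc_helpers.h macros that this file already pulls in show the
pattern for the ECB/CBC paths: one function that dispatches to each
width. Purely for illustration (the function names here are made up),
an ECB path built on them looks like:

	static int ecb_encrypt(struct skcipher_request *req)
	{
		ECB_WALK_START(req, LEA_BLOCK_SIZE, LEA_PARALLEL_BLOCKS);
		ECB_BLOCK(8, lea_avx2_ecb_enc_8way);	/* AVX2, 8 blocks */
		ECB_BLOCK(4, lea_sse2_ecb_enc_4way);	/* SSE2, 4 blocks */
		ECB_BLOCK(1, lea_encrypt);		/* scalar tail */
		ECB_WALK_END();
	}

A similar width-parameterized helper could presumably be factored out
for the XTS walk instead of duplicating it per width.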
> +static int xts_lea_set_key(struct crypto_skcipher *tfm, const u8 *key,
> + u32 keylen)
> +{
> + struct crypto_tfm *tfm_ctx = crypto_skcipher_ctx(tfm);
> + struct lea_xts_ctx *ctx = crypto_tfm_ctx(tfm_ctx);
> +
> + struct crypto_lea_ctx *crypt_key =
> + (struct crypto_lea_ctx *)(ctx->raw_crypt_ctx);
> + struct crypto_lea_ctx *tweak_key =
> + (struct crypto_lea_ctx *)(ctx->raw_tweak_ctx);
These were those goofy casts that can go away if the typing is a bit
more careful.
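With the struct from above, the whole thing collapses to something like
this (assuming the lea_set_key() helper that patch 1/3 adds):

	static int xts_lea_set_key(struct crypto_skcipher *tfm, const u8 *key,
				   u32 keylen)
	{
		struct lea_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
		int err;

		err = xts_verify_key(tfm, key, keylen);
		if (err)
			return err;

		/* first half keys the data cipher, second half the tweak */
		err = lea_set_key(&ctx->crypt_ctx, key, keylen / 2);
		if (err)
			return err;

		return lea_set_key(&ctx->tweak_ctx, key + keylen / 2,
				   keylen / 2);
	}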
...
> +static struct simd_skcipher_alg *lea_simd_algs[ARRAY_SIZE(lea_simd_avx2_algs)];
> +
> +static int __init crypto_lea_avx2_init(void)
> +{
> + const char *feature_name;
> +
> + if (!boot_cpu_has(X86_FEATURE_XMM2)) {
> + pr_info("SSE2 instructions are not detected.\n");
> + return -ENODEV;
> + }
> +
> + if (!boot_cpu_has(X86_FEATURE_MOVBE)) {
> + pr_info("MOVBE instructions are not detected.\n");
> + return -ENODEV;
> + }
> +
> + if (!boot_cpu_has(X86_FEATURE_AVX2) || !boot_cpu_has(X86_FEATURE_AVX)) {
> + pr_info("AVX2 instructions are not detected.\n");
> + return -ENODEV;
> + }
> +
> + if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
> + &feature_name)) {
> + pr_info("CPU feature '%s' is not supported.\n", feature_name);
> + return -ENODEV;
> + }
This looks suspect.
It requires *ALL* of XMM2, MOVBE, AVX, AVX2 and XSAVE to be supported
before *ANY* of these can be used. In other cipher code that I've seen,
the AVX/YMM acceleration is separated from the pure SSE2/XMM
acceleration functions so that CPUs with only SSE2 can still benefit.
Either this is wrong, or there is something subtle going on that I'm
missing.
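Compare serpent, which has separate serpent_sse2 and serpent_avx2
modules. Even inside a single module, registration can be split so that
SSE2-only CPUs keep the 4-way paths. A rough sketch (the
lea_sse2_algs/lea_avx2_algs split of the existing array is hypothetical,
and the MOVBE handling is elided):

	static int __init crypto_lea_avx2_init(void)
	{
		const char *feature_name;
		int err;

		if (!boot_cpu_has(X86_FEATURE_XMM2)) {
			pr_info("SSE2 instructions are not detected.\n");
			return -ENODEV;
		}

		/* The 4-way SSE2 algorithms only need XMM2. */
		err = simd_register_skciphers_compat(lea_sse2_algs,
						     ARRAY_SIZE(lea_sse2_algs),
						     lea_simd_sse2_algs);
		if (err)
			return err;

		/* The 8-way paths also need AVX2 and usable YMM state. */
		if (!boot_cpu_has(X86_FEATURE_AVX2) ||
		    !cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
				       &feature_name))
			return 0;

		err = simd_register_skciphers_compat(lea_avx2_algs,
						     ARRAY_SIZE(lea_avx2_algs),
						     lea_simd_avx2_algs);
		if (err)
			simd_unregister_skciphers(lea_sse2_algs,
						  ARRAY_SIZE(lea_sse2_algs),
						  lea_simd_sse2_algs);
		return err;
	}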