public inbox for linux-arm-kernel@lists.infradead.org
 help / color / mirror / Atom feed
From: Eric Biggers <ebiggers@kernel.org>
To: Nathan Huckleberry <nhuck@google.com>
Cc: linux-crypto@vger.kernel.org,
	Herbert Xu <herbert@gondor.apana.org.au>,
	"David S. Miller" <davem@davemloft.net>,
	linux-arm-kernel@lists.infradead.org,
	Paul Crowley <paulcrowley@google.com>,
	Sami Tolvanen <samitolvanen@google.com>,
	Ard Biesheuvel <ardb@kernel.org>
Subject: Re: [PATCH v4 7/8] crypto: arm64/polyval: Add PMULL accelerated implementation of POLYVAL
Date: Tue, 12 Apr 2022 22:53:34 -0700	[thread overview]
Message-ID: <YlZlXq96e96Bg6tL@sol.localdomain> (raw)
In-Reply-To: <20220412172816.917723-8-nhuck@google.com>

Many of the comments I made on the x86 version apply to this too, but a few
unique comments below:

On Tue, Apr 12, 2022 at 05:28:15PM +0000, Nathan Huckleberry wrote:
> Add hardware accelerated version of POLYVAL for ARM64 CPUs with
> Crypto Extension support.

As I've mentioned, it seems to be "Crypto Extensions", not "Crypto Extension".
That's what everything else in the kernel says, at least.  It's just a nit, but
it's unclear whether you intentionally decided to not change this, or missed it.

> diff --git a/arch/arm64/crypto/polyval-ce-core.S b/arch/arm64/crypto/polyval-ce-core.S
[..]
> +	.text
> +	.align	4
> +
> +	.arch	armv8-a+crypto
> +	.align	4

Having '.align' twice is redundant, right?

> +
> +.Lgstar:
> +	.quad	0xc200000000000000, 0xc200000000000000
> +
> +/*
> + * Computes the product of two 128-bit polynomials in X and Y and XORs the
> + * components of the 256-bit product into LO, MI, HI.
> + *
> + * X = [X_1 : X_0]
> + * Y = [Y_1 : Y_0]
> + *
> + * The multiplication produces four parts:
> + *   LOW: The polynomial given by performing carryless multiplication of X_0 and
> + *   Y_0
> + *   MID: The polynomial given by performing carryless multiplication of (X_0 +
> + *   X_1) and (Y_0 + Y_1)
> + *   HIGH: The polynomial given by performing carryless multiplication of X_1
> + *   and Y_1
> + *
> + * We compute:
> + *  LO += LOW
> + *  MI += MID
> + *  HI += HIGH
> + *
> + * Later, the 256-bit result can be extracted as:
> + *   [HI_1 : HI_0 + HI_1 + MI_1 + LO_1 :: LO_1 + HI_0 + MI_0 + LO_0 : LO_0]

What does the double colon mean?

> + * This step is done when computing the polynomial reduction for efficiency
> + * reasons.
> + */
> +.macro karatsuba1 X Y

A super brief explanation of why Karatsuba multiplication is used instead of
schoolbook multiplication would be helpful.

> +	X .req \X
> +	Y .req \Y
> +	ext	v25.16b, X.16b, X.16b, #8
> +	ext	v26.16b, Y.16b, Y.16b, #8
> +	eor	v25.16b, v25.16b, X.16b
> +	eor	v26.16b, v26.16b, Y.16b
> +	pmull	v27.1q, v25.1d, v26.1d
> +	pmull2	v28.1q, X.2d, Y.2d
> +	pmull	v29.1q, X.1d, Y.1d
> +	eor	MI.16b, MI.16b, v27.16b
> +	eor	HI.16b, HI.16b, v28.16b
> +	eor	LO.16b, LO.16b, v29.16b
> +	.unreq X
> +	.unreq Y
> +.endm

Maybe move the pmull into v27 and the eor into MI down a bit, since they have
more dependencies?  I.e.

	ext	v25.16b, X.16b, X.16b, #8
	ext	v26.16b, Y.16b, Y.16b, #8
	eor	v25.16b, v25.16b, X.16b
	eor	v26.16b, v26.16b, Y.16b
	pmull2	v28.1q, X.2d, Y.2d
	pmull	v29.1q, X.1d, Y.1d
	pmull	v27.1q, v25.1d, v26.1d
	eor	HI.16b, HI.16b, v28.16b
	eor	LO.16b, LO.16b, v29.16b
	eor	MI.16b, MI.16b, v27.16b

I don't know whether it will actually matter on arm64, but on lower-powered
arm32 CPUs this sort of thing definitely can matter.

> +.macro karatsuba1_store X Y
> +	X .req \X
> +	Y .req \Y
> +	ext	v25.16b, X.16b, X.16b, #8
> +	ext	v26.16b, Y.16b, Y.16b, #8
> +	eor	v25.16b, v25.16b, X.16b
> +	eor	v26.16b, v26.16b, Y.16b
> +	pmull	MI.1q, v25.1d, v26.1d
> +	pmull2	HI.1q, X.2d, Y.2d
> +	pmull	LO.1q, X.1d, Y.1d
> +	.unreq X
> +	.unreq Y
> +.endm

Likewise above.

> diff --git a/arch/arm64/crypto/polyval-ce-glue.c b/arch/arm64/crypto/polyval-ce-glue.c
[...]
> +struct polyval_async_ctx {
> +	struct cryptd_ahash *cryptd_tfm;
> +};

struct polyval_async_ctx is no longer used.

> +static int polyval_setkey(struct crypto_shash *tfm,
> +			const u8 *key, unsigned int keylen)
> +{
> +	struct polyval_ctx *ctx = crypto_shash_ctx(tfm);
> +	int i;
> +
> +	if (keylen != POLYVAL_BLOCK_SIZE)
> +		return -EINVAL;
> +
> +	BUILD_BUG_ON(sizeof(u128) != POLYVAL_BLOCK_SIZE);

Where does 'sizeof(u128)' come from?  This file doesn't use u128 at all.

- Eric

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2022-04-13  5:54 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-12 17:28 [PATCH v4 0/8] crypto: HCTR2 support Nathan Huckleberry
2022-04-12 17:28 ` [PATCH v4 1/8] crypto: xctr - Add XCTR support Nathan Huckleberry
2022-04-18 19:03   ` Eric Biggers
2022-04-12 17:28 ` [PATCH v4 2/8] crypto: polyval - Add POLYVAL support Nathan Huckleberry
2022-04-18 19:25   ` Eric Biggers
2022-04-12 17:28 ` [PATCH v4 3/8] crypto: hctr2 - Add HCTR2 support Nathan Huckleberry
2022-04-13  4:20   ` Eric Biggers
2022-04-18 20:46   ` Eric Biggers
2022-04-12 17:28 ` [PATCH v4 4/8] crypto: x86/aesni-xctr: Add accelerated implementation of XCTR Nathan Huckleberry
2022-04-14  7:00   ` Eric Biggers
2022-04-18 23:44   ` Eric Biggers
2022-04-19  0:13   ` Eric Biggers
2022-04-21 21:59     ` Nathan Huckleberry
2022-04-21 22:29       ` Eric Biggers
2022-04-12 17:28 ` [PATCH v4 5/8] crypto: arm64/aes-xctr: " Nathan Huckleberry
2022-04-19  4:33   ` Eric Biggers
2022-04-12 17:28 ` [PATCH v4 6/8] crypto: x86/polyval: Add PCLMULQDQ accelerated implementation of POLYVAL Nathan Huckleberry
2022-04-13  5:18   ` Eric Biggers
2022-04-18 21:36   ` Eric Biggers
2022-04-12 17:28 ` [PATCH v4 7/8] crypto: arm64/polyval: Add PMULL " Nathan Huckleberry
2022-04-13  5:53   ` Eric Biggers [this message]
2022-04-12 17:28 ` [PATCH v4 8/8] fscrypt: Add HCTR2 support for filename encryption Nathan Huckleberry
2022-04-13  6:10   ` Eric Biggers
2022-04-13  6:16     ` Ard Biesheuvel
2022-04-14  7:12       ` Eric Biggers
2022-04-14  7:15         ` Ard Biesheuvel
2022-04-18 18:05   ` Eric Biggers
2022-04-14 14:18 ` [PATCH v4 0/8] crypto: HCTR2 support Ard Biesheuvel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YlZlXq96e96Bg6tL@sol.localdomain \
    --to=ebiggers@kernel.org \
    --cc=ardb@kernel.org \
    --cc=davem@davemloft.net \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-crypto@vger.kernel.org \
    --cc=nhuck@google.com \
    --cc=paulcrowley@google.com \
    --cc=samitolvanen@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox