From: Eric Biggers <ebiggers@kernel.org>
To: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	linux-crypto@vger.kernel.org, davem@davemloft.net,
	gregkh@linuxfoundation.org, Samuel Neves <sneves@dei.uc.pt>,
	Andy Lutomirski <luto@kernel.org>,
	Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Subject: Re: [PATCH net-next v5 03/20] zinc: ChaCha20 generic C implementation and selftest
Date: Tue, 18 Sep 2018 18:08:17 -0700
Message-ID: <20180919010816.GD74746@gmail.com>
In-Reply-To: <20180918161646.19105-4-Jason@zx2c4.com>

On Tue, Sep 18, 2018 at 06:16:29PM +0200, Jason A. Donenfeld wrote:
> diff --git a/lib/zinc/chacha20/chacha20.c b/lib/zinc/chacha20/chacha20.c
> new file mode 100644
> index 000000000000..3f00e1edd4c8
> --- /dev/null
> +++ b/lib/zinc/chacha20/chacha20.c
> @@ -0,0 +1,193 @@
> +/* SPDX-License-Identifier: MIT
> + *
> + * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
> + *
> + * Implementation of the ChaCha20 stream cipher.
> + *
> + * Information: https://cr.yp.to/chacha.html
> + */
> +
> +#include <zinc/chacha20.h>
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/init.h>
> +#include <crypto/algapi.h>
> +
> +#ifndef HAVE_CHACHA20_ARCH_IMPLEMENTATION
> +void __init chacha20_fpu_init(void)
> +{
> +}
> +static inline bool chacha20_arch(u8 *out, const u8 *in, const size_t len,
> +				 const u32 key[8], const u32 counter[4],
> +				 simd_context_t *simd_context)
> +{
> +	return false;
> +}
> +static inline bool hchacha20_arch(u8 *derived_key, const u8 *nonce,
> +				  const u8 *key, simd_context_t *simd_context)
> +{
> +	return false;
> +}
> +#endif
> +
> +#define EXPAND_32_BYTE_K 0x61707865U, 0x3320646eU, 0x79622d32U, 0x6b206574U
> +
> +#define QUARTER_ROUND(x, a, b, c, d) ( \
> +	x[a] += x[b], \
> +	x[d] = rol32((x[d] ^ x[a]), 16), \
> +	x[c] += x[d], \
> +	x[b] = rol32((x[b] ^ x[c]), 12), \
> +	x[a] += x[b], \
> +	x[d] = rol32((x[d] ^ x[a]), 8), \
> +	x[c] += x[d], \
> +	x[b] = rol32((x[b] ^ x[c]), 7) \
> +)
> +
> +#define C(i, j) (i * 4 + j)
> +
> +#define DOUBLE_ROUND(x) ( \
> +	/* Column Round */ \
> +	QUARTER_ROUND(x, C(0, 0), C(1, 0), C(2, 0), C(3, 0)), \
> +	QUARTER_ROUND(x, C(0, 1), C(1, 1), C(2, 1), C(3, 1)), \
> +	QUARTER_ROUND(x, C(0, 2), C(1, 2), C(2, 2), C(3, 2)), \
> +	QUARTER_ROUND(x, C(0, 3), C(1, 3), C(2, 3), C(3, 3)), \
> +	/* Diagonal Round */ \
> +	QUARTER_ROUND(x, C(0, 0), C(1, 1), C(2, 2), C(3, 3)), \
> +	QUARTER_ROUND(x, C(0, 1), C(1, 2), C(2, 3), C(3, 0)), \
> +	QUARTER_ROUND(x, C(0, 2), C(1, 3), C(2, 0), C(3, 1)), \
> +	QUARTER_ROUND(x, C(0, 3), C(1, 0), C(2, 1), C(3, 2)) \
> +)
> +
> +#define TWENTY_ROUNDS(x) ( \
> +	DOUBLE_ROUND(x), \
> +	DOUBLE_ROUND(x), \
> +	DOUBLE_ROUND(x), \
> +	DOUBLE_ROUND(x), \
> +	DOUBLE_ROUND(x), \
> +	DOUBLE_ROUND(x), \
> +	DOUBLE_ROUND(x), \
> +	DOUBLE_ROUND(x), \
> +	DOUBLE_ROUND(x), \
> +	DOUBLE_ROUND(x) \
> +)

Does this consistently perform as well as an implementation that organizes the
operations such that the quarterrounds for all columns/diagonals are
interleaved?  As-is, there are tight dependencies in QUARTER_ROUND() (as well as
in the existing chacha20_block() in lib/chacha20.c, for that matter), so we're
heavily depending on the compiler to do the needed interleaving so as to not get
potentially disastrous performance.  Making it explicit could be a good idea.
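
To illustrate what "explicit" interleaving could look like, here is a minimal
userspace sketch, not code from the patch.  It relies on the fact that the four
quarter-rounds of a column round (and likewise of a diagonal round) touch
disjoint words of the state, so each add/xor/rotate step can be issued for all
four before moving on to the next step.  The names quarter_rounds_interleaved(),
double_round_interleaved() and the index tables are hypothetical, and rol32()
is a local stand-in for the kernel helper.  Whether this actually beats the
compiler's own scheduling would have to be measured per architecture.

```c
#include <stdint.h>

/* Local stand-in for the kernel's rol32(); s must be in 1..31 here. */
static inline uint32_t rol32(uint32_t v, unsigned int s)
{
	return (v << s) | (v >> (32 - s));
}

/* Reference: one complete quarter-round at a time, as QUARTER_ROUND() does. */
static void qr(uint32_t x[16], int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = rol32(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = rol32(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = rol32(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = rol32(x[b] ^ x[c], 7);
}

static void double_round_sequential(uint32_t x[16])
{
	/* Column round */
	qr(x, 0, 4, 8, 12); qr(x, 1, 5, 9, 13);
	qr(x, 2, 6, 10, 14); qr(x, 3, 7, 11, 15);
	/* Diagonal round */
	qr(x, 0, 5, 10, 15); qr(x, 1, 6, 11, 12);
	qr(x, 2, 7, 8, 13); qr(x, 3, 4, 9, 14);
}

/* Hypothetical index tables: each row is {a, b, c, d} for one quarter-round. */
static const uint8_t columns[4][4] = {
	{ 0, 4, 8, 12 }, { 1, 5, 9, 13 }, { 2, 6, 10, 14 }, { 3, 7, 11, 15 }
};
static const uint8_t diagonals[4][4] = {
	{ 0, 5, 10, 15 }, { 1, 6, 11, 12 }, { 2, 7, 8, 13 }, { 3, 4, 9, 14 }
};

/*
 * Run four independent quarter-rounds in lockstep: each of the four
 * add/xor/rotate steps is applied to all four quarter-rounds before the
 * next step starts, so adjacent operations do not depend on each other.
 */
static void quarter_rounds_interleaved(uint32_t x[16], const uint8_t q[4][4])
{
	int i;

	for (i = 0; i < 4; i++) {
		x[q[i][0]] += x[q[i][1]];
		x[q[i][3]] = rol32(x[q[i][3]] ^ x[q[i][0]], 16);
	}
	for (i = 0; i < 4; i++) {
		x[q[i][2]] += x[q[i][3]];
		x[q[i][1]] = rol32(x[q[i][1]] ^ x[q[i][2]], 12);
	}
	for (i = 0; i < 4; i++) {
		x[q[i][0]] += x[q[i][1]];
		x[q[i][3]] = rol32(x[q[i][3]] ^ x[q[i][0]], 8);
	}
	for (i = 0; i < 4; i++) {
		x[q[i][2]] += x[q[i][3]];
		x[q[i][1]] = rol32(x[q[i][1]] ^ x[q[i][2]], 7);
	}
}

static void double_round_interleaved(uint32_t x[16])
{
	quarter_rounds_interleaved(x, columns);
	quarter_rounds_interleaved(x, diagonals);
}
```

Both variants compute the same state; only the order in which independent
operations appear in the source differs.  A real implementation would likely
unroll the loops by hand or with macros rather than go through index tables.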

> +
> +static void chacha20_block_generic(__le32 *stream, u32 *state)
> +{
> +	u32 x[CHACHA20_BLOCK_WORDS];
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(x); ++i)
> +		x[i] = state[i];
> +
> +	TWENTY_ROUNDS(x);
> +
> +	for (i = 0; i < ARRAY_SIZE(x); ++i)
> +		stream[i] = cpu_to_le32(x[i] + state[i]);
> +
> +	++state[12];
> +}
> +
> +static void chacha20_generic(u8 *out, const u8 *in, u32 len, const u32 key[8],
> +			     const u32 counter[4])
> +{
> +	__le32 buf[CHACHA20_BLOCK_WORDS];
> +	u32 x[] = {
> +		EXPAND_32_BYTE_K,
> +		key[0], key[1], key[2], key[3],
> +		key[4], key[5], key[6], key[7],
> +		counter[0], counter[1], counter[2], counter[3]
> +	};
> +
> +	if (out != in)
> +		memmove(out, in, len);
> +
> +	while (len >= CHACHA20_BLOCK_SIZE) {
> +		chacha20_block_generic(buf, x);
> +		crypto_xor(out, (u8 *)buf, CHACHA20_BLOCK_SIZE);
> +		len -= CHACHA20_BLOCK_SIZE;
> +		out += CHACHA20_BLOCK_SIZE;
> +	}
> +	if (len) {
> +		chacha20_block_generic(buf, x);
> +		crypto_xor(out, (u8 *)buf, len);
> +	}
> +}

If crypto_xor_cpy() is used instead of crypto_xor(), and 'in' is incremented
along with 'out', then the memmove() is not needed.
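
A rough userspace sketch of that loop shape follows.  It is not the patch's
code: xor_cpy() is a hypothetical stand-in for the kernel's crypto_xor_cpy(),
and toy_keystream_block() is a placeholder generator (not ChaCha20) used only
so the example is self-contained.  The point is that writing out = in ^ buf
directly handles both the in-place and the out-of-place case without first
copying the input.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_BLOCK_SIZE 64

/* Stand-in for crypto_xor_cpy(): dst = src1 ^ src2; dst may equal src1. */
static void xor_cpy(uint8_t *dst, const uint8_t *src1, const uint8_t *src2,
		    size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		dst[i] = src1[i] ^ src2[i];
}

/* Placeholder keystream generator; NOT a real cipher. */
static void toy_keystream_block(uint8_t buf[TOY_BLOCK_SIZE], uint32_t *counter)
{
	size_t i;

	for (i = 0; i < TOY_BLOCK_SIZE; i++)
		buf[i] = (uint8_t)(*counter * 131u + i * 7u + 1u);
	++*counter;
}

/* XOR 'in' with the keystream into 'out', advancing both pointers. */
static void stream_xor(uint8_t *out, const uint8_t *in, size_t len,
		       uint32_t counter)
{
	uint8_t buf[TOY_BLOCK_SIZE];

	while (len >= TOY_BLOCK_SIZE) {
		toy_keystream_block(buf, &counter);
		xor_cpy(out, in, buf, TOY_BLOCK_SIZE);
		len -= TOY_BLOCK_SIZE;
		out += TOY_BLOCK_SIZE;
		in += TOY_BLOCK_SIZE;
	}
	if (len) {
		/* Partial final block: use only the first 'len' keystream bytes. */
		toy_keystream_block(buf, &counter);
		xor_cpy(out, in, buf, len);
	}
}
```

Calling stream_xor(buf, buf, len, 0) encrypts in place, while
stream_xor(dst, src, len, 0) with distinct buffers produces the same bytes in
dst and leaves src untouched.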

- Eric
