From: Eric Biggers <ebiggers@kernel.org>
To: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: "open list:HARDWARE RANDOM NUMBER GENERATOR CORE"
	<linux-crypto@vger.kernel.org>,
	linux-fscrypt@vger.kernel.org,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Paul Crowley <paulcrowley@google.com>,
	Greg Kaiser <gkaiser@google.com>,
	Michael Halcrow <mhalcrow@google.com>,
	"Jason A . Donenfeld" <Jason@zx2c4.com>,
	Samuel Neves <samuel.c.p.neves@gmail.com>,
	Tomer Ashur <tomer.ashur@esat.kuleuven.be>
Subject: Re: [RFC PATCH v2 09/12] crypto: nhpoly1305 - add NHPoly1305 support
Date: Mon, 22 Oct 2018 15:40:10 -0700
Message-ID: <20181022224008.GB59695@gmail.com>
In-Reply-To: <CAKv+Gu_vXmfNQT8j=G_Zz5C3-zDPPEQ2ne6ZZQw8mD0rifO8qA@mail.gmail.com>

Hi Ard,

On Mon, Oct 22, 2018 at 07:25:27PM -0300, Ard Biesheuvel wrote:
> >
> > Hmm, I'm actually leaning towards the following instead.  Unrolling multiple
> > strides to try to reduce loads of the keys doesn't seem worthwhile in the C
> > implementation; for one, it bloats the code size a lot
> > (412 => 2332 bytes on arm32).
> >
> > static void nh_generic(const u32 *key, const u8 *message, size_t message_len,
> >                        __le64 hash[NH_NUM_PASSES])
> > {
> >         u64 sums[4] = { 0, 0, 0, 0 };
> >
> >         BUILD_BUG_ON(NH_PAIR_STRIDE != 2);
> >         BUILD_BUG_ON(NH_NUM_PASSES != 4);
> >
> >         while (message_len) {
> >                 u32 m0 = get_unaligned_le32(message + 0);
> >                 u32 m1 = get_unaligned_le32(message + 4);
> >                 u32 m2 = get_unaligned_le32(message + 8);
> >                 u32 m3 = get_unaligned_le32(message + 12);
> >
> >                 sums[0] += (u64)(u32)(m0 + key[ 0]) * (u32)(m2 + key[ 2]);
> >                 sums[1] += (u64)(u32)(m0 + key[ 4]) * (u32)(m2 + key[ 6]);
> >                 sums[2] += (u64)(u32)(m0 + key[ 8]) * (u32)(m2 + key[10]);
> >                 sums[3] += (u64)(u32)(m0 + key[12]) * (u32)(m2 + key[14]);
> >                 sums[0] += (u64)(u32)(m1 + key[ 1]) * (u32)(m3 + key[ 3]);
> >                 sums[1] += (u64)(u32)(m1 + key[ 5]) * (u32)(m3 + key[ 7]);
> >                 sums[2] += (u64)(u32)(m1 + key[ 9]) * (u32)(m3 + key[11]);
> >                 sums[3] += (u64)(u32)(m1 + key[13]) * (u32)(m3 + key[15]);
> 
> Are these (u32) casts really necessary? All the addends are u32 types,
> so I'd expect each (x + y) subexpression to have a u32 type already as
> well. Or am I missing something?
> 

The (u32) casts are only necessary when sizeof(int) > sizeof(u32), as then the
addends will be promoted to 'int'.  Of course, that's never the case for the
Linux kernel.  But I prefer it to be as robust and well-defined as possible,
since people might use this as a reference when coding other implementations,
which could end up finding their way into unusual and/or future platforms.

- Eric
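
To illustrate the promotion rule described above, here is a small sketch in
standalone C (it is not part of the patch).  Since mainstream platforms do not
have an int wider than 32 bits, it scales the problem down: uint16_t operands
stand in for u32, on a platform where int is assumed to be 32 bits.  The
function names add_uncast() and add_cast() are hypothetical.

```c
#include <stdint.h>

/*
 * No cast on the sum: both uint16_t operands are promoted to int before the
 * addition, so the result is NOT reduced modulo 2^16.
 * Assumes int is wider than 16 bits (true on mainstream platforms).
 */
static uint64_t add_uncast(uint16_t a, uint16_t k)
{
	return (uint64_t)(a + k);
}

/*
 * Cast on the sum: the result is truncated back to 16 bits before widening,
 * giving the modular addition regardless of the width of int.  This mirrors
 * the (u64)(u32)(m0 + key[0]) pattern in nh_generic().
 */
static uint64_t add_cast(uint16_t a, uint16_t k)
{
	return (uint64_t)(uint16_t)(a + k);
}
```

With a = 0xFFFF and k = 1, add_uncast() yields 65536 while add_cast() yields 0.
The latter is the wrap-around that NH relies on, at 32 bits instead of 16.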
