Linux cryptographic layer development
 help / color / mirror / Atom feed
* RE: [PATCH v5 2/4] siphash: add Nu{32,64} helpers
From: George Spelvin @ 2016-12-16 15:44 UTC (permalink / raw)
  To: ak, davem, David.Laight, ebiggers3, hannes, Jason,
	kernel-hardening, linux-crypto, linux-kernel, linux, luto, netdev,
	tom, torvalds, tytso, vegard.nossum
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6DB0240EFA@AcuExch.aculab.com>

Jason A. Donenfeld wrote:
> Isn't that equivalent to:
>	v0 = key[0];
>	v1 = key[1];
>	v2 = key[0] ^ (0x736f6d6570736575ULL ^ 0x646f72616e646f6dULL);
>	v3 = key[1] ^ (0x646f72616e646f6dULL ^ 0x7465646279746573ULL);

(Pre-XORing key[] with the first two constants which, if the constants
are random in the first place, can be a no-op.)  Other than the typo
in the v2 line, yes.  If they key is non-public, then you can xor an
arbitrary constant in to both halves to slightly speed up the startup.

(Nits: There's a typo in the v2 line, you don't need to parenthesize
associative operators like xor, and the "ull" suffix is redundant here.)

> Those constants also look like ASCII strings.

They are.  The ASCII is "somepseudorandomlygeneratedbytes".

> What cryptographic analysis has been done on the values?

They're "nothing up my sleeve numbers".

They're arbitrary numbers, and almost any other values would do exactly
as well.  The main properties are:

1) They're different (particulatly v0 != v2 and v1 != v3), and
2) Neither they, nor their xor, is rotationally symmetric like 0x55555555.
   (Because SipHash is mostly rotationally symmetric, broken only by the
   interruption of the carry chain at the msbit, it helps slightly
   to break this up at the beginning.)

Those exact values only matter for portability.  If you don't need anyone
else to be able to compute matching outputs, then you could use any other
convenient constants (like the MD5 round constants).

^ permalink raw reply

* Re: algif for compression?
From: abed mohammad kamaluddin @ 2016-12-16 14:11 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-crypto
In-Reply-To: <20161210081014.GA32746@gondor.apana.org.au>

>
> The compression interface is currently in a state of flux.  We
> should make it settle down first before exporting it to user-space.
>
> For a start it would be good to actually switch IPsec/IPcomp over
> to the new compression interface.

Thanks Herbert. Are there timelines or  ongoing efforts for moving
IPcomp/Ipsec to use acomp? Or any proposals that have been or need to
be taken up in this regard.

Thanks,
Abed

^ permalink raw reply

* Re: [PATCH] crypto: AF_ALG - fix memory management of aio with multiple iocbs
From: Stephan Müller @ 2016-12-16 13:58 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-crypto
In-Reply-To: <20161216123127.GA20406@gondor.apana.org.au>

Am Freitag, 16. Dezember 2016, 20:31:27 CET schrieb Herbert Xu:

Hi Herbert,
> > 
> > You are right, this will introduce a memleak. But with the immediate
> > freeing of sreq->tsg in the current code, the AIO interface cannot
> > support multiple IOCBs.
> > 
> > Thus, the entire memory handling in the AIO case seems broken.
> 
> Right, but can we please fix it properly? For example, you could
> save the original tsg in a new field and free that when you are
> done.

Absolutely, I concur. I will work on that.

Ciao
Stephan

^ permalink raw reply

* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jean-Philippe Aumasson @ 2016-12-16 13:22 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: George Spelvin, Andi Kleen, David Miller, David Laight,
	Eric Biggers, Hannes Frederic Sowa, kernel-hardening,
	Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
	Tom Herbert, Linus Torvalds, Theodore Ts'o, vegard.nossum,
	Daniel J . Bernstein
In-Reply-To: <CAHmME9pjoAsoct1sVDpFFuqaqutv9X7DGJ5OBQXRAS57KFimUA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 827 bytes --]

It needs some basic security review, which I'll try do next week (check for
security margin, optimality of rotation counts, etc.). But after a lot of
experience with this kind of construction (BLAKE, SipHash, NORX), I'm
confident it will be safe as it is.



On Fri, Dec 16, 2016 at 1:44 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:

> Hey JP,
>
> On Fri, Dec 16, 2016 at 9:08 AM, Jean-Philippe Aumasson
> <jeanphilippe.aumasson@gmail.com> wrote:
> > Here's a tentative HalfSipHash:
> > https://github.com/veorq/SipHash/blob/halfsiphash/halfsiphash.c
> >
> > Haven't computed the cycle count nor measured its speed.
>
> This is incredible. Really. Wow!
>
> I'll integrate this into my patchset and will write up some
> documentation about when one should be used over the other.
>
> Thanks again. Quite exciting.
>
> Jason
>

[-- Attachment #2: Type: text/html, Size: 1706 bytes --]

^ permalink raw reply

* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jason A. Donenfeld @ 2016-12-16 12:39 UTC (permalink / raw)
  To: Jean-Philippe Aumasson
  Cc: George Spelvin, Andi Kleen, David Miller, David Laight,
	Eric Biggers, Hannes Frederic Sowa, kernel-hardening,
	Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
	Tom Herbert, Linus Torvalds, Theodore Ts'o, vegard.nossum,
	Daniel J . Bernstein
In-Reply-To: <CAGiyFdd6_LVzUUfFcaqMyub1c2WPvWUzAQDCH+Aza-_t6mvmXg@mail.gmail.com>

Hey JP,

On Fri, Dec 16, 2016 at 9:08 AM, Jean-Philippe Aumasson
<jeanphilippe.aumasson@gmail.com> wrote:
> Here's a tentative HalfSipHash:
> https://github.com/veorq/SipHash/blob/halfsiphash/halfsiphash.c
>
> Haven't computed the cycle count nor measured its speed.

This is incredible. Really. Wow!

I'll integrate this into my patchset and will write up some
documentation about when one should be used over the other.

Thanks again. Quite exciting.

Jason

^ permalink raw reply

* Re: [PATCH] crypto: AF_ALG - fix memory management of aio with multiple iocbs
From: Herbert Xu @ 2016-12-16 12:31 UTC (permalink / raw)
  To: Stephan Müller; +Cc: linux-crypto
In-Reply-To: <2916745.RxhEJByb92@tauon.atsec.com>

On Fri, Dec 16, 2016 at 01:27:50PM +0100, Stephan Müller wrote:
> Am Freitag, 16. Dezember 2016, 19:54:36 CET schrieb Herbert Xu:
> 
> Hi Herbert,
> 
> > On Tue, Dec 13, 2016 at 09:42:45PM +0100, Stephan Müller wrote:
> > > +		/*
> > > +		 * The async operation may have processed only a subset of
> > > +		 * the data that was initially received from the caller.
> > > +		 * Thus, we only can release the data that a cipher operation
> > > +		 * processed.
> > > +		 */
> > > +		if (len < sg->length) {
> > > +			/* ensure that empty SGLs are not referenced any more */
> > > +			sreq->tsg = sg;
> > 
> > Hmm if you change sreq->tsg how is the original tsg ever going to
> > get freed?
> 
> You are right, this will introduce a memleak. But with the immediate freeing 
> of sreq->tsg in the current code, the AIO interface cannot support multiple 
> IOCBs.
> 
> Thus, the entire memory handling in the AIO case seems broken.

Right, but can we please fix it properly? For example, you could
save the original tsg in a new field and free that when you are
done.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH] crypto: AF_ALG - fix memory management of aio with multiple iocbs
From: Stephan Müller @ 2016-12-16 12:27 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-crypto
In-Reply-To: <20161216115436.GA19917@gondor.apana.org.au>

Am Freitag, 16. Dezember 2016, 19:54:36 CET schrieb Herbert Xu:

Hi Herbert,

> On Tue, Dec 13, 2016 at 09:42:45PM +0100, Stephan Müller wrote:
> > +		/*
> > +		 * The async operation may have processed only a subset of
> > +		 * the data that was initially received from the caller.
> > +		 * Thus, we only can release the data that a cipher operation
> > +		 * processed.
> > +		 */
> > +		if (len < sg->length) {
> > +			/* ensure that empty SGLs are not referenced any more */
> > +			sreq->tsg = sg;
> 
> Hmm if you change sreq->tsg how is the original tsg ever going to
> get freed?

You are right, this will introduce a memleak. But with the immediate freeing 
of sreq->tsg in the current code, the AIO interface cannot support multiple 
IOCBs.

Thus, the entire memory handling in the AIO case seems broken.
> 
> > +
> > +			/* advance the buffers to the unprocessed data */
> > +			sg->length -= len;
> > +			sg->offset += len;
> > +			return;
> > +		}
> > +
> > +		len -= sg->length;
> > +		put_page(page);
> > +	}
> > 
> >  	kfree(sreq->tsg);
> 
> Thanks,



Ciao
Stephan

^ permalink raw reply

* Re: [PATCH v2] crypto: marvell - Copy IVDIG before launching partial DMA ahash requests
From: Herbert Xu @ 2016-12-16 12:07 UTC (permalink / raw)
  To: Romain Perier
  Cc: boris.brezillon, arno, linux-crypto, jason, andrew,
	sebastian.hesselbarth, gregory.clement, thomas.petazzoni, nadavh,
	oferh, radioconfusion, romain.perier, stable
In-Reply-To: <20161214141507.19105-1-romain.perier@free-electrons.com>

Romain Perier <romain.perier@free-electrons.com> wrote:
> Currently, inner IV/DIGEST data are only copied once into the hash
> engines and not set explicitly before launching a request that is not a
> first frag. This is an issue especially when multiple ahash reqs are
> computed in parallel or chained with cipher request, as the state of the
> request being computed is not updated into the hash engine. It leads to
> non-deterministic corrupted digest results.
> 
> Fixes: commit 2786cee8e50b ("crypto: marvell - Move SRAM I/O operations to step functions")
> Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
> Cc: <stable@vger.kernel.org>

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH] crypto: AF_ALG - fix memory management of aio with multiple iocbs
From: Herbert Xu @ 2016-12-16 11:54 UTC (permalink / raw)
  To: Stephan Müller; +Cc: linux-crypto
In-Reply-To: <4632372.rm33NXUfDp@positron.chronox.de>

On Tue, Dec 13, 2016 at 09:42:45PM +0100, Stephan Müller wrote:
>
> +		/*
> +		 * The async operation may have processed only a subset of
> +		 * the data that was initially received from the caller.
> +		 * Thus, we only can release the data that a cipher operation
> +		 * processed.
> +		 */
> +		if (len < sg->length) {
> +			/* ensure that empty SGLs are not referenced any more */
> +			sreq->tsg = sg;

Hmm if you change sreq->tsg how is the original tsg ever going to
get freed?

> +
> +			/* advance the buffers to the unprocessed data */
> +			sg->length -= len;
> +			sg->offset += len;
> +			return;
> +		}
> +
> +		len -= sg->length;
> +		put_page(page);
> +	}
>  
>  	kfree(sreq->tsg);

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* (unknown), 
From: системы администратор @ 2016-12-16 10:46 UTC (permalink / raw)


внимания;

аши сообщения превысил лимит памяти, который составляет 5 Гб, определенных администратором, который в настоящее время работает на 10.9GB, Вы не сможете отправить или получить новую почту, пока вы повторно не проверить ваш почтовый ящик почты. Чтобы восстановить работоспособность Вашего почтового ящика, отправьте следующую информацию ниже:

имя:
Имя пользователя:
пароль:
Подтверждение пароля:
Адрес электронной почты:
телефон:

Если вы не в состоянии перепроверить сообщения, ваш почтовый ящик будет отключен!

Приносим извинения за неудобства.
Проверочный код: EN: Ru...776774990..2016
Почты технической поддержки ©2016

спасибо
системы администратор

^ permalink raw reply

* RE: [PATCH v5 2/4] siphash: add Nu{32,64} helpers
From: David Laight @ 2016-12-16 10:39 UTC (permalink / raw)
  To: 'Jason A. Donenfeld', Netdev,
	kernel-hardening@lists.openwall.com, LKML,
	linux-crypto@vger.kernel.org, Ted Tso, Hannes Frederic Sowa,
	Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin,
	Vegard Nossum, ak@linux.intel.com, davem@davemloft.net,
	luto@amacapital.net
In-Reply-To: <20161215203003.31989-3-Jason@zx2c4.com>

From: Jason A. Donenfeld
> Sent: 15 December 2016 20:30
> These restore parity with the jhash interface by providing high
> performance helpers for common input sizes.
...
> +#define PREAMBLE(len) \
> +	u64 v0 = 0x736f6d6570736575ULL; \
> +	u64 v1 = 0x646f72616e646f6dULL; \
> +	u64 v2 = 0x6c7967656e657261ULL; \
> +	u64 v3 = 0x7465646279746573ULL; \
> +	u64 b = ((u64)len) << 56; \
> +	v3 ^= key[1]; \
> +	v2 ^= key[0]; \
> +	v1 ^= key[1]; \
> +	v0 ^= key[0];

Isn't that equivalent to:
	v0 = key[0];
	v1 = key[1];
	v2 = key[0] ^ (0x736f6d6570736575ULL ^ 0x646f72616e646f6dULL);
	v3 = key[1] ^ (0x646f72616e646f6dULL ^ 0x7465646279746573ULL);

Those constants also look like ASCII strings.
What cryptographic analysis has been done on the values?

	David

^ permalink raw reply

* RE: [PATCH v5 3/4] secure_seq: use SipHash in place of MD5
From: David Laight @ 2016-12-16  9:59 UTC (permalink / raw)
  To: 'Jason A. Donenfeld', Netdev,
	kernel-hardening@lists.openwall.com, LKML,
	linux-crypto@vger.kernel.org, Ted Tso, Hannes Frederic Sowa,
	Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin,
	Vegard Nossum, ak@linux.intel.com, davem@davemloft.net,
	luto@amacapital.net
In-Reply-To: <20161215203003.31989-4-Jason@zx2c4.com>

From: Jason A. Donenfeld
> Sent: 15 December 2016 20:30
> This gives a clear speed and security improvement. Siphash is both
> faster and is more solid crypto than the aging MD5.
> 
> Rather than manually filling MD5 buffers, for IPv6, we simply create
> a layout by a simple anonymous struct, for which gcc generates
> rather efficient code. For IPv4, we pass the values directly to the
> short input convenience functions.
...
> diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
> index 88a8e429fc3e..c80583bf3213 100644
...
> +	const struct {
> +		struct in6_addr saddr;
> +		struct in6_addr daddr;
> +		__be16 sport;
> +		__be16 dport;
> +		u32 padding;
> +	} __aligned(SIPHASH_ALIGNMENT) combined = {
> +		.saddr = *(struct in6_addr *)saddr,
> +		.daddr = *(struct in6_addr *)daddr,
> +		.sport = sport,
> +		.dport = dport
> +	};

I think you should explicitly initialise the 'padding'.
It can do no harm and makes it obvious that it is necessary.

You are still putting over-aligned data on stack.
You only need to align it to the alignment of u64 (not the size of u64).
If an on-stack item has a stronger alignment requirement than the stack
the gcc has to generate two stack frames for the function.

If you assign to each field (instead of using initialisers) then you
can get the alignment by making the first member an anonymous union
of in6_addr and u64.

Oh - and wait a bit longer between revisions.

	David

^ permalink raw reply

* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jean-Philippe Aumasson @ 2016-12-16  8:08 UTC (permalink / raw)
  To: George Spelvin, ak, davem, David.Laight, ebiggers3, hannes, Jason,
	kernel-hardening, linux-crypto, linux-kernel, luto, netdev, tom,
	torvalds, tytso, vegard.nossum
  Cc: djb
In-Reply-To: <20161216034618.28276.qmail@ns.sciencehorizons.net>

[-- Attachment #1: Type: text/plain, Size: 4380 bytes --]

Here's a tentative HalfSipHash:
https://github.com/veorq/SipHash/blob/halfsiphash/halfsiphash.c

Haven't computed the cycle count nor measured its speed.

On Fri, Dec 16, 2016 at 4:46 AM George Spelvin <linux@sciencehorizons.net>
wrote:

> Jean-Philippe Aumasson wrote:
> > If a halved version of SipHash can bring significant performance boost
> > (with 32b words instead of 64b words) with an acceptable security level
> > (64-bit enough?) then we may design such a version.
>
> It would be fairly significant, a 2x speed benefit on a lot of 32-bit
> machines.
>
> First is the fact that a 64-bit SipHash round on a generic 32-bit machine
> requires not twice as many instructions, but more than three.
>
> Consider the core SipHash quarter-round operation:
>         a += b;
>         b = rotate_left(b, k)
>         b ^= a
>
> The add and xor are equivalent between 32- and 64-bit rounds; twice the
> instructions do twice the work.  (There's a dependency via the carry
> bit between the two halves of the add, but it ends up not being on the
> critical path even in a superscalar implementation.)
>
> The problem is the rotates.  Although some particularly nice code is
> possible on 32-bit ARM due to its support for shift-and-xor operations,
> on a generic 32-bit CPU the rotate grows to 6 instructions with a 2-cycle
> dependency chain (more in practice because barrel shifters are large and
> even quad-issue CPUs can't do 4 shifts per cycle):
>
>         temp_lo = b_lo >> (32-k)
>         temp_hi = b_hi >> (32-k)
>         b_lo <<= k
>         b_hi <<= k
>         b_lo ^= temp_hi
>         b_hi ^= temp_lo
>
> The resultant instruction counts and (assuming wide issue)
> latencies are:
>
>         64-bit SipHash  "Half" SipHash
>         Inst.   Latency Inst.   Latency
>          10      3        3      2      Quarter round
>          40      6       12      4      Full round
>          80     12       24      8      Two rounds
>          82     13       26      9      Mix in one word
>          82     13       52     18      Mix in 64 bits
>         166     26       61     18      Four round finalization + final XOR
>         248     39      113     36      Hash 64 bits
>         330     52      165     54      Hash 128 bits
>         412     65      217     72      Hash 192 bits
>
> While the ideal latencies are actually better for the 64-bit algorithm,
> that requires an unrealistic 6+-wide superscalar implementation that's
> more than twice as wide as the 64-bit code requires (which is already
> optimized for quad-issue).  For a 1- or 2-wide processor, the instruction
> counts dominate, and not only does the 64-bit algorithm take 60% more
> time to mix in the same number of bytes, but the finalization rounds
> bring the ratio to 2:1 for small inputs.
>
> (And I haven't included the possible savings if the input size is an odd
> number of 32-bit words, such as networking applications which include
> the source/dest port numbers.)
>
>
> Notes on particular processors:
> - x86 can do a 64-bit rotate in 3 instructions and 2 cycles using
>   the SHLD/SHRD instructions instead:
>         movl    %b_hi, %temp
>         shldl   $k, %b_lo, %b_hi
>         shldl   $k, %temp, %b_lo
>   ... but as I mentioned the problem is registers.  SipHash needs 8 32-bit
>   words plus at least one temporary, and 32-bit x86 has only 7 available.
>   (And compilers can rarely manage to keep more than 6 of them busy.)
> - 64-bit SipHash is particularly efficient on 32-bit ARM due to its
>   support for shift-and-op instructions.  The 64-bit shift and following
>   xor can be done in 4 instructions.  So the only benefit is from the
>   reduced finalization.
> - Double-width adds cost a little more on CPUs like MIPS and RISC-V without
>   condition codes.
> - Certain particularly crappy uClinux processors with slow shifts
>   (68000, anyone?) really suffer from extra shifts.
>
> One *weakly* requested feature: It might simplify some programming
> interfaces if we could use the same key for multiple hash tables with a
> 1-word "tweak" (e.g. pointer to the hash table, so it could be assumed
> non-zero if that helped) to make distinct functions.  That would let us
> more safely use a global key for multiple small hash tables without the
> need to add code to generate and store key material for each place that
> an unkeyed hash is replaced.
>

[-- Attachment #2: Type: text/html, Size: 6883 bytes --]

^ permalink raw reply

* Re: [RFC PATCH v2] crypto: Add IV generation algorithms
From: Binoy Jayan @ 2016-12-16  5:55 UTC (permalink / raw)
  To: Milan Broz
  Cc: Oded, Ofir, Herbert Xu, David S. Miller, linux-crypto, Mark Brown,
	Arnd Bergmann, Linux kernel mailing list, Alasdair Kergon,
	Mike Snitzer, dm-devel, Shaohua Li, linux-raid, Rajendra
In-Reply-To: <d6d92865-98fa-4d02-035f-9080bc265c35@gmail.com>

Hi Milan,

On 13 December 2016 at 15:31, Milan Broz <gmazyland@gmail.com> wrote:

> I think that IV generators should not modify or read encrypted data directly,
> it should only generate IV.

I was trying to find more information about what you said and how a
iv generator should be written. I saw two examples of IV generators
too used with AEAD ciphers (crypto/seqiv.c and crypto/echainiv.c)

Excerpt from crypto api doc:
http://www.chronox.de/crypto-API/crypto/architecture.html#crypto-api-cipher-references-and-priority

2. Now, SEQIV uses the AEAD API function calls to invoke the associated
AEAD cipher. In our case, during the instantiation of SEQIV, the cipher
handle for GCM is provided to SEQIV. This means that SEQIV invokes
AEAD cipher operations with the GCM cipher handle.

Here, it says seqiv invokes cipher operations. However the code crypto/seqiv.c
does not look similar to how the modes are implemented which is confusing. I
was looking for an example of an IV generator used with a regular block cipher
and not a AEAD cipher. Could you point me out to some?

Thanks,
Binoy

^ permalink raw reply

* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: George Spelvin @ 2016-12-16  3:46 UTC (permalink / raw)
  To: ak, davem, David.Laight, ebiggers3, hannes, Jason,
	jeanphilippe.aumasson, kernel-hardening, linux-crypto,
	linux-kernel, linux, luto, netdev, tom, torvalds, tytso,
	vegard.nossum
  Cc: djb
In-Reply-To: <CAGiyFdfmiCMyHvAg=5sGh8KjBBrF0Wb4Qf=JLzJqUAx4yFSS3Q@mail.gmail.com>

Jean-Philippe Aumasson wrote:
> If a halved version of SipHash can bring significant performance boost
> (with 32b words instead of 64b words) with an acceptable security level
> (64-bit enough?) then we may design such a version.

It would be fairly significant, a 2x speed benefit on a lot of 32-bit
machines.

First is the fact that a 64-bit SipHash round on a generic 32-bit machine
requires not twice as many instructions, but more than three.

Consider the core SipHash quarter-round operation:
	a += b;
	b = rotate_left(b, k)
	b ^= a

The add and xor are equivalent between 32- and 64-bit rounds; twice the
instructions do twice the work.  (There's a dependency via the carry
bit between the two halves of the add, but it ends up not being on the
critical path even in a superscalar implementation.)

The problem is the rotates.  Although some particularly nice code is
possible on 32-bit ARM due to its support for shift-and-xor operations,
on a generic 32-bit CPU the rotate grows to 6 instructions with a 2-cycle
dependency chain (more in practice because barrel shifters are large and
even quad-issue CPUs can't do 4 shifts per cycle):

	temp_lo = b_lo >> (32-k)
	temp_hi = b_hi >> (32-k)
	b_lo <<= k
	b_hi <<= k
	b_lo ^= temp_hi
	b_hi ^= temp_lo

The resultant instruction counts and (assuming wide issue)
latencies are:

	64-bit SipHash	"Half" SipHash
	Inst.	Latency	Inst.	Latency
	 10	 3	  3	 2	Quarter round
	 40	 6	 12	 4	Full round
	 80	12	 24	 8	Two rounds
	 82	13	 26	 9	Mix in one word
	 82	13	 52	18	Mix in 64 bits
	166	26	 61	18	Four round finalization + final XOR
	248	39	113	36	Hash 64 bits
	330	52	165	54	Hash 128 bits
	412	65	217	72	Hash 192 bits

While the ideal latencies are actually better for the 64-bit algorithm,
that requires an unrealistic 6+-wide superscalar implementation that's
more than twice as wide as the 64-bit code requires (which is already
optimized for quad-issue).  For a 1- or 2-wide processor, the instruction
counts dominate, and not only does the 64-bit algorithm take 60% more
time to mix in the same number of bytes, but the finalization rounds
bring the ratio to 2:1 for small inputs.

(And I haven't included the possible savings if the input size is an odd
number of 32-bit words, such as networking applications which include
the source/dest port numbers.)


Notes on particular processors:
- x86 can do a 64-bit rotate in 3 instructions and 2 cycles using
  the SHLD/SHRD instructions instead:
	movl	%b_hi, %temp
	shldl	$k, %b_lo, %b_hi
	shldl	$k, %temp, %b_lo
  ... but as I mentioned the problem is registers.  SipHash needs 8 32-bit
  words plus at least one temporary, and 32-bit x86 has only 7 available.
  (And compilers can rarely manage to keep more than 6 of them busy.)
- 64-bit SipHash is particularly efficient on 32-bit ARM due to its
  support for shift-and-op instructions.  The 64-bit shift and following
  xor can be done in 4 instructions.  So the only benefit is from the
  reduced finalization.
- Double-width adds cost a little more on CPUs like MIPS and RISC-V without
  condition codes.
- Certain particularly crappy uClinux processors with slow shifts
  (68000, anyone?) really suffer from extra shifts.

One *weakly* requested feature: It might simplify some programming
interfaces if we could use the same key for multiple hash tables with a
1-word "tweak" (e.g. pointer to the hash table, so it could be assumed
non-zero if that helped) to make distinct functions.  That would let us
more safely use a global key for multiple small hash tables without the
need to add code to generate and store key material for each place that
an unkeyed hash is replaced.

^ permalink raw reply

* [PATCH v6 2/5] secure_seq: use SipHash in place of MD5
From: Jason A. Donenfeld @ 2016-12-16  3:03 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld
In-Reply-To: <20161216030328.11602-1-Jason@zx2c4.com>

This gives a clear speed and security improvement. Siphash is both
faster and is more solid crypto than the aging MD5.

Rather than manually filling MD5 buffers, for IPv6, we simply create
a layout by a simple anonymous struct, for which gcc generates
rather efficient code. For IPv4, we pass the values directly to the
short input convenience functions.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Miller <davem@davemloft.net>
Cc: David Laight <David.Laight@aculab.com>
Cc: Tom Herbert <tom@herbertland.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/core/secure_seq.c | 133 ++++++++++++++++++++------------------------------
 1 file changed, 52 insertions(+), 81 deletions(-)

diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
index 88a8e429fc3e..c80583bf3213 100644
--- a/net/core/secure_seq.c
+++ b/net/core/secure_seq.c
@@ -1,3 +1,5 @@
+/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. */
+
 #include <linux/kernel.h>
 #include <linux/init.h>
 #include <linux/cryptohash.h>
@@ -8,14 +10,14 @@
 #include <linux/ktime.h>
 #include <linux/string.h>
 #include <linux/net.h>
-
+#include <linux/siphash.h>
 #include <net/secure_seq.h>
 
 #if IS_ENABLED(CONFIG_IPV6) || IS_ENABLED(CONFIG_INET)
+#include <linux/in6.h>
 #include <net/tcp.h>
-#define NET_SECRET_SIZE (MD5_MESSAGE_BYTES / 4)
 
-static u32 net_secret[NET_SECRET_SIZE] ____cacheline_aligned;
+static siphash_key_t net_secret;
 
 static __always_inline void net_secret_init(void)
 {
@@ -44,44 +46,42 @@ static u32 seq_scale(u32 seq)
 u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
 				 __be16 sport, __be16 dport, u32 *tsoff)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
-	u32 i;
-
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 sport;
+		__be16 dport;
+		u32 padding;
+	} __aligned(SIPHASH_ALIGNMENT) combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.sport = sport,
+		.dport = dport
+	};
+	u64 hash;
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32)daddr[i];
-	secret[4] = net_secret[4] +
-		(((__force u16)sport << 16) + (__force u16)dport);
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	*tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0;
-	return seq_scale(hash[0]);
+	hash = siphash(&combined, sizeof(combined), net_secret);
+	*tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0;
+	return seq_scale(hash);
 }
 EXPORT_SYMBOL(secure_tcpv6_sequence_number);
 
 u32 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr,
 			       __be16 dport)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
-	u32 i;
-
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 dport;
+		u16 padding1;
+		u32 padding2;
+	} __aligned(SIPHASH_ALIGNMENT) combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.dport = dport
+	};
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32) daddr[i];
-	secret[4] = net_secret[4] + (__force u32)dport;
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	return hash[0];
+	return siphash(&combined, sizeof(combined), net_secret);
 }
 EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 #endif
@@ -91,33 +91,17 @@ EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
 			       __be16 sport, __be16 dport, u32 *tsoff)
 {
-	u32 hash[MD5_DIGEST_WORDS];
-
+	u64 hash;
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = ((__force u16)sport << 16) + (__force u16)dport;
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	*tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0;
-	return seq_scale(hash[0]);
+	hash = siphash_4u32(saddr, daddr, sport, dport, net_secret);
+	*tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0;
+	return seq_scale(hash);
 }
 
 u32 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport)
 {
-	u32 hash[MD5_DIGEST_WORDS];
-
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = (__force u32)dport ^ net_secret[14];
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	return hash[0];
+	return siphash_4u32(saddr, daddr, dport, 0, net_secret);
 }
 EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral);
 #endif
@@ -126,21 +110,11 @@ EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral);
 u64 secure_dccp_sequence_number(__be32 saddr, __be32 daddr,
 				__be16 sport, __be16 dport)
 {
-	u32 hash[MD5_DIGEST_WORDS];
 	u64 seq;
-
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = ((__force u16)sport << 16) + (__force u16)dport;
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	seq = hash[0] | (((u64)hash[1]) << 32);
+	seq = siphash_4u32(saddr, daddr, sport, dport, net_secret);
 	seq += ktime_get_real_ns();
 	seq &= (1ull << 48) - 1;
-
 	return seq;
 }
 EXPORT_SYMBOL(secure_dccp_sequence_number);
@@ -149,26 +123,23 @@ EXPORT_SYMBOL(secure_dccp_sequence_number);
 u64 secure_dccpv6_sequence_number(__be32 *saddr, __be32 *daddr,
 				  __be16 sport, __be16 dport)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 sport;
+		__be16 dport;
+		u32 padding;
+	} __aligned(SIPHASH_ALIGNMENT) combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.sport = sport,
+		.dport = dport
+	};
 	u64 seq;
-	u32 i;
-
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32)daddr[i];
-	secret[4] = net_secret[4] +
-		(((__force u16)sport << 16) + (__force u16)dport);
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	seq = hash[0] | (((u64)hash[1]) << 32);
+	seq = siphash(&combined, sizeof(combined), net_secret);
 	seq += ktime_get_real_ns();
 	seq &= (1ull << 48) - 1;
-
 	return seq;
 }
 EXPORT_SYMBOL(secure_dccpv6_sequence_number);
-- 
2.11.0

^ permalink raw reply related

* [PATCH v6 0/5] The SipHash Patchset
From: Jason A. Donenfeld @ 2016-12-16  3:03 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld
In-Reply-To: <20161215203003.31989-1-Jason@zx2c4.com>

Hey again,

This keeps getting more ambitious, which is good I suppose. If the frequency
of new versioned patchsets is too high for LKML and not customary, please let
me know. Otherwise, read on to see what's new this time...

With Hannes' suggestion, there is now only one siphash() function, which will
use the faster aligned version by compile-time constant folding. Additionally,
I now use constant folding to optionally switch to the helper siphash_Nu64
functions that are a bit faster for data of length 8, 16, 24, and 32. So, the
result is that you use siphash(data, len, key) if you have a buffer of sorts,
and then everything is taken care of for you. Or, if you have a series of
integers, you can opt to use siphash_Nu{32,64} functions instead. The basic
API is now complete.

After replacing MD5 in secure sequence number generation and the RNG, it
turned out that md5_transform wasn't used any place else in the tree, so
finally -- this is something to rejoice over -- lib/md5.c has been deleted and
now that function lives as a static function in crypto/md5.c where it belongs.

Meanwhile, it seems that sha_transform is used in places where SipHash would
be more fitting, so the IPv4 and IPv6 syncookies implementation now uses
SipHash, which should speed up TCP performance. Some BSDs already do this.

I'd like to replace sha_transform in addrconf, but that code is a bit gnarley,
so I don't want to be too meddlesome. I'm not entirely convinced either that
SipHash is a good choice for it. But I'm open to discussion here, so if you
have an opinion, please speak up.

If you've been following the evolution of this patchset, and think that
certain patches in it are fine, please do lend me your Reviewed-by to carry
into any subsequent versions, so that in case you disappear your useful
reviews will still keep the ball moving.

Thanks for all the great feedback thus far.

Jason

Jason A. Donenfeld (5):
  siphash: add cryptographically secure PRF
  secure_seq: use SipHash in place of MD5
  random: use SipHash in place of MD5
  md5: remove from lib and only live in crypto
  syncookies: use SipHash in place of SHA1

 MAINTAINERS             |   7 ++
 crypto/md5.c            |  95 +++++++++++++++++++++-
 drivers/char/random.c   |  32 +++-----
 include/linux/siphash.h |  86 ++++++++++++++++++++
 lib/Kconfig.debug       |   6 +-
 lib/Makefile            |   7 +-
 lib/md5.c               |  95 ----------------------
 lib/siphash.c           | 210 ++++++++++++++++++++++++++++++++++++++++++++++++
 lib/test_siphash.c      | 101 +++++++++++++++++++++++
 net/core/secure_seq.c   | 133 ++++++++++++------------------
 net/ipv4/syncookies.c   |  20 +----
 net/ipv6/syncookies.c   |  37 ++++-----
 12 files changed, 590 insertions(+), 239 deletions(-)
 create mode 100644 include/linux/siphash.h
 delete mode 100644 lib/md5.c
 create mode 100644 lib/siphash.c
 create mode 100644 lib/test_siphash.c

-- 
2.11.0

^ permalink raw reply

* [PATCH v6 5/5] syncookies: use SipHash in place of SHA1
From: Jason A. Donenfeld @ 2016-12-16  3:03 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld
In-Reply-To: <20161216030328.11602-1-Jason@zx2c4.com>

SHA1 is slower and less secure than SipHash, and so replacing syncookie
generation with SipHash makes natural sense. Some BSDs have been doing
this for several years in fact.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 net/ipv4/syncookies.c | 20 ++++----------------
 net/ipv6/syncookies.c | 37 ++++++++++++++++---------------------
 2 files changed, 20 insertions(+), 37 deletions(-)

diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 3e88467d70ee..03bb068f8888 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -13,13 +13,13 @@
 #include <linux/tcp.h>
 #include <linux/slab.h>
 #include <linux/random.h>
-#include <linux/cryptohash.h>
+#include <linux/siphash.h>
 #include <linux/kernel.h>
 #include <linux/export.h>
 #include <net/tcp.h>
 #include <net/route.h>
 
-static u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS] __read_mostly;
+static siphash_key_t syncookie_secret[2] __read_mostly;
 
 #define COOKIEBITS 24	/* Upper bits store count */
 #define COOKIEMASK (((__u32)1 << COOKIEBITS) - 1)
@@ -48,24 +48,12 @@ static u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS] __read_mostly;
 #define TSBITS	6
 #define TSMASK	(((__u32)1 << TSBITS) - 1)
 
-static DEFINE_PER_CPU(__u32 [16 + 5 + SHA_WORKSPACE_WORDS], ipv4_cookie_scratch);
-
 static u32 cookie_hash(__be32 saddr, __be32 daddr, __be16 sport, __be16 dport,
 		       u32 count, int c)
 {
-	__u32 *tmp;
-
 	net_get_random_once(syncookie_secret, sizeof(syncookie_secret));
-
-	tmp  = this_cpu_ptr(ipv4_cookie_scratch);
-	memcpy(tmp + 4, syncookie_secret[c], sizeof(syncookie_secret[c]));
-	tmp[0] = (__force u32)saddr;
-	tmp[1] = (__force u32)daddr;
-	tmp[2] = ((__force u32)sport << 16) + (__force u32)dport;
-	tmp[3] = count;
-	sha_transform(tmp + 16, (__u8 *)tmp, tmp + 16 + 5);
-
-	return tmp[17];
+	return siphash_4u32(saddr, daddr, (u32)sport << 16 | dport, count,
+			    syncookie_secret[c]);
 }
 
 
diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
index a4d49760bf43..04d19e89a3e0 100644
--- a/net/ipv6/syncookies.c
+++ b/net/ipv6/syncookies.c
@@ -16,7 +16,7 @@
 
 #include <linux/tcp.h>
 #include <linux/random.h>
-#include <linux/cryptohash.h>
+#include <linux/siphash.h>
 #include <linux/kernel.h>
 #include <net/ipv6.h>
 #include <net/tcp.h>
@@ -24,7 +24,7 @@
 #define COOKIEBITS 24	/* Upper bits store count */
 #define COOKIEMASK (((__u32)1 << COOKIEBITS) - 1)
 
-static u32 syncookie6_secret[2][16-4+SHA_DIGEST_WORDS] __read_mostly;
+static siphash_key_t syncookie6_secret[2] __read_mostly;
 
 /* RFC 2460, Section 8.3:
  * [ipv6 tcp] MSS must be computed as the maximum packet size minus 60 [..]
@@ -41,30 +41,25 @@ static __u16 const msstab[] = {
 	9000 - 60,
 };
 
-static DEFINE_PER_CPU(__u32 [16 + 5 + SHA_WORKSPACE_WORDS], ipv6_cookie_scratch);
-
 static u32 cookie_hash(const struct in6_addr *saddr, const struct in6_addr *daddr,
 		       __be16 sport, __be16 dport, u32 count, int c)
 {
-	__u32 *tmp;
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		u32 count;
+		u16 sport;
+		u16 dport;
+	} __aligned(SIPHASH_ALIGNMENT) combined = {
+		.saddr = *saddr,
+		.daddr = *daddr,
+		.count = count,
+		.sport = sport,
+		.dport = dport
+	};
 
 	net_get_random_once(syncookie6_secret, sizeof(syncookie6_secret));
-
-	tmp  = this_cpu_ptr(ipv6_cookie_scratch);
-
-	/*
-	 * we have 320 bits of information to hash, copy in the remaining
-	 * 192 bits required for sha_transform, from the syncookie6_secret
-	 * and overwrite the digest with the secret
-	 */
-	memcpy(tmp + 10, syncookie6_secret[c], 44);
-	memcpy(tmp, saddr, 16);
-	memcpy(tmp + 4, daddr, 16);
-	tmp[8] = ((__force u32)sport << 16) + (__force u32)dport;
-	tmp[9] = count;
-	sha_transform(tmp + 16, (__u8 *)tmp, tmp + 16 + 5);
-
-	return tmp[17];
+	return siphash(&combined, sizeof(combined), syncookie6_secret[c]);
 }
 
 static __u32 secure_tcp_syn_cookie(const struct in6_addr *saddr,
-- 
2.11.0

^ permalink raw reply related

* [PATCH v6 4/5] md5: remove from lib and only live in crypto
From: Jason A. Donenfeld @ 2016-12-16  3:03 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld
In-Reply-To: <20161216030328.11602-1-Jason@zx2c4.com>

The md5_transform function is no longer used any where in the tree,
except for the crypto api's actual implementation of md5, so we can drop
the function from lib and put it as a static function of the crypto
file, where it belongs. There should be no new users of md5_transform,
anyway, since there are more modern ways of doing what it once achieved.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 crypto/md5.c | 95 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 lib/Makefile |  2 +-
 lib/md5.c    | 95 ------------------------------------------------------------
 3 files changed, 95 insertions(+), 97 deletions(-)
 delete mode 100644 lib/md5.c

diff --git a/crypto/md5.c b/crypto/md5.c
index 2355a7c25c45..f7ae1a48225b 100644
--- a/crypto/md5.c
+++ b/crypto/md5.c
@@ -21,9 +21,11 @@
 #include <linux/module.h>
 #include <linux/string.h>
 #include <linux/types.h>
-#include <linux/cryptohash.h>
 #include <asm/byteorder.h>
 
+#define MD5_DIGEST_WORDS 4
+#define MD5_MESSAGE_BYTES 64
+
 const u8 md5_zero_message_hash[MD5_DIGEST_SIZE] = {
 	0xd4, 0x1d, 0x8c, 0xd9, 0x8f, 0x00, 0xb2, 0x04,
 	0xe9, 0x80, 0x09, 0x98, 0xec, 0xf8, 0x42, 0x7e,
@@ -47,6 +49,97 @@ static inline void cpu_to_le32_array(u32 *buf, unsigned int words)
 	}
 }
 
+#define F1(x, y, z)	(z ^ (x & (y ^ z)))
+#define F2(x, y, z)	F1(z, x, y)
+#define F3(x, y, z)	(x ^ y ^ z)
+#define F4(x, y, z)	(y ^ (x | ~z))
+
+#define MD5STEP(f, w, x, y, z, in, s) \
+	(w += f(x, y, z) + in, w = (w<<s | w>>(32-s)) + x)
+
+static void md5_transform(__u32 *hash, __u32 const *in)
+{
+	u32 a, b, c, d;
+
+	a = hash[0];
+	b = hash[1];
+	c = hash[2];
+	d = hash[3];
+
+	MD5STEP(F1, a, b, c, d, in[0] + 0xd76aa478, 7);
+	MD5STEP(F1, d, a, b, c, in[1] + 0xe8c7b756, 12);
+	MD5STEP(F1, c, d, a, b, in[2] + 0x242070db, 17);
+	MD5STEP(F1, b, c, d, a, in[3] + 0xc1bdceee, 22);
+	MD5STEP(F1, a, b, c, d, in[4] + 0xf57c0faf, 7);
+	MD5STEP(F1, d, a, b, c, in[5] + 0x4787c62a, 12);
+	MD5STEP(F1, c, d, a, b, in[6] + 0xa8304613, 17);
+	MD5STEP(F1, b, c, d, a, in[7] + 0xfd469501, 22);
+	MD5STEP(F1, a, b, c, d, in[8] + 0x698098d8, 7);
+	MD5STEP(F1, d, a, b, c, in[9] + 0x8b44f7af, 12);
+	MD5STEP(F1, c, d, a, b, in[10] + 0xffff5bb1, 17);
+	MD5STEP(F1, b, c, d, a, in[11] + 0x895cd7be, 22);
+	MD5STEP(F1, a, b, c, d, in[12] + 0x6b901122, 7);
+	MD5STEP(F1, d, a, b, c, in[13] + 0xfd987193, 12);
+	MD5STEP(F1, c, d, a, b, in[14] + 0xa679438e, 17);
+	MD5STEP(F1, b, c, d, a, in[15] + 0x49b40821, 22);
+
+	MD5STEP(F2, a, b, c, d, in[1] + 0xf61e2562, 5);
+	MD5STEP(F2, d, a, b, c, in[6] + 0xc040b340, 9);
+	MD5STEP(F2, c, d, a, b, in[11] + 0x265e5a51, 14);
+	MD5STEP(F2, b, c, d, a, in[0] + 0xe9b6c7aa, 20);
+	MD5STEP(F2, a, b, c, d, in[5] + 0xd62f105d, 5);
+	MD5STEP(F2, d, a, b, c, in[10] + 0x02441453, 9);
+	MD5STEP(F2, c, d, a, b, in[15] + 0xd8a1e681, 14);
+	MD5STEP(F2, b, c, d, a, in[4] + 0xe7d3fbc8, 20);
+	MD5STEP(F2, a, b, c, d, in[9] + 0x21e1cde6, 5);
+	MD5STEP(F2, d, a, b, c, in[14] + 0xc33707d6, 9);
+	MD5STEP(F2, c, d, a, b, in[3] + 0xf4d50d87, 14);
+	MD5STEP(F2, b, c, d, a, in[8] + 0x455a14ed, 20);
+	MD5STEP(F2, a, b, c, d, in[13] + 0xa9e3e905, 5);
+	MD5STEP(F2, d, a, b, c, in[2] + 0xfcefa3f8, 9);
+	MD5STEP(F2, c, d, a, b, in[7] + 0x676f02d9, 14);
+	MD5STEP(F2, b, c, d, a, in[12] + 0x8d2a4c8a, 20);
+
+	MD5STEP(F3, a, b, c, d, in[5] + 0xfffa3942, 4);
+	MD5STEP(F3, d, a, b, c, in[8] + 0x8771f681, 11);
+	MD5STEP(F3, c, d, a, b, in[11] + 0x6d9d6122, 16);
+	MD5STEP(F3, b, c, d, a, in[14] + 0xfde5380c, 23);
+	MD5STEP(F3, a, b, c, d, in[1] + 0xa4beea44, 4);
+	MD5STEP(F3, d, a, b, c, in[4] + 0x4bdecfa9, 11);
+	MD5STEP(F3, c, d, a, b, in[7] + 0xf6bb4b60, 16);
+	MD5STEP(F3, b, c, d, a, in[10] + 0xbebfbc70, 23);
+	MD5STEP(F3, a, b, c, d, in[13] + 0x289b7ec6, 4);
+	MD5STEP(F3, d, a, b, c, in[0] + 0xeaa127fa, 11);
+	MD5STEP(F3, c, d, a, b, in[3] + 0xd4ef3085, 16);
+	MD5STEP(F3, b, c, d, a, in[6] + 0x04881d05, 23);
+	MD5STEP(F3, a, b, c, d, in[9] + 0xd9d4d039, 4);
+	MD5STEP(F3, d, a, b, c, in[12] + 0xe6db99e5, 11);
+	MD5STEP(F3, c, d, a, b, in[15] + 0x1fa27cf8, 16);
+	MD5STEP(F3, b, c, d, a, in[2] + 0xc4ac5665, 23);
+
+	MD5STEP(F4, a, b, c, d, in[0] + 0xf4292244, 6);
+	MD5STEP(F4, d, a, b, c, in[7] + 0x432aff97, 10);
+	MD5STEP(F4, c, d, a, b, in[14] + 0xab9423a7, 15);
+	MD5STEP(F4, b, c, d, a, in[5] + 0xfc93a039, 21);
+	MD5STEP(F4, a, b, c, d, in[12] + 0x655b59c3, 6);
+	MD5STEP(F4, d, a, b, c, in[3] + 0x8f0ccc92, 10);
+	MD5STEP(F4, c, d, a, b, in[10] + 0xffeff47d, 15);
+	MD5STEP(F4, b, c, d, a, in[1] + 0x85845dd1, 21);
+	MD5STEP(F4, a, b, c, d, in[8] + 0x6fa87e4f, 6);
+	MD5STEP(F4, d, a, b, c, in[15] + 0xfe2ce6e0, 10);
+	MD5STEP(F4, c, d, a, b, in[6] + 0xa3014314, 15);
+	MD5STEP(F4, b, c, d, a, in[13] + 0x4e0811a1, 21);
+	MD5STEP(F4, a, b, c, d, in[4] + 0xf7537e82, 6);
+	MD5STEP(F4, d, a, b, c, in[11] + 0xbd3af235, 10);
+	MD5STEP(F4, c, d, a, b, in[2] + 0x2ad7d2bb, 15);
+	MD5STEP(F4, b, c, d, a, in[9] + 0xeb86d391, 21);
+
+	hash[0] += a;
+	hash[1] += b;
+	hash[2] += c;
+	hash[3] += d;
+}
+
 static inline void md5_transform_helper(struct md5_state *ctx)
 {
 	le32_to_cpu_array(ctx->block, sizeof(ctx->block) / sizeof(u32));
diff --git a/lib/Makefile b/lib/Makefile
index 71d398b04a74..1079152607e0 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -19,7 +19,7 @@ KCOV_INSTRUMENT_dynamic_debug.o := n
 lib-y := ctype.o string.o vsprintf.o cmdline.o \
 	 rbtree.o radix-tree.o dump_stack.o timerqueue.o\
 	 idr.o int_sqrt.o extable.o \
-	 sha1.o chacha20.o md5.o irq_regs.o argv_split.o \
+	 sha1.o chacha20.o irq_regs.o argv_split.o \
 	 flex_proportions.o ratelimit.o show_mem.o \
 	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
 	 earlycpio.o seq_buf.o siphash.o \
diff --git a/lib/md5.c b/lib/md5.c
deleted file mode 100644
index bb0cd01d356d..000000000000
--- a/lib/md5.c
+++ /dev/null
@@ -1,95 +0,0 @@
-#include <linux/compiler.h>
-#include <linux/export.h>
-#include <linux/cryptohash.h>
-
-#define F1(x, y, z)	(z ^ (x & (y ^ z)))
-#define F2(x, y, z)	F1(z, x, y)
-#define F3(x, y, z)	(x ^ y ^ z)
-#define F4(x, y, z)	(y ^ (x | ~z))
-
-#define MD5STEP(f, w, x, y, z, in, s) \
-	(w += f(x, y, z) + in, w = (w<<s | w>>(32-s)) + x)
-
-void md5_transform(__u32 *hash, __u32 const *in)
-{
-	u32 a, b, c, d;
-
-	a = hash[0];
-	b = hash[1];
-	c = hash[2];
-	d = hash[3];
-
-	MD5STEP(F1, a, b, c, d, in[0] + 0xd76aa478, 7);
-	MD5STEP(F1, d, a, b, c, in[1] + 0xe8c7b756, 12);
-	MD5STEP(F1, c, d, a, b, in[2] + 0x242070db, 17);
-	MD5STEP(F1, b, c, d, a, in[3] + 0xc1bdceee, 22);
-	MD5STEP(F1, a, b, c, d, in[4] + 0xf57c0faf, 7);
-	MD5STEP(F1, d, a, b, c, in[5] + 0x4787c62a, 12);
-	MD5STEP(F1, c, d, a, b, in[6] + 0xa8304613, 17);
-	MD5STEP(F1, b, c, d, a, in[7] + 0xfd469501, 22);
-	MD5STEP(F1, a, b, c, d, in[8] + 0x698098d8, 7);
-	MD5STEP(F1, d, a, b, c, in[9] + 0x8b44f7af, 12);
-	MD5STEP(F1, c, d, a, b, in[10] + 0xffff5bb1, 17);
-	MD5STEP(F1, b, c, d, a, in[11] + 0x895cd7be, 22);
-	MD5STEP(F1, a, b, c, d, in[12] + 0x6b901122, 7);
-	MD5STEP(F1, d, a, b, c, in[13] + 0xfd987193, 12);
-	MD5STEP(F1, c, d, a, b, in[14] + 0xa679438e, 17);
-	MD5STEP(F1, b, c, d, a, in[15] + 0x49b40821, 22);
-
-	MD5STEP(F2, a, b, c, d, in[1] + 0xf61e2562, 5);
-	MD5STEP(F2, d, a, b, c, in[6] + 0xc040b340, 9);
-	MD5STEP(F2, c, d, a, b, in[11] + 0x265e5a51, 14);
-	MD5STEP(F2, b, c, d, a, in[0] + 0xe9b6c7aa, 20);
-	MD5STEP(F2, a, b, c, d, in[5] + 0xd62f105d, 5);
-	MD5STEP(F2, d, a, b, c, in[10] + 0x02441453, 9);
-	MD5STEP(F2, c, d, a, b, in[15] + 0xd8a1e681, 14);
-	MD5STEP(F2, b, c, d, a, in[4] + 0xe7d3fbc8, 20);
-	MD5STEP(F2, a, b, c, d, in[9] + 0x21e1cde6, 5);
-	MD5STEP(F2, d, a, b, c, in[14] + 0xc33707d6, 9);
-	MD5STEP(F2, c, d, a, b, in[3] + 0xf4d50d87, 14);
-	MD5STEP(F2, b, c, d, a, in[8] + 0x455a14ed, 20);
-	MD5STEP(F2, a, b, c, d, in[13] + 0xa9e3e905, 5);
-	MD5STEP(F2, d, a, b, c, in[2] + 0xfcefa3f8, 9);
-	MD5STEP(F2, c, d, a, b, in[7] + 0x676f02d9, 14);
-	MD5STEP(F2, b, c, d, a, in[12] + 0x8d2a4c8a, 20);
-
-	MD5STEP(F3, a, b, c, d, in[5] + 0xfffa3942, 4);
-	MD5STEP(F3, d, a, b, c, in[8] + 0x8771f681, 11);
-	MD5STEP(F3, c, d, a, b, in[11] + 0x6d9d6122, 16);
-	MD5STEP(F3, b, c, d, a, in[14] + 0xfde5380c, 23);
-	MD5STEP(F3, a, b, c, d, in[1] + 0xa4beea44, 4);
-	MD5STEP(F3, d, a, b, c, in[4] + 0x4bdecfa9, 11);
-	MD5STEP(F3, c, d, a, b, in[7] + 0xf6bb4b60, 16);
-	MD5STEP(F3, b, c, d, a, in[10] + 0xbebfbc70, 23);
-	MD5STEP(F3, a, b, c, d, in[13] + 0x289b7ec6, 4);
-	MD5STEP(F3, d, a, b, c, in[0] + 0xeaa127fa, 11);
-	MD5STEP(F3, c, d, a, b, in[3] + 0xd4ef3085, 16);
-	MD5STEP(F3, b, c, d, a, in[6] + 0x04881d05, 23);
-	MD5STEP(F3, a, b, c, d, in[9] + 0xd9d4d039, 4);
-	MD5STEP(F3, d, a, b, c, in[12] + 0xe6db99e5, 11);
-	MD5STEP(F3, c, d, a, b, in[15] + 0x1fa27cf8, 16);
-	MD5STEP(F3, b, c, d, a, in[2] + 0xc4ac5665, 23);
-
-	MD5STEP(F4, a, b, c, d, in[0] + 0xf4292244, 6);
-	MD5STEP(F4, d, a, b, c, in[7] + 0x432aff97, 10);
-	MD5STEP(F4, c, d, a, b, in[14] + 0xab9423a7, 15);
-	MD5STEP(F4, b, c, d, a, in[5] + 0xfc93a039, 21);
-	MD5STEP(F4, a, b, c, d, in[12] + 0x655b59c3, 6);
-	MD5STEP(F4, d, a, b, c, in[3] + 0x8f0ccc92, 10);
-	MD5STEP(F4, c, d, a, b, in[10] + 0xffeff47d, 15);
-	MD5STEP(F4, b, c, d, a, in[1] + 0x85845dd1, 21);
-	MD5STEP(F4, a, b, c, d, in[8] + 0x6fa87e4f, 6);
-	MD5STEP(F4, d, a, b, c, in[15] + 0xfe2ce6e0, 10);
-	MD5STEP(F4, c, d, a, b, in[6] + 0xa3014314, 15);
-	MD5STEP(F4, b, c, d, a, in[13] + 0x4e0811a1, 21);
-	MD5STEP(F4, a, b, c, d, in[4] + 0xf7537e82, 6);
-	MD5STEP(F4, d, a, b, c, in[11] + 0xbd3af235, 10);
-	MD5STEP(F4, c, d, a, b, in[2] + 0x2ad7d2bb, 15);
-	MD5STEP(F4, b, c, d, a, in[9] + 0xeb86d391, 21);
-
-	hash[0] += a;
-	hash[1] += b;
-	hash[2] += c;
-	hash[3] += d;
-}
-EXPORT_SYMBOL(md5_transform);
-- 
2.11.0

^ permalink raw reply related

* [PATCH v6 3/5] random: use SipHash in place of MD5
From: Jason A. Donenfeld @ 2016-12-16  3:03 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld, Jean-Philippe Aumasson
In-Reply-To: <20161216030328.11602-1-Jason@zx2c4.com>

This duplicates the current algorithm for get_random_int/long, but uses
siphash instead. This comes with several benefits. It's certainly
faster and more cryptographically secure than MD5. This patch also
separates hashed fields into three values instead of one, in order to
increase diffusion.

The previous MD5 algorithm used a per-cpu MD5 state, which caused
successive calls to the function to chain upon each other. While it's
not entirely clear that this kind of chaining is absolutely necessary
when using a secure PRF like siphash, it can't hurt, and the timing of
the call chain does add a degree of natural entropy. So, in keeping with
this design, instead of the massive per-cpu 64-byte MD5 state, there is
instead a per-cpu previously returned value for chaining.

The speed benefits are substantial:

                | siphash | md5    | speedup |
		------------------------------
get_random_long | 137130  | 415983 | 3.03x   |
get_random_int  | 86384   | 343323 | 3.97x   |

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Ted Tso <tytso@mit.edu>
---
 drivers/char/random.c | 32 +++++++++++++-------------------
 1 file changed, 13 insertions(+), 19 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index d6876d506220..a51f0ff43f00 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -262,6 +262,7 @@
 #include <linux/syscalls.h>
 #include <linux/completion.h>
 #include <linux/uuid.h>
+#include <linux/siphash.h>
 #include <crypto/chacha20.h>
 
 #include <asm/processor.h>
@@ -2042,7 +2043,7 @@ struct ctl_table random_table[] = {
 };
 #endif 	/* CONFIG_SYSCTL */
 
-static u32 random_int_secret[MD5_MESSAGE_BYTES / 4] ____cacheline_aligned;
+static siphash_key_t random_int_secret;
 
 int random_int_secret_init(void)
 {
@@ -2050,8 +2051,7 @@ int random_int_secret_init(void)
 	return 0;
 }
 
-static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash)
-		__aligned(sizeof(unsigned long));
+static DEFINE_PER_CPU(u64, get_random_int_chaining);
 
 /*
  * Get a random word for internal kernel use only. Similar to urandom but
@@ -2061,19 +2061,16 @@ static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash)
  */
 unsigned int get_random_int(void)
 {
-	__u32 *hash;
 	unsigned int ret;
+	u64 *chaining;
 
 	if (arch_get_random_int(&ret))
 		return ret;
 
-	hash = get_cpu_var(get_random_int_hash);
-
-	hash[0] += current->pid + jiffies + random_get_entropy();
-	md5_transform(hash, random_int_secret);
-	ret = hash[0];
-	put_cpu_var(get_random_int_hash);
-
+	chaining = &get_cpu_var(get_random_int_chaining);
+	ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() +
+				       current->pid, random_int_secret);
+	put_cpu_var(get_random_int_chaining);
 	return ret;
 }
 EXPORT_SYMBOL(get_random_int);
@@ -2083,19 +2080,16 @@ EXPORT_SYMBOL(get_random_int);
  */
 unsigned long get_random_long(void)
 {
-	__u32 *hash;
 	unsigned long ret;
+	u64 *chaining;
 
 	if (arch_get_random_long(&ret))
 		return ret;
 
-	hash = get_cpu_var(get_random_int_hash);
-
-	hash[0] += current->pid + jiffies + random_get_entropy();
-	md5_transform(hash, random_int_secret);
-	ret = *(unsigned long *)hash;
-	put_cpu_var(get_random_int_hash);
-
+	chaining = &get_cpu_var(get_random_int_chaining);
+	ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() +
+				       current->pid, random_int_secret);
+	put_cpu_var(get_random_int_chaining);
 	return ret;
 }
 EXPORT_SYMBOL(get_random_long);
-- 
2.11.0

^ permalink raw reply related

* [PATCH v6 1/5] siphash: add cryptographically secure PRF
From: Jason A. Donenfeld @ 2016-12-16  3:03 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld, Jean-Philippe Aumasson
In-Reply-To: <20161216030328.11602-1-Jason@zx2c4.com>

SipHash is a 64-bit keyed hash function that is actually a
cryptographically secure PRF, like HMAC. Except SipHash is super fast,
and is meant to be used as a hashtable keyed lookup function, or as a
general PRF for short input use cases, such as sequence numbers or RNG
chaining.

For the first usage:

There are a variety of attacks known as "hashtable poisoning" in which an
attacker forms some data such that the hash of that data will be the
same, and then preceeds to fill up all entries of a hashbucket. This is
a realistic and well-known denial-of-service vector. Currently
hashtables use jhash, which is fast but not secure, and some kind of
rotating key scheme (or none at all, which isn't good). SipHash is meant
as a replacement for jhash in these cases.

There are a modicum of places in the kernel that are vulnerable to
hashtable poisoning attacks, either via userspace vectors or network
vectors, and there's not a reliable mechanism inside the kernel at the
moment to fix it. The first step toward fixing these issues is actually
getting a secure primitive into the kernel for developers to use. Then
we can, bit by bit, port things over to it as deemed appropriate.

While SipHash is extremely fast for a cryptographically secure function,
it is likely a bit slower than the insecure jhash, and so replacements
will be evaluated on a case-by-case basis based on whether or not the
difference in speed is negligible and whether or not the current jhash usage
poses a real security risk.

For the second usage:

A few places in the kernel are using MD5 or SHA1 for creating secure
sequence numbers, syn cookies, port numbers, or fast random numbers.
SipHash is a faster and more fitting, and more secure replacement for MD5
in those situations. Replacing MD5 and SHA1 with SipHash for these uses is
obvious and straight-forward, and so is submitted along with this patch
series. There shouldn't be much of a debate over its efficacy.

Dozens of languages are already using this internally for their hash
tables and PRFs. Some of the BSDs already use this in their kernels.
SipHash is a widely known high-speed solution to a widely known set of
problems, and it's time we catch-up.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: David Laight <David.Laight@aculab.com>
---
 MAINTAINERS             |   7 ++
 include/linux/siphash.h |  86 ++++++++++++++++++++
 lib/Kconfig.debug       |   6 +-
 lib/Makefile            |   5 +-
 lib/siphash.c           | 210 ++++++++++++++++++++++++++++++++++++++++++++++++
 lib/test_siphash.c      | 101 +++++++++++++++++++++++
 6 files changed, 410 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/siphash.h
 create mode 100644 lib/siphash.c
 create mode 100644 lib/test_siphash.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 59c9895d73d5..5d87a8c1056a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11231,6 +11231,13 @@ F:	arch/arm/mach-s3c24xx/mach-bast.c
 F:	arch/arm/mach-s3c24xx/bast-ide.c
 F:	arch/arm/mach-s3c24xx/bast-irq.c
 
+SIPHASH PRF ROUTINES
+M:	Jason A. Donenfeld <Jason@zx2c4.com>
+S:	Maintained
+F:	lib/siphash.c
+F:	lib/test_siphash.c
+F:	include/linux/siphash.h
+
 TI DAVINCI MACHINE SUPPORT
 M:	Sekhar Nori <nsekhar@ti.com>
 M:	Kevin Hilman <khilman@kernel.org>
diff --git a/include/linux/siphash.h b/include/linux/siphash.h
new file mode 100644
index 000000000000..e82fce48a0f6
--- /dev/null
+++ b/include/linux/siphash.h
@@ -0,0 +1,86 @@
+/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ *
+ * This implementation is specifically for SipHash2-4.
+ */
+
+#ifndef _LINUX_SIPHASH_H
+#define _LINUX_SIPHASH_H
+
+#include <linux/types.h>
+#include <linux/kernel.h>
+
+#define SIPHASH_ALIGNMENT 8
+typedef u64 siphash_key_t[2];
+
+u64 __siphash_aligned(const void *data, size_t len, const siphash_key_t key);
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+u64 __siphash_unaligned(const void *data, size_t len, const siphash_key_t key);
+#endif
+
+u64 siphash_1u64(const u64 a, const siphash_key_t key);
+u64 siphash_2u64(const u64 a, const u64 b, const siphash_key_t key);
+u64 siphash_3u64(const u64 a, const u64 b, const u64 c,
+		 const siphash_key_t key);
+u64 siphash_4u64(const u64 a, const u64 b, const u64 c, const u64 d,
+		 const siphash_key_t key);
+
+static inline u64 ___siphash_aligned(const u64 *data, size_t len, const siphash_key_t key)
+{
+	if (__builtin_constant_p(len) && len == 8)
+		return siphash_1u64(data[0], key);
+	if (__builtin_constant_p(len) && len == 16)
+		return siphash_2u64(data[0], data[1], key);
+	if (__builtin_constant_p(len) && len == 24)
+		return siphash_3u64(data[0], data[1], data[2], key);
+	if (__builtin_constant_p(len) && len == 32)
+		return siphash_4u64(data[0], data[1], data[2], data[3], key);
+	return __siphash_aligned(data, len, key);
+}
+
+/**
+ * siphash - compute 64-bit siphash PRF value
+ * @data: buffer to hash
+ * @size: size of @data
+ * @key: the siphash key
+ */
+static inline u64 siphash(const void *data, size_t len, const siphash_key_t key)
+{
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+	if (!IS_ALIGNED((unsigned long)data, SIPHASH_ALIGNMENT))
+		return __siphash_unaligned(data, len, key);
+#endif
+	return ___siphash_aligned(data, len, key);
+}
+
+static inline u64 siphash_2u32(const u32 a, const u32 b, const siphash_key_t key)
+{
+	return siphash_1u64((u64)b << 32 | a, key);
+}
+
+static inline u64 siphash_4u32(const u32 a, const u32 b, const u32 c, const u32 d,
+			       const siphash_key_t key)
+{
+	return siphash_2u64((u64)b << 32 | a, (u64)d << 32 | c, key);
+}
+
+static inline u64 siphash_6u32(const u32 a, const u32 b, const u32 c, const u32 d,
+			       const u32 e, const u32 f, const siphash_key_t key)
+{
+	return siphash_3u64((u64)b << 32 | a, (u64)d << 32 | c, (u64)f << 32 | e,
+			    key);
+}
+
+static inline u64 siphash_8u32(const u32 a, const u32 b, const u32 c, const u32 d,
+			       const u32 e, const u32 f, const u32 g, const u32 h,
+			       const siphash_key_t key)
+{
+	return siphash_4u64((u64)b << 32 | a, (u64)d << 32 | c, (u64)f << 32 | e,
+			    (u64)h << 32 | g, key);
+}
+
+#endif /* _LINUX_SIPHASH_H */
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 7446097f72bd..86254ea99b45 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1843,9 +1843,9 @@ config TEST_HASH
 	tristate "Perform selftest on hash functions"
 	default n
 	help
-	  Enable this option to test the kernel's integer (<linux/hash,h>)
-	  and string (<linux/stringhash.h>) hash functions on boot
-	  (or module load).
+	  Enable this option to test the kernel's integer (<linux/hash.h>),
+	  string (<linux/stringhash.h>), and siphash (<linux/siphash.h>)
+	  hash functions on boot (or module load).
 
 	  This is intended to help people writing architecture-specific
 	  optimized versions.  If unsure, say N.
diff --git a/lib/Makefile b/lib/Makefile
index 50144a3aeebd..71d398b04a74 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -22,7 +22,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
 	 sha1.o chacha20.o md5.o irq_regs.o argv_split.o \
 	 flex_proportions.o ratelimit.o show_mem.o \
 	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
-	 earlycpio.o seq_buf.o nmi_backtrace.o nodemask.o win_minmax.o
+	 earlycpio.o seq_buf.o siphash.o \
+	 nmi_backtrace.o nodemask.o win_minmax.o
 
 lib-$(CONFIG_MMU) += ioremap.o
 lib-$(CONFIG_SMP) += cpumask.o
@@ -44,7 +45,7 @@ obj-$(CONFIG_TEST_HEXDUMP) += test_hexdump.o
 obj-y += kstrtox.o
 obj-$(CONFIG_TEST_BPF) += test_bpf.o
 obj-$(CONFIG_TEST_FIRMWARE) += test_firmware.o
-obj-$(CONFIG_TEST_HASH) += test_hash.o
+obj-$(CONFIG_TEST_HASH) += test_hash.o test_siphash.o
 obj-$(CONFIG_TEST_KASAN) += test_kasan.o
 obj-$(CONFIG_TEST_KSTRTOX) += test-kstrtox.o
 obj-$(CONFIG_TEST_LKM) += test_module.o
diff --git a/lib/siphash.c b/lib/siphash.c
new file mode 100644
index 000000000000..7efc273de5d0
--- /dev/null
+++ b/lib/siphash.c
@@ -0,0 +1,210 @@
+/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ *
+ * This implementation is specifically for SipHash2-4.
+ */
+
+#include <linux/siphash.h>
+#include <asm/unaligned.h>
+
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+#include <linux/dcache.h>
+#include <asm/word-at-a-time.h>
+#endif
+
+#define SIPROUND \
+	do { \
+	v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \
+	v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \
+	v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \
+	v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \
+	} while(0)
+
+#define PREAMBLE(len) \
+	u64 v0 = 0x736f6d6570736575ULL; \
+	u64 v1 = 0x646f72616e646f6dULL; \
+	u64 v2 = 0x6c7967656e657261ULL; \
+	u64 v3 = 0x7465646279746573ULL; \
+	u64 b = ((u64)len) << 56; \
+	v3 ^= key[1]; \
+	v2 ^= key[0]; \
+	v1 ^= key[1]; \
+	v0 ^= key[0];
+
+#define POSTAMBLE \
+	v3 ^= b; \
+	SIPROUND; \
+	SIPROUND; \
+	v0 ^= b; \
+	v2 ^= 0xff; \
+	SIPROUND; \
+	SIPROUND; \
+	SIPROUND; \
+	SIPROUND; \
+	return (v0 ^ v1) ^ (v2 ^ v3);
+
+u64 __siphash_aligned(const void *data, size_t len, const siphash_key_t key)
+{
+	const u8 *end = data + len - (len % sizeof(u64));
+	const u8 left = len & (sizeof(u64) - 1);
+	u64 m;
+	PREAMBLE(len)
+	for (; data != end; data += sizeof(u64)) {
+		m = le64_to_cpup(data);
+		v3 ^= m;
+		SIPROUND;
+		SIPROUND;
+		v0 ^= m;
+	}
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+	if (left)
+		b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) &
+						  bytemask_from_count(left)));
+#else
+	switch (left) {
+	case 7: b |= ((u64)end[6]) << 48;
+	case 6: b |= ((u64)end[5]) << 40;
+	case 5: b |= ((u64)end[4]) << 32;
+	case 4: b |= le32_to_cpup(data); break;
+	case 3: b |= ((u64)end[2]) << 16;
+	case 2: b |= le16_to_cpup(data); break;
+	case 1: b |= end[0];
+	}
+#endif
+	POSTAMBLE
+}
+EXPORT_SYMBOL(__siphash_aligned);
+
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+u64 __siphash_unaligned(const void *data, size_t len, const siphash_key_t key)
+{
+	const u8 *end = data + len - (len % sizeof(u64));
+	const u8 left = len & (sizeof(u64) - 1);
+	u64 m;
+	PREAMBLE(len)
+	for (; data != end; data += sizeof(u64)) {
+		m = get_unaligned_le64(data);
+		v3 ^= m;
+		SIPROUND;
+		SIPROUND;
+		v0 ^= m;
+	}
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+	if (left)
+		b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) &
+						  bytemask_from_count(left)));
+#else
+	switch (left) {
+	case 7: b |= ((u64)end[6]) << 48;
+	case 6: b |= ((u64)end[5]) << 40;
+	case 5: b |= ((u64)end[4]) << 32;
+	case 4: b |= get_unaligned_le32(end); break;
+	case 3: b |= ((u64)end[2]) << 16;
+	case 2: b |= get_unaligned_le16(end); break;
+	case 1: b |= end[0];
+	}
+#endif
+	POSTAMBLE
+}
+EXPORT_SYMBOL(__siphash_unaligned);
+#endif
+
+/**
+ * siphash_1u64 - compute 64-bit siphash PRF value of a u64
+ * @first: first u64
+ * @key: the siphash key
+ */
+u64 siphash_1u64(const u64 first, const siphash_key_t key)
+{
+	PREAMBLE(8)
+	v3 ^= first;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= first;
+	POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_1u64);
+
+/**
+ * siphash_2u64 - compute 64-bit siphash PRF value of 2 u64
+ * @first: first u64
+ * @second: second u64
+ * @key: the siphash key
+ */
+u64 siphash_2u64(const u64 first, const u64 second, const siphash_key_t key)
+{
+	PREAMBLE(16)
+	v3 ^= first;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= first;
+	v3 ^= second;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= second;
+	POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_2u64);
+
+/**
+ * siphash_3u64 - compute 64-bit siphash PRF value of 3 u64
+ * @first: first u64
+ * @second: second u64
+ * @third: third u64
+ * @key: the siphash key
+ */
+u64 siphash_3u64(const u64 first, const u64 second, const u64 third,
+		 const siphash_key_t key)
+{
+	PREAMBLE(24)
+	v3 ^= first;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= first;
+	v3 ^= second;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= second;
+	v3 ^= third;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= third;
+	POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_3u64);
+
+/**
+ * siphash_4u64 - compute 64-bit siphash PRF value of 4 u64
+ * @first: first u64
+ * @second: second u64
+ * @third: third u64
+ * @forth: forth u64
+ * @key: the siphash key
+ */
+u64 siphash_4u64(const u64 first, const u64 second, const u64 third,
+		 const u64 forth, const siphash_key_t key)
+{
+	PREAMBLE(32)
+	v3 ^= first;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= first;
+	v3 ^= second;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= second;
+	v3 ^= third;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= third;
+	v3 ^= forth;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= forth;
+	POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_4u64);
diff --git a/lib/test_siphash.c b/lib/test_siphash.c
new file mode 100644
index 000000000000..906e58a2c946
--- /dev/null
+++ b/lib/test_siphash.c
@@ -0,0 +1,101 @@
+/* Test cases for siphash.c
+ *
+ * Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ *
+ * This implementation is specifically for SipHash2-4.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/siphash.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/errno.h>
+#include <linux/module.h>
+
+/* Test vectors taken from official reference source available at:
+ *     https://131002.net/siphash/siphash24.c
+ */
+static const u64 test_vectors[64] = {
+	0x726fdb47dd0e0e31ULL, 0x74f839c593dc67fdULL, 0x0d6c8009d9a94f5aULL,
+	0x85676696d7fb7e2dULL, 0xcf2794e0277187b7ULL, 0x18765564cd99a68dULL,
+	0xcbc9466e58fee3ceULL, 0xab0200f58b01d137ULL, 0x93f5f5799a932462ULL,
+	0x9e0082df0ba9e4b0ULL, 0x7a5dbbc594ddb9f3ULL, 0xf4b32f46226bada7ULL,
+	0x751e8fbc860ee5fbULL, 0x14ea5627c0843d90ULL, 0xf723ca908e7af2eeULL,
+	0xa129ca6149be45e5ULL, 0x3f2acc7f57c29bdbULL, 0x699ae9f52cbe4794ULL,
+	0x4bc1b3f0968dd39cULL, 0xbb6dc91da77961bdULL, 0xbed65cf21aa2ee98ULL,
+	0xd0f2cbb02e3b67c7ULL, 0x93536795e3a33e88ULL, 0xa80c038ccd5ccec8ULL,
+	0xb8ad50c6f649af94ULL, 0xbce192de8a85b8eaULL, 0x17d835b85bbb15f3ULL,
+	0x2f2e6163076bcfadULL, 0xde4daaaca71dc9a5ULL, 0xa6a2506687956571ULL,
+	0xad87a3535c49ef28ULL, 0x32d892fad841c342ULL, 0x7127512f72f27cceULL,
+	0xa7f32346f95978e3ULL, 0x12e0b01abb051238ULL, 0x15e034d40fa197aeULL,
+	0x314dffbe0815a3b4ULL, 0x027990f029623981ULL, 0xcadcd4e59ef40c4dULL,
+	0x9abfd8766a33735cULL, 0x0e3ea96b5304a7d0ULL, 0xad0c42d6fc585992ULL,
+	0x187306c89bc215a9ULL, 0xd4a60abcf3792b95ULL, 0xf935451de4f21df2ULL,
+	0xa9538f0419755787ULL, 0xdb9acddff56ca510ULL, 0xd06c98cd5c0975ebULL,
+	0xe612a3cb9ecba951ULL, 0xc766e62cfcadaf96ULL, 0xee64435a9752fe72ULL,
+	0xa192d576b245165aULL, 0x0a8787bf8ecb74b2ULL, 0x81b3e73d20b49b6fULL,
+	0x7fa8220ba3b2eceaULL, 0x245731c13ca42499ULL, 0xb78dbfaf3a8d83bdULL,
+	0xea1ad565322a1a0bULL, 0x60e61c23a3795013ULL, 0x6606d7e446282b93ULL,
+	0x6ca4ecb15c5f91e1ULL, 0x9f626da15c9625f3ULL, 0xe51b38608ef25f57ULL,
+	0x958a324ceb064572ULL
+};
+static const siphash_key_t test_key =
+	{ 0x0706050403020100ULL , 0x0f0e0d0c0b0a0908ULL };
+
+static int __init siphash_test_init(void)
+{
+	u8 in[64] __aligned(SIPHASH_ALIGNMENT);
+	u8 in_unaligned[65];
+	u8 i;
+	int ret = 0;
+
+	for (i = 0; i < 64; ++i) {
+		in[i] = i;
+		in_unaligned[i + 1] = i;
+		if (siphash(in, i, test_key) != test_vectors[i]) {
+			pr_info("self-test aligned %u: FAIL\n", i + 1);
+			ret = -EINVAL;
+		}
+		if (siphash(in_unaligned + 1, i, test_key) != test_vectors[i]) {
+			pr_info("self-test unaligned %u: FAIL\n", i + 1);
+			ret = -EINVAL;
+		}
+	}
+	if (siphash_1u64(0x0706050403020100ULL, test_key) != test_vectors[8]) {
+		pr_info("self-test 1u64: FAIL\n");
+		ret = -EINVAL;
+	}
+	if (siphash_2u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL, test_key) != test_vectors[16]) {
+		pr_info("self-test 2u64: FAIL\n");
+		ret = -EINVAL;
+	}
+	if (siphash_3u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL,
+			 0x1716151413121110ULL, test_key) != test_vectors[24]) {
+		pr_info("self-test 3u64: FAIL\n");
+		ret = -EINVAL;
+	}
+	if (siphash_4u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL,
+			 0x1716151413121110ULL, 0x1f1e1d1c1b1a1918ULL, test_key) != test_vectors[32]) {
+		pr_info("self-test 4u64: FAIL\n");
+		ret = -EINVAL;
+	}
+	if (!ret)
+		pr_info("self-tests: pass\n");
+	return ret;
+}
+
+static void __exit siphash_test_exit(void)
+{
+}
+
+module_init(siphash_test_init);
+module_exit(siphash_test_exit);
+
+MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
+MODULE_LICENSE("Dual BSD/GPL");
-- 
2.11.0

^ permalink raw reply related

* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: kbuild test robot @ 2016-12-16  2:14 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: kbuild-all, Netdev, kernel-hardening, LKML, linux-crypto,
	David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds,
	Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum, ak,
	davem, luto, Jason A. Donenfeld, Jean-Philippe Aumasson,
	Daniel J . Bernstein
In-Reply-To: <20161215203003.31989-2-Jason@zx2c4.com>

[-- Attachment #1: Type: text/plain, Size: 1530 bytes --]

Hi Jason,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.9 next-20161215]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Jason-A-Donenfeld/siphash-add-cryptographically-secure-PRF/20161216-092837
config: ia64-allmodconfig (attached as .config)
compiler: ia64-linux-gcc (GCC) 6.2.0
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=ia64 

All errors (new ones prefixed by >>):

   lib/siphash.c: In function 'siphash_unaligned':
>> lib/siphash.c:123:15: error: 'bytes' undeclared (first use in this function)
     case 1: b |= bytes[0];
                  ^~~~~
   lib/siphash.c:123:15: note: each undeclared identifier is reported only once for each function it appears in

vim +/bytes +123 lib/siphash.c

   117		case 7: b |= ((u64)end[6]) << 48;
   118		case 6: b |= ((u64)end[5]) << 40;
   119		case 5: b |= ((u64)end[4]) << 32;
   120		case 4: b |= get_unaligned_le32(end); break;
   121		case 3: b |= ((u64)end[2]) << 16;
   122		case 2: b |= get_unaligned_le16(end); break;
 > 123		case 1: b |= bytes[0];
   124		}
   125	#endif
   126		v3 ^= b;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 45664 bytes --]

^ permalink raw reply

* RE: [virtio-dev] Re: [Qemu-devel] [PATCH v7 1/1] crypto: add virtio-crypto driver
From: Gonglei (Arei) @ 2016-12-16  0:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Zeng, Xin, Halil Pasic, linux-kernel@vger.kernel.org,
	qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org,
	virtualization@lists.linux-foundation.org,
	linux-crypto@vger.kernel.org, Huangweidong (C), Claudio Fontana,
	Luonengjun, Hanweidong (Randy), Xuquan (Quan Xu),
	Wanzongshun (Vincent), stefanha@redhat.com, Zhoujian (jay, Euler)
In-Reply-To: <20161215184153-mutt-send-email-mst@kernel.org>

Hi Michael,

>
> >
> >
> > > Subject: RE: [Qemu-devel] [PATCH v7 1/1] crypto: add virtio-crypto driver
> > >
> > > On Thursday, December 15, 2016 8:45 AM, Gonglei (Arei) Wrote:
> > > < > > diff --git a/drivers/crypto/virtio/virtio_crypto_core.c
> > > < > b/drivers/crypto/virtio/virtio_crypto_core.c
> > > < > > new file mode 100644
> > > < > > index 0000000..c0854a1
> > > < > > --- /dev/null
> > > < > > +++ b/drivers/crypto/virtio/virtio_crypto_core.c
> > > < > > @@ -0,0 +1,474 @@
> > > < > [..]
> > > < > > +
> > > < > > +static void virtcrypto_dataq_callback(struct virtqueue *vq)
> > > < > > +{
> > > < > > +	struct virtio_crypto *vcrypto = vq->vdev->priv;
> > > < > > +	struct virtio_crypto_request *vc_req;
> > > < > > +	unsigned long flags;
> > > < > > +	unsigned int len;
> > > < > > +	struct ablkcipher_request *ablk_req;
> > > < > > +	int error;
> > > < > > +
> > > < > > +	spin_lock_irqsave(&vcrypto->lock, flags);
> > > < >
> > > < > Would it make sense to use a per virtqueue lock
> > > < > like in virtio_blk for example instead of locking on the whole
> > > < > device? OK, it seems you use only one dataqueue, so it
> > > < > may not be that relevant.
> > > < >
> > > < Currently yes, both the backend device (cryptodev-backend-builtin)
> > > < and the frontend driver use one dataqueue.
> > > <
> > >
> > > I think it makes sense to use per virtqueue lock here though it only uses one
> > > queue so far,
> > > but in the spec we already have multi queues support.
> > >
> > Yes, I agree. Will do that in V8 soon.
> > Hope to catch up with Michael's pull request for 4.10.
> >
> > Regards,
> > -Gonglei
> 
> I merged v7, this change will have to wait. Sorry.
> 
That's OK. Thanks!

I can post a separate patch after this pull request. 

Regards,
-Gonglei

^ permalink raw reply

* Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Hannes Frederic Sowa @ 2016-12-16  0:03 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: David Laight, Netdev, kernel-hardening@lists.openwall.com,
	Jean-Philippe Aumasson, LKML, Linux Crypto Mailing List,
	Daniel J . Bernstein, Linus Torvalds, Eric Biggers
In-Reply-To: <CAHmME9rF0pKxfc3p9ThD3LZU2+ZNPeR-4=9MVEE3AJQKBMNA+w@mail.gmail.com>

On 16.12.2016 00:43, Jason A. Donenfeld wrote:
> Hi Hannes,
> 
> Good news.
> 
> On Thu, Dec 15, 2016 at 10:45 PM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
>>> How's that sound?
>>
>> I am still very much concerned about the API.
> 
> Thanks for pushing me and putting up with my daftness... the constant
> folding works absolutely perfectly. I've run several tests. When gcc
> knows that a struct is aligned (say, via __aligned(8)), then it erases
> the branch and makes a direct jump to the aligned code. When it's
> uncertain, it evaluates at runtime. So, now there is a single
> siphash() function that chooses the best one automatically. Behind the
> scene there's siphash_aligned and siphash_unaligned, but nobody needs
> to call these directly. (Should I rename these to have a double
> underscore prefix?) On platforms that have
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, of course all of this
> disappears and everything goes directly to the aligned version.
> 
> So, I think this assuages your concerns entirely. A single API entry
> point that does the right thing.
> 
> Whew! Good thinking, and thanks again for the suggestion.

Awesome, thanks for trying this out. This basically resolves my concern
API-wise so far.

Hannes out. ;)

^ permalink raw reply

* Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Jason A. Donenfeld @ 2016-12-15 23:47 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: David Laight, Netdev, kernel-hardening@lists.openwall.com,
	Jean-Philippe Aumasson, LKML, Linux Crypto Mailing List,
	Daniel J . Bernstein, Linus Torvalds, Eric Biggers
In-Reply-To: <e574d97f-e0b4-f5e2-6d60-88d3ce185249@stressinduktion.org>

On Thu, Dec 15, 2016 at 10:45 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> By the way, if you target net-next, it is currently closed. So no need
> to hurry.

Honestly I have no idea what I'm targeting. The hash function touches
lib/. The secure_seq stuff touches net/. The rng stuff touches
random.c. Shall this be for net-next? For lib-next (doesn't exist)?
For tytso-next? Since a lot of feedback has come from netdev people, I
suspect net-next is the correct answer. In that case, I'll ask Ted for
his sign-off to touch random.c, and then we'll get this queued up in
net-next. Please correct me if this doesn't actually resemble how
things work around here...

Jason

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox