* Re: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function
From: Christian Kujau @ 2016-12-18 0:06 UTC
To: Jason A. Donenfeld
Cc: Tom Herbert, Netdev, kernel-hardening, LKML,
Linux Crypto Mailing List, Jean-Philippe Aumasson,
Daniel J . Bernstein, Linus Torvalds, Eric Biggers, David Laight
In-Reply-To: <CAHmME9qNcsXtdWO_rmngSXXeBsTbA9B_33oLJ_pWOWcO7P2JZg@mail.gmail.com>
On Thu, 15 Dec 2016, Jason A. Donenfeld wrote:
> > I'd still drop the "24" unless you really think we're going to have
> > multiple variants coming into the kernel.
>
> Okay. I don't have a problem with this, unless anybody has some reason
> to the contrary.
What if the 2/4-round version falls and we need more rounds to withstand
future cryptanalysis? We'd then have siphash_ and siphash48_ functions,
no? My amateurish bike-shedding argument would be "let's keep the 24 then" :-)
C.
--
BOFH excuse #354:
Chewing gum on /dev/sd3c
* Re: Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jeffrey Walton @ 2016-12-17 16:14 UTC
To: Theodore Ts'o, kernel-hardening, Jason A. Donenfeld,
George Spelvin, ak, davem, David Laight, D. J. Bernstein,
Eric Biggers, Hannes Frederic Sowa, Jean-Philippe Aumasson,
linux-crypto, LKML, luto, Netdev, Tom Herbert, Linus Torvalds,
Vegard Nossum
In-Reply-To: <20161217154152.5oug7mzb4tmfknwv@thunk.org>
> As far as half-siphash is concerned, it occurs to me that the main
> problem will be those users who need to guarantee that output can't be
> guessed over a long period of time. For example, if you have a
> long-running process, then the output needs to remain unguessable over
> potentially months or years, or else you might be weakening the ASLR
> protections. If on the other hand, the hash table or the process will
> be going away in a matter of seconds or minutes, the requirements with
> respect to cryptographic strength go down significantly.
Perhaps SipHash-4-8 should be used instead of SipHash-2-4. I believe
SipHash-4-8 is recommended for the security conscious who want to be
more conservative in their security estimates.
SipHash-4-8 does not add much more processing. If you are clocking
SipHash-2-4 at 2.0 or 2.5 cpb, then SipHash-4-8 will run at 3.0 to
4.0. Both are well below MD5 times. (At least with the data sets I've
tested).
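To make the round-count comparison concrete, here is a minimal userspace sketch of SipHash with the compression and finalization round counts as parameters. The name siphash_cd and its signature are illustrative assumptions, not anything from the patch series; the structure follows the published SipHash description, and the check values are the first two test vectors quoted in lib/test_siphash.c.

```c
#include <stddef.h>
#include <stdint.h>

static inline uint64_t rotl64(uint64_t x, unsigned s)
{
	return x << s | x >> (64 - s);
}

/* One SipRound on the four-word state v[0..3]. */
static void sip_round(uint64_t v[4])
{
	v[0] += v[1]; v[1] = rotl64(v[1], 13); v[1] ^= v[0]; v[0] = rotl64(v[0], 32);
	v[2] += v[3]; v[3] = rotl64(v[3], 16); v[3] ^= v[2];
	v[0] += v[3]; v[3] = rotl64(v[3], 21); v[3] ^= v[0];
	v[2] += v[1]; v[1] = rotl64(v[1], 17); v[1] ^= v[2]; v[2] = rotl64(v[2], 32);
}

/* Little-endian load of n <= 8 bytes, independent of host byte order. */
static uint64_t load_le(const unsigned char *p, size_t n)
{
	uint64_t v = 0;

	for (size_t i = 0; i < n; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/*
 * SipHash-c-d (hypothetical helper name): c rounds per message word,
 * d finalization rounds. c=2, d=4 gives SipHash-2-4.
 */
uint64_t siphash_cd(unsigned c, unsigned d, const void *data, size_t len,
		    uint64_t k0, uint64_t k1)
{
	const unsigned char *in = data;
	uint64_t v[4] = {
		k0 ^ 0x736f6d6570736575ULL, k1 ^ 0x646f72616e646f6dULL,
		k0 ^ 0x6c7967656e657261ULL, k1 ^ 0x7465646279746573ULL,
	};
	size_t left = len;
	uint64_t m;

	/* Full 64-bit words */
	for (; left >= 8; in += 8, left -= 8) {
		m = load_le(in, 8);
		v[3] ^= m;
		for (unsigned i = 0; i < c; i++)
			sip_round(v);
		v[0] ^= m;
	}
	/* Last word: remaining bytes plus the length (mod 256) in the top byte */
	m = load_le(in, left) | (uint64_t)(len & 0xff) << 56;
	v[3] ^= m;
	for (unsigned i = 0; i < c; i++)
		sip_round(v);
	v[0] ^= m;

	/* Finalization */
	v[2] ^= 0xff;
	for (unsigned i = 0; i < d; i++)
		sip_round(v);
	return v[0] ^ v[1] ^ v[2] ^ v[3];
}
```

For an input of w full words, SipHash-2-4 runs 2(w+1)+4 rounds and SipHash-4-8 runs 4(w+1)+8, so the round count exactly doubles; the measured cpb gap depends on how much of the total time is spent outside the rounds.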
> Now, maybe this doesn't matter that much if we can guarantee (or make
> assumptions) that the attacker doesn't have unlimited access to the
> output stream of get_random_{long,int}(), or if it's being used in an
> anti-DOS use case where it ultimately only needs to be harder than
> alternate ways of attacking the system.
>
> Rekeying every five minutes doesn't necessarily help with respect
> to ASLR, but it might reduce the amount of the output stream that
> would be available to the attacker in order to be able to attack the
> get_random_{long,int}() generator, and it also reduces the value of
> doing that attack to only compromising the ASLR for those processes
> started within that five minute window.
Forgive my ignorance... I did not find any reading on using the primitive
in a PRNG. Does anyone know what Aumasson or Bernstein have to say?
Aumasson's site does not seem to discuss the use case:
https://www.google.com/search?q=siphash+rng+site%3A131002.net. (And
their paper only mentions random-number once in a different context).
Making the leap from internal hash tables and short-lived network
packets to the rng case may leave something to be desired, especially
if the bits get used in unanticipated ways, like creating long term
private keys.
Jeff
* Re: Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Theodore Ts'o @ 2016-12-17 15:41 UTC
To: kernel-hardening
Cc: Jason, linux, ak, davem, David.Laight, djb, ebiggers3, hannes,
jeanphilippe.aumasson, linux-crypto, linux-kernel, luto, netdev,
tom, torvalds, vegard.nossum
In-Reply-To: <20161217021503.32767.qmail@ns.sciencehorizons.net>
On Fri, Dec 16, 2016 at 09:15:03PM -0500, George Spelvin wrote:
> >> - Ted, Andy Lutomirski and I will try to figure out a construction of
> >> get_random_long() that we all like.
We don't have to find the most optimal solution right away; we can
approach this incrementally, after all.
So long as we replace get_random_{long,int}() with something which is
(a) strictly better in terms of security given today's use of MD5, and
(b) which is strictly *faster* than the current construction on 32-bit
and 64-bit systems, we can do that, and can try to make it be faster
while maintaining some minimum level of security which is sufficient
for all current users of get_random_{long,int}() and which can be
clearly articulated for future users of get_random_{long,int}().
The main worry at this point I have is benchmarking siphash on a
32-bit system. It may be that simply batching the chacha20 output so
that we're using the urandom construction more efficiently is the
better way to go, since that *does* meet the criterion of strictly more
secure and strictly faster than the current MD5 solution. I'm open to
using siphash, but I want to see the 32-bit numbers first.
As far as half-siphash is concerned, it occurs to me that the main
problem will be those users who need to guarantee that output can't be
guessed over a long period of time. For example, if you have a
long-running process, then the output needs to remain unguessable over
potentially months or years, or else you might be weakening the ASLR
protections. If on the other hand, the hash table or the process will
be going away in a matter of seconds or minutes, the requirements with
respect to cryptographic strength go down significantly.
Now, maybe this doesn't matter that much if we can guarantee (or make
assumptions) that the attacker doesn't have unlimited access to the
output stream of get_random_{long,int}(), or if it's being used in an
anti-DOS use case where it ultimately only needs to be harder than
alternate ways of attacking the system.
Rekeying every five minutes doesn't necessarily help with respect
to ASLR, but it might reduce the amount of the output stream that
would be available to the attacker in order to be able to attack the
get_random_{long,int}() generator, and it also reduces the value of
doing that attack to only compromising the ASLR for those processes
started within that five minute window.
Cheers,
- Ted
P.S. I'm using ASLR as an example use case, above; of course we will
need to make similar examinations of the other uses of
get_random_{long,int}().
P.P.S. We might also want to think about potentially defining
get_random_{long,int}() to be unambiguously strong, and then creating
a get_weak_random_{long,int}() which, on platforms where performance
might be a consideration, uses a weaker algorithm, perhaps with some
kind of rekeying interval.
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: George Spelvin @ 2016-12-17 15:21 UTC
To: tom
Cc: ak, davem, David.Laight, djb, ebiggers3, hannes, Jason,
jeanphilippe.aumasson, kernel-hardening, linux-crypto,
linux-kernel, linux, luto, netdev, torvalds, tytso, vegard.nossum
In-Reply-To: <CALx6S351VFRZmEQphRQy6YtmZYPnOtTN7=XiNrJmhWJGv4HUBg@mail.gmail.com>
To follow up on my comments that your benchmark results were peculiar,
here's my benchmark code.
It just computes the hash of all n*(n+1)/2 possible non-empty substrings
of a buffer of n (called "max" below) bytes. "cpb" is "cycles per byte".
(The average length is (n+2)/3, cf. https://oeis.org/A000292)
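As a hedged aside, the byte total that the benchmark accumulates in a loop has a simple closed form. The sketch below (function names are illustrative only) checks the loop against the tetrahedral number n(n+1)(n+2)/6; dividing by the n(n+1)/2 substrings gives the (n+2)/3 average length.

```c
#include <stdint.h>

/* Total bytes hashed: each length i occurs (n + 1 - i) times. */
uint64_t total_bytes_loop(uint64_t n)
{
	uint64_t bytes = 0;

	for (uint64_t i = 1; i <= n; i++)
		bytes += i * (n + 1 - i);
	return bytes;
}

/* Closed form: the n-th tetrahedral number (OEIS A000292). */
uint64_t total_bytes_closed(uint64_t n)
{
	return n * (n + 1) * (n + 2) / 6;
}

/* Number of non-empty substrings of an n-byte buffer. */
uint64_t substring_count(uint64_t n)
{
	return n * (n + 1) / 2;
}
```

For max=4 this gives 20 bytes over 10 substrings, which matches the first result line: 10495 cycles / 20 bytes = 524.75 cpb.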
On x86-32, HSipHash is asymptotically about twice the speed of SipHash,
rising to roughly 2.5x to 3x for short strings:
SipHash/HSipHash benchmark, sizeof(long) = 4
SipHash: max= 4 cycles= 10495 cpb=524.7500 (sum=47a4f5554869fa97)
HSipHash: max= 4 cycles= 3400 cpb=170.0000 (sum=146a863e)
SipHash: max= 8 cycles= 24468 cpb=203.9000 (sum=21c41a86355affcc)
HSipHash: max= 8 cycles= 9237 cpb= 76.9750 (sum=d3b5e0cd)
SipHash: max= 16 cycles= 94622 cpb=115.9583 (sum=26d816b72721e48f)
HSipHash: max= 16 cycles= 34499 cpb= 42.2782 (sum=16bb7475)
SipHash: max= 32 cycles= 418767 cpb= 69.9811 (sum=dd5a97694b8a832d)
HSipHash: max= 32 cycles= 156695 cpb= 26.1857 (sum=eed00fcb)
SipHash: max= 64 cycles= 2119152 cpb= 46.3101 (sum=a2a725aecc09ed00)
HSipHash: max= 64 cycles= 1008678 cpb= 22.0428 (sum=99b9f4f)
SipHash: max= 128 cycles= 12728659 cpb= 35.5788 (sum=420878cd20272817)
HSipHash: max= 128 cycles= 5452931 cpb= 15.2419 (sum=f1f4ad18)
SipHash: max= 256 cycles= 38931946 cpb= 13.7615 (sum=e05dfb28b90dfd98)
HSipHash: max= 256 cycles= 13807312 cpb= 4.8805 (sum=ceeafcc1)
SipHash: max= 512 cycles= 205537380 cpb= 9.1346 (sum=7d129d4de145fbea)
HSipHash: max= 512 cycles= 103420960 cpb= 4.5963 (sum=7f15a313)
SipHash: max=1024 cycles=1540259472 cpb= 8.5817 (sum=cca7cbdc778ca8af)
HSipHash: max=1024 cycles= 796090824 cpb= 4.4355 (sum=d8f3374f)
On x86-64, SipHash is faster for everything except the shortest case
(HSipHash wins only at max=4), asymptotically approaching 2x for long
strings:
SipHash/HSipHash benchmark, sizeof(long) = 8
SipHash: max= 4 cycles= 2642 cpb=132.1000 (sum=47a4f5554869fa97)
HSipHash: max= 4 cycles= 2498 cpb=124.9000 (sum=146a863e)
SipHash: max= 8 cycles= 5270 cpb= 43.9167 (sum=21c41a86355affcc)
HSipHash: max= 8 cycles= 7140 cpb= 59.5000 (sum=d3b5e0cd)
SipHash: max= 16 cycles= 19950 cpb= 24.4485 (sum=26d816b72721e48f)
HSipHash: max= 16 cycles= 23546 cpb= 28.8554 (sum=16bb7475)
SipHash: max= 32 cycles= 80188 cpb= 13.4004 (sum=dd5a97694b8a832d)
HSipHash: max= 32 cycles= 101218 cpb= 16.9148 (sum=eed00fcb)
SipHash: max= 64 cycles= 373286 cpb= 8.1575 (sum=a2a725aecc09ed00)
HSipHash: max= 64 cycles= 535568 cpb= 11.7038 (sum=99b9f4f)
SipHash: max= 128 cycles= 2075224 cpb= 5.8006 (sum=420878cd20272817)
HSipHash: max= 128 cycles= 3336820 cpb= 9.3270 (sum=f1f4ad18)
SipHash: max= 256 cycles= 14276278 cpb= 5.0463 (sum=e05dfb28b90dfd98)
HSipHash: max= 256 cycles= 28847880 cpb= 10.1970 (sum=ceeafcc1)
SipHash: max= 512 cycles= 50135180 cpb= 2.2281 (sum=7d129d4de145fbea)
HSipHash: max= 512 cycles= 86145916 cpb= 3.8286 (sum=7f15a313)
SipHash: max=1024 cycles= 334111900 cpb= 1.8615 (sum=cca7cbdc778ca8af)
HSipHash: max=1024 cycles= 640432452 cpb= 3.5682 (sum=d8f3374f)
Here's the code; compile with -DSELFTEST. (The main purpose of
printing the sum is to prevent dead code elimination.)
#if SELFTEST
#include <stdint.h>
#include <stdlib.h>
static inline uint64_t rol64(uint64_t word, unsigned int shift)
{
return word << shift | word >> (64 - shift);
}
static inline uint32_t rol32(uint32_t word, unsigned int shift)
{
return word << shift | word >> (32 - shift);
}
static inline uint64_t get_unaligned_le64(void const *p)
{
return *(uint64_t const *)p;
}
static inline uint32_t get_unaligned_le32(void const *p)
{
return *(uint32_t const *)p;
}
static inline uint64_t le64_to_cpup(uint64_t const *p)
{
return *p;
}
static inline uint32_t le32_to_cpup(uint32_t const *p)
{
return *p;
}
#else
#include <linux/bitops.h> /* For rol64 */
#include <linux/cryptohash.h>
#include <asm/byteorder.h>
#include <asm/unaligned.h>
#endif
/* The basic ARX mixing function, taken from Skein */
#define SIP_MIX(a, b, s) ((a) += (b), (b) = rol64(b, s), (b) ^= (a))
/*
* The complete SipRound. Note that, when unrolled twice like below,
* the 32-bit rotates drop out on 32-bit machines.
*/
#define SIP_ROUND(a, b, c, d) \
(SIP_MIX(a, b, 13), SIP_MIX(c, d, 16), (a) = rol64(a, 32), \
SIP_MIX(c, b, 17), SIP_MIX(a, d, 21), (c) = rol64(c, 32))
/*
* This is rolled up more than most implementations, resulting in about
* 55% the code size. Speed is a few percent slower. A crude benchmark
* (for (i=1; i <= max; i++) for (j = 0; j < 4096-i; j++) hash(buf+j, i);)
* produces the following timings (in usec):
*
*              i386     i386     i386   x86_64   x86_64   x86_64   x86_64
* Length      small   unroll  halfmd4    small   unroll  halfmd4  teahash
* 1..4         1069     1029     1608      195      160      399      690
* 1..8         2483     2381     3851      410      360      988     1659
* 1..12        4303     4152     6207      690      618     1642     2690
* 1..16        6122     5931     8668      968      876     2363     3786
* 1..20        8348     8137    11245     1323     1185     3162     5567
* 1..24       10580    10327    13935     1657     1504     4066     7635
* 1..28       13211    12956    16803     2069     1871     5028     9759
* 1..32       15843    15572    19725     2470     2260     6084    11932
* 1..36       18864    18609    24259     2934     2678     7566    14794
* 1..1024   5890194  6130242 10264816   881933   881244  3617392  7589036
*
* The performance penalty is quite minor, decreasing for long strings,
* and it's significantly faster than half_md4, so I'm going for the
* I-cache win.
*/
uint64_t
siphash24(char const *in, size_t len, uint32_t const seed[4])
{
uint64_t a = 0x736f6d6570736575; /* somepseu */
uint64_t b = 0x646f72616e646f6d; /* dorandom */
uint64_t c = 0x6c7967656e657261; /* lygenera */
uint64_t d = 0x7465646279746573; /* tedbytes */
uint64_t m = 0;
uint8_t padbyte = len;
m = seed[2] | (uint64_t)seed[3] << 32;
b ^= m;
d ^= m;
m = seed[0] | (uint64_t)seed[1] << 32;
/* a ^= m; is done in loop below */
c ^= m;
/*
* By using the same SipRound code for all iterations, we
* save space, at the expense of some branch prediction. But
* branch prediction is hard because of variable length anyway.
*/
len = len/8 + 3; /* Now number of rounds to perform */
do {
a ^= m;
switch (--len) {
unsigned bytes;
default: /* Full words */
d ^= m = get_unaligned_le64(in);
in += 8;
break;
case 2: /* Final partial word */
/*
* We'd like to do one 64-bit fetch rather than
* mess around with bytes, but reading past the end
* might hit a protection boundary. Fortunately,
* we know that protection boundaries are aligned,
* so we can consider only three cases:
* - The remainder occupies zero words
* - The remainder fits into one word
* - The remainder straddles two words
*/
bytes = padbyte & 7;
if (bytes == 0) {
m = 0;
} else {
unsigned offset = (unsigned)(uintptr_t)in & 7;
if (offset + bytes <= 8) {
m = le64_to_cpup((uint64_t const *)
(in - offset));
m >>= 8*offset;
} else {
m = get_unaligned_le64(in);
}
m &= ((uint64_t)1 << 8*bytes) - 1;
}
/* Could use | or +, but ^ allows associativity */
d ^= m ^= (uint64_t)padbyte << 56;
break;
case 1: /* Beginning of finalization */
m = 0;
c ^= 0xff;
/*FALLTHROUGH*/
case 0: /* Second half of finalization */
break;
}
SIP_ROUND(a, b, c, d);
SIP_ROUND(a, b, c, d);
} while (len);
return a ^ b ^ c ^ d;
}
#undef SIP_ROUND
#undef SIP_MIX
#define HSIP_MIX(a, b, s) ((a) += (b), (b) = rol32(b, s), (b) ^= (a))
/*
* These are the PRELIMINARY rotate constants suggested by
* Jean-Philippe Aumasson. Update to final when available.
*/
#define HSIP_ROUND(a, b, c, d) \
(HSIP_MIX(a, b, 5), HSIP_MIX(c, d, 8), (a) = rol32(a, 16), \
HSIP_MIX(c, b, 7), HSIP_MIX(a, d, 13), (c) = rol32(c, 16))
uint32_t
hsiphash24(char const *in, size_t len, uint32_t const key[2])
{
uint32_t c = key[0];
uint32_t d = key[1];
uint32_t a = 0x6c796765 ^ 0x736f6d65;
uint32_t b = d ^ 0x74656462 ^ 0x646f7261;
uint32_t m = c;
uint8_t padbyte = len;
/*
* By using the same SipRound code for all iterations, we
* save space, at the expense of some branch prediction. But
* branch prediction is hard because of variable length anyway.
*/
len = len/sizeof(m) + 3; /* Now number of rounds to perform */
do {
a ^= m;
switch (--len) {
unsigned bytes;
default: /* Full words */
d ^= m = get_unaligned_le32(in);
in += sizeof(m);
break;
case 2: /* Final partial word */
/*
* We'd like to do one 32-bit fetch rather than
* mess around with bytes, but reading past the end
* might hit a protection boundary. Fortunately,
* we know that protection boundaries are aligned,
* so we can consider only three cases:
* - The remainder occupies zero words
* - The remainder fits into one word
* - The remainder straddles two words
*/
bytes = padbyte & 3;
if (bytes == 0) {
m = 0;
} else {
unsigned offset = (unsigned)(uintptr_t)in & 3;
if (offset + bytes <= 4) {
m = le32_to_cpup((uint32_t const *)
(in - offset));
m >>= 8*offset;
} else {
m = get_unaligned_le32(in);
}
m &= ((uint32_t)1 << 8*bytes) - 1;
}
/* Could use | or +, but ^ allows associativity */
d ^= m ^= (uint32_t)padbyte << 24;
break;
case 1: /* Beginning of finalization */
m = 0;
c ^= 0xff;
/*FALLTHROUGH*/
case 0: /* Second half of finalization */
break;
}
HSIP_ROUND(a, b, c, d);
HSIP_ROUND(a, b, c, d);
} while (len);
return a ^ b ^ c ^ d;
// return c + d;
}
#undef HSIP_ROUND
#undef HSIP_MIX
/*
* No objection to EXPORT_SYMBOL, but we should probably figure out
* how the seed[] array should work first. Homework for the first
* person to want to call it from a module!
*/
#if SELFTEST
#include <stdio.h>
static uint64_t rdtsc(void)
{
	uint32_t eax, edx;
	asm volatile ("rdtsc" : "=a" (eax), "=d" (edx));
	return (uint64_t)edx << 32 | eax;
}
int
main(void)
{
	static char const buf[1024] = { 0 };
	static const uint32_t key[4] = { 1, 2, 3, 4 };
	printf("SipHash/HSipHash benchmark, sizeof(long) = %u\n",
		(unsigned)sizeof(long));
	for (unsigned max = 4; max <= 1024; max *= 2) {
		uint64_t sum1 = 0;
		uint32_t sum2 = 0;
		uint64_t cycles;
		uint32_t bytes = 0;
		/* A less lazy person could figure out the closed form */
		for (unsigned i = 1; i <= max; i++)
			bytes += i * (max + 1 - i);
		cycles = rdtsc();
		for (unsigned i = 1; i <= max; i++)
			for (unsigned j = 0; j <= max-i; j++)
				sum1 += siphash24(buf+j, i, key);
		cycles = rdtsc() - cycles;
		/* Casts make the arguments match the format on every ABI */
		printf(" SipHash: max=%4u cycles=%10llu cpb=%8.4f (sum=%llx)\n",
			max, (unsigned long long)cycles, (double)cycles/bytes,
			(unsigned long long)sum1);
		cycles = rdtsc();
		for (unsigned i = 1; i <= max; i++)
			for (unsigned j = 0; j <= max-i; j++)
				sum2 += hsiphash24(buf+j, i, key);
		cycles = rdtsc() - cycles;
		printf("HSipHash: max=%4u cycles=%10llu cpb=%8.4f (sum=%lx)\n",
			max, (unsigned long long)cycles, (double)cycles/bytes,
			(unsigned long)sum2);
	}
	return 0;
}
#endif
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jeffrey Walton @ 2016-12-17 14:55 UTC
To: Jason A. Donenfeld
Cc: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto,
Jean-Philippe Aumasson, Daniel J . Bernstein
In-Reply-To: <20161215203003.31989-2-Jason@zx2c4.com>
> diff --git a/lib/test_siphash.c b/lib/test_siphash.c
> new file mode 100644
> index 000000000000..93549e4e22c5
> --- /dev/null
> +++ b/lib/test_siphash.c
> @@ -0,0 +1,83 @@
> +/* Test cases for siphash.c
> + *
> + * Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
> + *
> + * This file is provided under a dual BSD/GPLv2 license.
> + *
> + * SipHash: a fast short-input PRF
> + * https://131002.net/siphash/
> + *
> + * This implementation is specifically for SipHash2-4.
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include <linux/siphash.h>
> +#include <linux/kernel.h>
> +#include <linux/string.h>
> +#include <linux/errno.h>
> +#include <linux/module.h>
> +
> +/* Test vectors taken from official reference source available at:
> + * https://131002.net/siphash/siphash24.c
> + */
> +static const u64 test_vectors[64] = {
> + 0x726fdb47dd0e0e31ULL, 0x74f839c593dc67fdULL, 0x0d6c8009d9a94f5aULL,
> + 0x85676696d7fb7e2dULL, 0xcf2794e0277187b7ULL, 0x18765564cd99a68dULL,
> + 0xcbc9466e58fee3ceULL, 0xab0200f58b01d137ULL, 0x93f5f5799a932462ULL,
> + 0x9e0082df0ba9e4b0ULL, 0x7a5dbbc594ddb9f3ULL, 0xf4b32f46226bada7ULL,
> + 0x751e8fbc860ee5fbULL, 0x14ea5627c0843d90ULL, 0xf723ca908e7af2eeULL,
> + 0xa129ca6149be45e5ULL, 0x3f2acc7f57c29bdbULL, 0x699ae9f52cbe4794ULL,
> + 0x4bc1b3f0968dd39cULL, 0xbb6dc91da77961bdULL, 0xbed65cf21aa2ee98ULL,
> + 0xd0f2cbb02e3b67c7ULL, 0x93536795e3a33e88ULL, 0xa80c038ccd5ccec8ULL,
> + 0xb8ad50c6f649af94ULL, 0xbce192de8a85b8eaULL, 0x17d835b85bbb15f3ULL,
> + 0x2f2e6163076bcfadULL, 0xde4daaaca71dc9a5ULL, 0xa6a2506687956571ULL,
> + 0xad87a3535c49ef28ULL, 0x32d892fad841c342ULL, 0x7127512f72f27cceULL,
> + 0xa7f32346f95978e3ULL, 0x12e0b01abb051238ULL, 0x15e034d40fa197aeULL,
> + 0x314dffbe0815a3b4ULL, 0x027990f029623981ULL, 0xcadcd4e59ef40c4dULL,
> + 0x9abfd8766a33735cULL, 0x0e3ea96b5304a7d0ULL, 0xad0c42d6fc585992ULL,
> + 0x187306c89bc215a9ULL, 0xd4a60abcf3792b95ULL, 0xf935451de4f21df2ULL,
> + 0xa9538f0419755787ULL, 0xdb9acddff56ca510ULL, 0xd06c98cd5c0975ebULL,
> + 0xe612a3cb9ecba951ULL, 0xc766e62cfcadaf96ULL, 0xee64435a9752fe72ULL,
> + 0xa192d576b245165aULL, 0x0a8787bf8ecb74b2ULL, 0x81b3e73d20b49b6fULL,
> + 0x7fa8220ba3b2eceaULL, 0x245731c13ca42499ULL, 0xb78dbfaf3a8d83bdULL,
> + 0xea1ad565322a1a0bULL, 0x60e61c23a3795013ULL, 0x6606d7e446282b93ULL,
> + 0x6ca4ecb15c5f91e1ULL, 0x9f626da15c9625f3ULL, 0xe51b38608ef25f57ULL,
> + 0x958a324ceb064572ULL
> +};
> +static const siphash_key_t test_key =
> + { 0x0706050403020100ULL , 0x0f0e0d0c0b0a0908ULL };
> +
> +static int __init siphash_test_init(void)
> +{
> + u8 in[64] __aligned(SIPHASH_ALIGNMENT);
> + u8 in_unaligned[65];
> + u8 i;
> + int ret = 0;
> +
> + for (i = 0; i < 64; ++i) {
> + in[i] = i;
> + in_unaligned[i + 1] = i;
> + if (siphash(in, i, test_key) != test_vectors[i]) {
> + pr_info("self-test aligned %u: FAIL\n", i + 1);
> + ret = -EINVAL;
> + }
> + if (siphash_unaligned(in_unaligned + 1, i, test_key) != test_vectors[i]) {
> + pr_info("self-test unaligned %u: FAIL\n", i + 1);
> + ret = -EINVAL;
> + }
> + }
> + if (!ret)
> + pr_info("self-tests: pass\n");
> + return ret;
> +}
> +
> +static void __exit siphash_test_exit(void)
> +{
> +}
> +
> +module_init(siphash_test_init);
> +module_exit(siphash_test_exit);
> +
> +MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
> +MODULE_LICENSE("Dual BSD/GPL");
> --
> 2.11.0
>
I believe the output of SipHash depends upon endianness. Folks who
request a digest through the af_alg interface will likely expect a
byte array.
I think that means on little endian machines, values like element 0
must be byte reversed:
0x726fdb47dd0e0e31ULL => 31,0e,0e,dd,47,db,6f,72
If I am not mistaken, that value (and other tv's) are returned here:
return (v0 ^ v1) ^ (v2 ^ v3);
It may be prudent to include the endian reversal in the test to ensure
big endian machines produce expected results. Some closely related
testing on an old Apple PowerMac G5 revealed that the result needed to be
byte reversed before returning it to a caller.
Jeff
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: George Spelvin @ 2016-12-17 12:42 UTC
To: Jason
Cc: ak, davem, David.Laight, djb, ebiggers3, hannes,
jeanphilippe.aumasson, kernel-hardening, linux-crypto,
linux-kernel, linux, luto, netdev, tom, torvalds, tytso,
vegard.nossum
In-Reply-To: <CAHmME9rxCYfwyF6EADWqpAEt+yqCPgCLUVH0FPdAy7r-oPnrRg@mail.gmail.com>
BTW, here's some SipHash code I wrote for Linux a while ago.
My target application was ext4 directory hashing, resulting in different
implementation choices, although I still think that a rolled-up
implementation like this is reasonable. Reducing I-cache impact speeds
up the calling code.
One thing I'd like to suggest you steal is the way it handles the
fetch of the final partial word. It's a lot smaller and faster than
an 8-way case statement.
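In isolation, the partial-word fetch can be sketched as below. This is a userspace illustration with hypothetical names, not the code from the patch. It assumes, as the hash code does, that protection boundaries are at least 8-byte aligned, so loading the aligned word that contains the tail cannot fault. Strictly portable C does not permit reading outside an object, so the self-check keeps every load inside its own buffer. The byte-wise load_le64() stands in for a single 64-bit little-endian load.

```c
#include <stddef.h>
#include <stdint.h>

/* Endian-independent little-endian load of exactly 8 bytes. */
static uint64_t load_le64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/* Fetch the final `bytes` (0..7) bytes at `in` with one 8-byte load. */
uint64_t load_tail(const unsigned char *in, unsigned bytes)
{
	unsigned offset = (unsigned)((uintptr_t)in & 7);
	uint64_t m;

	if (bytes == 0)
		return 0;
	if (offset + bytes <= 8)
		/* Tail fits in one aligned word: load it, shift the tail down */
		m = load_le64(in - offset) >> (8 * offset);
	else
		/* Tail straddles two aligned words, so both are mapped */
		m = load_le64(in);
	/* Drop everything above the tail bytes */
	return m & (((uint64_t)1 << (8 * bytes)) - 1);
}

/* Reference: assemble the tail one byte at a time. */
uint64_t load_tail_bytewise(const unsigned char *in, unsigned bytes)
{
	uint64_t m = 0;

	for (unsigned i = 0; i < bytes; i++)
		m |= (uint64_t)in[i] << (8 * i);
	return m;
}

/* Returns 1 if both loaders agree for every offset and tail length. */
int load_tail_selfcheck(void)
{
	_Alignas(8) unsigned char buf[16];

	for (int i = 0; i < 16; i++)
		buf[i] = (unsigned char)(0xa0 + i);
	for (unsigned off = 0; off < 8; off++)
		for (unsigned bytes = 0; bytes < 8; bytes++)
			if (load_tail(buf + off, bytes) !=
			    load_tail_bytewise(buf + off, bytes))
				return 0;
	return 1;
}
```

The three branches correspond to the three cases listed in the code comment: zero remainder, remainder within one aligned word, and remainder straddling two words.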
#include <linux/bitops.h> /* For rol64 */
#include <linux/cryptohash.h>
#include <asm/byteorder.h>
#include <asm/unaligned.h>
/* The basic ARX mixing function, taken from Skein */
#define SIP_MIX(a, b, s) ((a) += (b), (b) = rol64(b, s), (b) ^= (a))
/*
* The complete SipRound. Note that, when unrolled twice like below,
* the 32-bit rotates drop out on 32-bit machines.
*/
#define SIP_ROUND(a, b, c, d) \
(SIP_MIX(a, b, 13), SIP_MIX(c, d, 16), (a) = rol64(a, 32), \
SIP_MIX(c, b, 17), SIP_MIX(a, d, 21), (c) = rol64(c, 32))
/*
* This is rolled up more than most implementations, resulting in about
* 55% the code size. Speed is a few percent slower. A crude benchmark
* (for (i=1; i <= max; i++) for (j = 0; j < 4096-i; j++) hash(buf+j, i);)
* produces the following timings (in usec):
*
*              i386     i386     i386   x86_64   x86_64   x86_64   x86_64
* Length      small   unroll  halfmd4    small   unroll  halfmd4  teahash
* 1..4         1069     1029     1608      195      160      399      690
* 1..8         2483     2381     3851      410      360      988     1659
* 1..12        4303     4152     6207      690      618     1642     2690
* 1..16        6122     5931     8668      968      876     2363     3786
* 1..20        8348     8137    11245     1323     1185     3162     5567
* 1..24       10580    10327    13935     1657     1504     4066     7635
* 1..28       13211    12956    16803     2069     1871     5028     9759
* 1..32       15843    15572    19725     2470     2260     6084    11932
* 1..36       18864    18609    24259     2934     2678     7566    14794
* 1..1024   5890194  6130242 10264816   881933   881244  3617392  7589036
*
* The performance penalty is quite minor, decreasing for long strings,
* and it's significantly faster than half_md4, so I'm going for the
* I-cache win.
*/
uint64_t
siphash24(char const *in, size_t len, uint32_t const seed[4])
{
uint64_t a = 0x736f6d6570736575; /* somepseu */
uint64_t b = 0x646f72616e646f6d; /* dorandom */
uint64_t c = 0x6c7967656e657261; /* lygenera */
uint64_t d = 0x7465646279746573; /* tedbytes */
uint64_t m = 0;
uint8_t padbyte = len;
/*
* Mix in the 128-bit hash seed. This is in a format convenient
* to the ext3/ext4 code. Please feel free to adapt the
* */
if (seed) {
m = seed[2] | (uint64_t)seed[3] << 32;
b ^= m;
d ^= m;
m = seed[0] | (uint64_t)seed[1] << 32;
/* a ^= m; is done in loop below */
c ^= m;
}
/*
* By using the same SipRound code for all iterations, we
* save space, at the expense of some branch prediction. But
* branch prediction is hard because of variable length anyway.
*/
len = len/8 + 3; /* Now number of rounds to perform */
do {
a ^= m;
switch (--len) {
unsigned bytes;
default: /* Full words */
d ^= m = get_unaligned_le64(in);
in += 8;
break;
case 2: /* Final partial word */
/*
* We'd like to do one 64-bit fetch rather than
* mess around with bytes, but reading past the end
* might hit a protection boundary. Fortunately,
* we know that protection boundaries are aligned,
* so we can consider only three cases:
* - The remainder occupies zero words
* - The remainder fits into one word
* - The remainder straddles two words
*/
bytes = padbyte & 7;
if (bytes == 0) {
m = 0;
} else {
unsigned offset = (unsigned)(uintptr_t)in & 7;
if (offset + bytes <= 8) {
m = le64_to_cpup((uint64_t const *)
(in - offset));
m >>= 8*offset;
} else {
m = get_unaligned_le64(in);
}
m &= ((uint64_t)1 << 8*bytes) - 1;
}
/* Could use | or +, but ^ allows associativity */
d ^= m ^= (uint64_t)padbyte << 56;
break;
case 1: /* Beginning of finalization */
m = 0;
c ^= 0xff;
/*FALLTHROUGH*/
case 0: /* Second half of finalization */
break;
}
SIP_ROUND(a, b, c, d);
SIP_ROUND(a, b, c, d);
} while (len);
return a ^ b ^ c ^ d;
}
#undef SIP_ROUND
#undef SIP_MIX
/*
* No objection to EXPORT_SYMBOL, but we should probably figure out
* how the seed[] array should work first. Homework for the first
* person to want to call it from a module!
*/
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: George Spelvin @ 2016-12-17 2:15 UTC
To: Jason, linux
Cc: ak, davem, David.Laight, djb, ebiggers3, hannes,
jeanphilippe.aumasson, kernel-hardening, linux-crypto,
linux-kernel, luto, netdev, tom, torvalds, tytso, vegard.nossum
In-Reply-To: <CAHmME9r4YNqNSZ-KXAHtJN_vm+eL1tSoC-6muHaFUN6fWhkO2g@mail.gmail.com>
> I already did this. Check my branch.
Do you think it should return "u32" (as you currently have it) or
"unsigned long"? I thought the latter, since it doesn't cost any
more and makes more
> I wonder if this could also lead to a similar aliasing
> with arch_get_random_int, since I'm pretty sure all rdrand-like
> instructions return native word size anyway.
Well, Intel's can return 16, 32 or 64 bits, and it makes a
small difference with reseed scheduling.
>> - Ted, Andy Lutomirski and I will try to figure out a construction of
>> get_random_long() that we all like.
> And me, I hope... No need to make this exclusive.
Gaah, engage brain before fingers. That was so obvious I didn't say
it, and the result came out sounding extremely rude.
A better (but longer) way to write it would be "I'm sorry that I, Ted,
and Andy are all arguing with you and each other about how to do this
and we can't finalize this part yet".
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jason A. Donenfeld @ 2016-12-17 1:39 UTC
To: George Spelvin
Cc: Andi Kleen, David Miller, David Laight, Daniel J . Bernstein,
Eric Biggers, Hannes Frederic Sowa, Jean-Philippe Aumasson,
kernel-hardening, Linux Crypto Mailing List, LKML,
Andy Lutomirski, Netdev, Tom Herbert, Linus Torvalds,
Theodore Ts'o, Vegard Nossum
In-Reply-To: <20161216234408.30174.qmail@ns.sciencehorizons.net>
On Sat, Dec 17, 2016 at 12:44 AM, George Spelvin
<linux@sciencehorizons.net> wrote:
> The advice I'd give now is:
> - Implement
> unsigned long hsiphash(const void *data, size_t len, const unsigned long key[2])
> .. as SipHash on 64-bit (maybe SipHash-1-3, still being discussed) and
> HalfSipHash on 32-bit.
I already did this. Check my branch.
> - Document when it may or may not be used carefully.
Good idea. I'll write up some extensive documentation about all of
this, detailing use cases and our various conclusions.
> - #define get_random_int (unsigned)get_random_long
That's a good idea, since ultimately the other just casts in the
return value. I wonder if this could also lead to a similar aliasing
with arch_get_random_int, since I'm pretty sure all rdrand-like
instructions return native word size anyway.
> - Ted, Andy Lutomirski and I will try to figure out a construction of
> get_random_long() that we all like.
And me, I hope... No need to make this exclusive.
Jason
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: George Spelvin @ 2016-12-16 23:44 UTC
To: Jason, linux
Cc: ak, davem, David.Laight, djb, ebiggers3, hannes,
jeanphilippe.aumasson, kernel-hardening, linux-crypto,
linux-kernel, luto, netdev, tom, torvalds, tytso, vegard.nossum
In-Reply-To: <CAHmME9oEhmqW3320Ch+Rczu_=CxQyUQXCGLnYjDm-CYbWugnSw@mail.gmail.com>
> 64-bit security for an RNG is not reasonable even with rekeying. No no
> no. Considering we already have a massive speed-up here with the
> secure version, there's zero reason to start weakening the security
> because we're trigger happy with our benchmarks. No no no.
Just to clarify, I was discussing the idea with Ted (who's in charge of
the whole thing, not me), not trying to make any sort of final decision
on the subject. I need to look at the various users (46 non-trivial ones
for get_random_int, 15 for get_random_long) and see what their security
requirements actually are.
I'm also trying to see if HalfSipHash can be used in a way that gives
slightly more than 64 bits of effective security.
The problem is that the old MD5-based transform had unclear, but
obviously ample, security. There were 64 bytes of global secret and
16 chaining bytes per CPU. Adapting SipHash (even the full version)
takes more thinking.
An actual HalfSipHash-based equivalent to the existing code would be:
#define RANDOM_INT_WORDS (64 / sizeof(long)) /* 16 or 8 */
static u32 random_int_secret[RANDOM_INT_WORDS]
	____cacheline_aligned __read_mostly;
static DEFINE_PER_CPU(unsigned long[4], get_random_int_hash)
	__aligned(sizeof(unsigned long));
unsigned long get_random_long(void)
{
	unsigned long *hash = get_cpu_var(get_random_int_hash);
	unsigned long v0 = hash[0], v1 = hash[1], v2 = hash[2], v3 = hash[3];
	int i;
	/* This could be improved, but it's equivalent */
	v0 += current->pid + jiffies + random_get_entropy();
	for (i = 0; i < RANDOM_INT_WORDS; i++) {
		v3 ^= random_int_secret[i];
		HSIPROUND;
		HSIPROUND;
		v0 ^= random_int_secret[i];
	}
	/* To be equivalent, we *don't* finalize the transform */
	hash[0] = v0; hash[1] = v1; hash[2] = v2; hash[3] = v3;
	put_cpu_var(get_random_int_hash);
	return v0 ^ v1 ^ v2 ^ v3;
}
I don't think there's a 2^64 attack on that.
But 64 bytes of global secret is ridiculous if the hash function
doesn't require that minimum block size. It'll take some thinking.
The advice I'd give now is:
- Implement
unsigned long hsiphash(const void *data, size_t len, const unsigned long key[2])
.. as SipHash on 64-bit (maybe SipHash-1-3, still being discussed) and
HalfSipHash on 32-bit.
- Document when it may or may not be used carefully.
- #define get_random_int (unsigned)get_random_long
- Ted, Andy Lutorminski and I will try to figure out a construction of
get_random_long() that we all like.
('scuse me for a few hours, I have some unrelated things I really *should*
be working on...)
* Re: [kernel-hardening] Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: George Spelvin @ 2016-12-16 22:41 UTC (permalink / raw)
To: Jason, kernel-hardening
Cc: ak, davem, David.Laight, djb, ebiggers3, hannes,
jeanphilippe.aumasson, linux-crypto, linux-kernel, linux, luto,
netdev, tom, torvalds, tytso, vegard.nossum
In-Reply-To: <CAHmME9qPx3WUHF3__3wNOXr-AUti4WPO1qDiFus3Zr133FyV1g@mail.gmail.com>
An idea I had which might be useful:
You could perhaps save two rounds in siphash_*u64.
The final word with the length (called "b" in your implementation)
only needs to be there if the input is variable-sized.
If every use of a given key is of a fixed-size input, you don't need
a length suffix. When the input is an even number of words, that can
save you two rounds.
This requires an audit of callers (e.g. you have to use different
keys for IPv4 and IPv6 ISNs), but can save time.
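To make the saving concrete, here is a hedged userspace sketch of SipHash-2-4 over exactly two 64-bit words, with the length word optional. The function name `sip_2u64` and the `with_length` flag are hypothetical and are not the interface in the patch; the constants and round function are those of the published SipHash-2-4 design.

```c
#include <stdint.h>

static inline uint64_t rol64(uint64_t x, unsigned int b)
{
	return (x << b) | (x >> (64 - b));
}

static inline void sipround(uint64_t v[4])
{
	v[0] += v[1]; v[1] = rol64(v[1], 13); v[1] ^= v[0]; v[0] = rol64(v[0], 32);
	v[2] += v[3]; v[3] = rol64(v[3], 16); v[3] ^= v[2];
	v[0] += v[3]; v[3] = rol64(v[3], 21); v[3] ^= v[0];
	v[2] += v[1]; v[1] = rol64(v[1], 17); v[1] ^= v[2]; v[2] = rol64(v[2], 32);
}

/* Absorb one message word: two compression rounds (the "2" in 2-4). */
static inline void absorb(uint64_t v[4], uint64_t m)
{
	v[3] ^= m;
	sipround(v);
	sipround(v);
	v[0] ^= m;
}

/*
 * Hash two 64-bit words. With with_length != 0 the usual final word
 * (length 16 in the top byte) is absorbed too, costing two more rounds.
 */
static inline uint64_t sip_2u64(uint64_t a, uint64_t b,
				const uint64_t key[2], int with_length)
{
	uint64_t v[4] = {
		key[0] ^ 0x736f6d6570736575ULL,
		key[1] ^ 0x646f72616e646f6dULL,
		key[0] ^ 0x6c7967656e657261ULL,
		key[1] ^ 0x7465646279746573ULL,
	};
	int i;

	absorb(v, a);
	absorb(v, b);
	if (with_length)
		absorb(v, (uint64_t)16 << 56);

	/* Finalization: four rounds (the "4" in 2-4). */
	v[2] ^= 0xff;
	for (i = 0; i < 4; i++)
		sipround(v);
	return v[0] ^ v[1] ^ v[2] ^ v[3];
}
```

The two variants are different functions of the same key, which is why a key must never be shared between a fixed-size caller that drops the length word and any other caller.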
(This is crypto 101; search "MD-strengthening" or see the remark on
p. 101 of Damgaard's 1989 paper "A design principle for hash functions" at
http://saluc.engr.uconn.edu/refs/algorithms/hashalg/damgard89adesign.pdf
but I'm sure that Ted, Jean-Philippe, and/or DJB will confirm if you'd
like.)
Jason A. Donenfeld wrote:
> Oh, okay, that is exactly what I thought was going on. I just thought
> you were implying that jiffies could be moved inside the hash, which
> then confused my understanding of how things should be. In any case,
> thanks for the explanation.
No, the rekeying procedure is cleverer.
The thing is, all that matters is that the ISN increments fast enough,
but doesn't wrap too soon.
It *is* permitted to change the random base, as long as it only
increases, and slower than the timestamp does.
So what you do is every few minutes, you increment the high 4 bits of the
random base and change the key used to generate the low 28 bits.
The base used for any particular host might change from 0x10000000
to 0x2fffffff, or from 0x1fffffff to 0x20000000, but either way, it's
increasing, and not too fast.
This has the downside that an attacker can see 4 bits of the base,
so only needs to send 2^28 bytes = 256 MB to flood the connection,
but the upside that the key used to generate the low bits changes
faster than it can be broken.
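A rough sketch of that base computation, under loud assumptions: `toy_prf` is a non-cryptographic stand-in for the keyed hash (a real version would use SipHash), and `isn_base`, `epoch` and `epoch_key` are hypothetical names. Only the bit layout is the point here.

```c
#include <stdint.h>

/* Stand-in keyed mixer. NOT cryptographic; placeholder for SipHash. */
static inline uint32_t toy_prf(uint64_t key, uint64_t input)
{
	uint64_t x = key ^ input;

	x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
	x ^= x >> 27; x *= 0x94d049bb133111ebULL;
	x ^= x >> 31;
	return (uint32_t)x;
}

/*
 * epoch:     counter bumped every few minutes; only its low 4 bits are used
 * epoch_key: key in force for this epoch, replaced whenever epoch changes
 * endpoints: the connection identifiers, packed by the caller
 */
static inline uint32_t isn_base(uint32_t epoch, uint64_t epoch_key,
				uint64_t endpoints)
{
	uint32_t high = (epoch & 0xf) << 28;		/* visible to attackers */
	uint32_t low = toy_prf(epoch_key, endpoints) & 0x0fffffff;

	return high | low;
}
```

Whatever the hash returns, moving from one epoch to the next can only increase the base (until the 4-bit counter wraps), because the high nibble dominates the 28 hashed bits.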
* Re: [PATCH v6 3/5] random: use SipHash in place of MD5
From: Jason A. Donenfeld @ 2016-12-16 22:23 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Netdev, kernel-hardening@lists.openwall.com, LKML,
Linux Crypto Mailing List, David Laight, Ted Tso,
Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert,
George Spelvin, Vegard Nossum, Andi Kleen, David S. Miller,
Jean-Philippe Aumasson
In-Reply-To: <CALCETrX9v=Uwd1zZub=QpD73Lq0LM67NEi1qwqRUjtD5U1bHYw@mail.gmail.com>
Hi Andy,
> Agreed. A simpler construction would be:
>
> chaining++;
> output = H(chaining, secret);
>
> And this looks a whole lot like Ted's ChaCha20 construction.
In that simpler construction with counter-based secret rekeying, and in
Ted's ChaCha20 construction, the issue is that every X hits, there's a
call to get_random_bytes, which has variable performance and entropy
issues. With counter-based rekeying, if somebody runs
` :(){ :|:& };:`, ASLR ends up making repeated calls to
get_random_bytes, one every 128 or so process creations. With
time-based rekeying, the number of calls is bounded by the clock rather
than by the workload, so system performance does not suffer.
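A small sketch of the time-based check, assuming a monotonically increasing tick count. The struct, `maybe_rekey`, and the `fill_fn` callback (standing in for get_random_bytes) are hypothetical names for illustration; `counting_fill` is only a test double.

```c
#include <stdint.h>
#include <stdbool.h>

struct rekey_state {
	uint64_t key[2];
	uint64_t last_rekey;	/* tick count at the last rekey */
};

/* Hypothetical callback type standing in for get_random_bytes(). */
typedef void (*fill_fn)(uint64_t key[2]);

/* Test double: counts how often fresh key material is requested. */
static unsigned int fill_calls;
static inline void counting_fill(uint64_t key[2])
{
	fill_calls++;
	key[0] = fill_calls;
	key[1] = ~(uint64_t)fill_calls;
}

/*
 * Rekey only if at least `interval` ticks have passed. The cost of the
 * expensive call is bounded by elapsed time, not by the number of callers.
 */
static inline bool maybe_rekey(struct rekey_state *s, uint64_t now,
			       uint64_t interval, fill_fn fill)
{
	if (now - s->last_rekey < interval)
		return false;
	fill(s->key);
	s->last_rekey = now;
	return true;
}
```

However many times `maybe_rekey` is called inside one interval, the fill callback runs at most once.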
Jason
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jason A. Donenfeld @ 2016-12-16 22:18 UTC (permalink / raw)
To: George Spelvin
Cc: Theodore Ts'o, Andi Kleen, David Miller, David Laight,
Daniel J . Bernstein, Eric Biggers, Hannes Frederic Sowa,
Jean-Philippe Aumasson, kernel-hardening,
Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
Tom Herbert, Linus Torvalds, Vegard Nossum
In-Reply-To: <20161216221352.26899.qmail@ns.sciencehorizons.net>
On Fri, Dec 16, 2016 at 11:13 PM, George Spelvin
<linux@sciencehorizons.net> wrote:
> Remembering that on "real" machines it's full SipHash, then I'd say that
> 64-bit security + rekeying seems reasonable.
64-bit security for an RNG is not reasonable even with rekeying. No no
no. Considering we already have a massive speed-up here with the
secure version, there's zero reason to start weakening the security
because we're trigger happy with our benchmarks. No no no.
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Andy Lutomirski @ 2016-12-16 22:15 UTC (permalink / raw)
To: George Spelvin
Cc: Ted Ts'o, Andi Kleen, David S. Miller, David Laight,
D. J. Bernstein, Eric Biggers, Hannes Frederic Sowa,
Jason A. Donenfeld, Jean-Philippe Aumasson,
kernel-hardening@lists.openwall.com, Linux Crypto Mailing List,
linux-kernel@vger.kernel.org, Network Development, Tom Herbert,
Linus Torvalds, Vegard Nossum
In-Reply-To: <20161216221352.26899.qmail@ns.sciencehorizons.net>
On Fri, Dec 16, 2016 at 2:13 PM, George Spelvin
<linux@sciencehorizons.net> wrote:
>> What should we do with get_random_int() and get_random_long()? In
>> some cases it's being used in performance sensitive areas, and where
>> anti-DoS protection might be enough. In others, maybe not so much.
>
> This is tricky. The entire get_random_int() structure is an abuse of
> the hash function and will need to be thoroughly rethought to convert
> it to SipHash. Remember, SipHash's security goals are very different
> from MD5, so there's no obvious way to do the conversion.
>
> (It's *documented* as "not cryptographically secure", but we know
> where that goes.)
>
>> If we rekeyed the secret used by get_random_int() and
>> get_random_long() frequently (say, every minute or every 5 minutes),
>> would that be sufficient for current and future users of these
>> interfaces?
>
> Remembering that on "real" machines it's full SipHash, then I'd say that
> 64-bit security + rekeying seems reasonable.
>
> The question is, the idea has recently been floated to make hsiphash =
> SipHash-1-3 on 64-bit machines. Is *that* okay?
>
>
> The annoying thing about the currently proposed patch is that the *only*
> chaining is the returned value. What I'd *like* to do is the same
> pattern as we do with md5, and remember v[0..3] between invocations.
> But there's no partial SipHash primitive; we only get one word back.
>
> Even
> *chaining += ret = siphash_3u64(...)
>
> would be an improvement.
This is almost exactly what I suggested in my email on the other
thread from a few seconds ago :)
--Andy
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: George Spelvin @ 2016-12-16 22:13 UTC (permalink / raw)
To: linux, tytso
Cc: ak, davem, David.Laight, djb, ebiggers3, hannes, Jason,
jeanphilippe.aumasson, kernel-hardening, linux-crypto,
linux-kernel, luto, netdev, tom, torvalds, vegard.nossum
In-Reply-To: <20161216204358.nlwifgcqnu6pitxs@thunk.org>
> What should we do with get_random_int() and get_random_long()? In
> some cases it's being used in performance sensitive areas, and where
> anti-DoS protection might be enough. In others, maybe not so much.
This is tricky. The entire get_random_int() structure is an abuse of
the hash function and will need to be thoroughly rethought to convert
it to SipHash. Remember, SipHash's security goals are very different
from MD5, so there's no obvious way to do the conversion.
(It's *documented* as "not cryptographically secure", but we know
where that goes.)
> If we rekeyed the secret used by get_random_int() and
> get_random_long() frequently (say, every minute or every 5 minutes),
> would that be sufficient for current and future users of these
> interfaces?
Remembering that on "real" machines it's full SipHash, then I'd say that
64-bit security + rekeying seems reasonable.
The question is, the idea has recently been floated to make hsiphash =
SipHash-1-3 on 64-bit machines. Is *that* okay?
The annoying thing about the currently proposed patch is that the *only*
chaining is the returned value. What I'd *like* to do is the same
pattern as we do with md5, and remember v[0..3] between invocations.
But there's no partial SipHash primitive; we only get one word back.
Even
*chaining += ret = siphash_3u64(...)
would be an improvement.
Although we could do something like
c0 = chaining[0];
chaining[0] = c1 = chaining[1];
ret = hsiphash(c0, c1, ...)
chaining[1] = c0 + ret;
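Spelled out as compilable userspace C, that last fragment might look like the sketch below. `toy_prf` is a non-cryptographic stand-in for hsiphash and `next_output` is a hypothetical name; only the two-word state update is meant to match the fragment.

```c
#include <stdint.h>

static inline uint64_t mix64(uint64_t x)
{
	x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
	x ^= x >> 27; x *= 0x94d049bb133111ebULL;
	x ^= x >> 31;
	return x;
}

/* Stand-in for hsiphash(c0, c1, ...). NOT cryptographic. */
static inline uint64_t toy_prf(uint64_t key, uint64_t a, uint64_t b)
{
	return mix64(mix64(key ^ a) + b);
}

/*
 * Two-word chaining: the old second word shifts down, and the new second
 * word is the old first word plus the output just returned.
 */
static inline uint64_t next_output(uint64_t chaining[2], uint64_t key)
{
	uint64_t c0 = chaining[0];
	uint64_t c1 = chaining[1];
	uint64_t ret = toy_prf(key, c0, c1);

	chaining[0] = c1;
	chaining[1] = c0 + ret;
	return ret;
}
```

An observer who sees `ret` still lacks both chaining words, which is the improvement over storing the output directly as the only state.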
* Re: [PATCH v6 3/5] random: use SipHash in place of MD5
From: Andy Lutomirski @ 2016-12-16 22:13 UTC (permalink / raw)
To: Jason A. Donenfeld
Cc: Netdev, kernel-hardening@lists.openwall.com, LKML,
Linux Crypto Mailing List, David Laight, Ted Tso,
Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert,
George Spelvin, Vegard Nossum, Andi Kleen, David S. Miller,
Jean-Philippe Aumasson
In-Reply-To: <CAHmME9rbKi3O1SS89LRMEUeMdKyrdutXAfjb9QmW3KNoCuE-wg@mail.gmail.com>
On Fri, Dec 16, 2016 at 1:45 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> Hi Andy,
>
> On Fri, Dec 16, 2016 at 10:31 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> I think it would be nice to try to strengthen the PRNG construction.
>> FWIW, I'm not an expert in PRNGs, and there's fairly extensive
>> literature, but I can at least try.
>
> In an effort to keep this patchset initially as uncontroversial as
> possible, I kept the same construction as before and just swapped
> out slow MD5 for fast Siphash. Additionally, the function
> documentation says that it isn't cryptographically secure. But in the
> end I certainly agree with you; we should most definitely improve
> things, and seeing the eyeballs now on this series, I think we now
> might have a mandate to do so.
>
>> 1. A one-time leak of memory contents doesn't ruin security until
>> reboot. This is especially valuable across suspend and/or hibernation.
>
> Ted and I were discussing this in another thread, and it sounds like
> he wants the same thing. I'll add re-generation of the secret every
> once in a while. Perhaps time-based makes more sense than
> counter-based for rekeying frequency?
Counter may be faster -- you don't need to read a timer. Lots of CPUs
are surprisingly slow at timing. OTOH jiffies are fast.
>
>> 2. An attack with a low work factor (2^64?) shouldn't break the scheme
>> until reboot.
>
> It won't. The key is 128-bits.
Whoops, I thought the key was 64-bit...
>
>> This is effectively doing:
>>
>> output = H(prev_output, weak "entropy", per-boot secret);
>>
>> One unfortunate downside is that, if used in a context where an
>> attacker can see a single output, the attacker learns the chaining
>> value. If the attacker can guess the entropy, then, with 2^64 work,
>> they learn the secret, and they can predict future outputs.
>
> No, the secret is 128-bits, which isn't feasibly guessable. The secret
> also isn't part of the hash, but rather is the generator of the hash
> function. A keyed hash (a PRF) is a bit of a different construction
> than just hashing a secret value into a hash function.
I was thinking in the random oracle model, in which case the whole
function is just a PRF that takes a few input parameters, one of which
is a key.
>
>> Second, change the mode so that an attacker doesn't learn so much
>> internal state. For example:
>>
>> output = H(old_chain, entropy, secret);
>> new_chain = old_chain + entropy + output;
>
> Happy to make this change, with making the chaining value additive
> rather than a replacement.
>
> However, I'm not sure adding entropy to the new_chain makes a
> difference. That entropy is basically just the cycle count plus the
> jiffies count. If an attacker can already guess them, then adding them
> again to the chaining value doesn't really add much.
Agreed. A simpler construction would be:
chaining++;
output = H(chaining, secret);
And this looks a whole lot like Ted's ChaCha20 construction.
The benefit of my construction is that (in the random oracle model,
assuming my intuition is right), if an attacker misses ~128 samples
and entropy has at least one bit of independent min-entropy per
sample, then the attacker needs ~2^128 work to brute-force the
chaining value even if the attacker knew both the original chaining
value and the secret.
--Andy
* Re: [PATCH v6 3/5] random: use SipHash in place of MD5
From: Jason A. Donenfeld @ 2016-12-16 22:12 UTC (permalink / raw)
To: Andy Lutomirski, Ted Tso
Cc: Netdev, kernel-hardening@lists.openwall.com, LKML,
Linux Crypto Mailing List, David Laight, Hannes Frederic Sowa,
Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin,
Vegard Nossum, Andi Kleen, David S. Miller,
Jean-Philippe Aumasson
In-Reply-To: <CAHmME9rbKi3O1SS89LRMEUeMdKyrdutXAfjb9QmW3KNoCuE-wg@mail.gmail.com>
Hi Andy, Ted,
I've made the requested changes. Keys now rotate and are per-CPU
based. The chaining value is now additive instead of replacing.
DavidL suggested I lower the velocity of `git-send-email` triggers, so
if you'd like to take a look at this before I post v7, you can follow
along at my git tree here:
https://git.zx2c4.com/linux-dev/log/?h=siphash
Choose the commit entitled "random: use SipHash in place of MD5"
Thanks,
Jason
* Re: [PATCH v6 3/5] random: use SipHash in place of MD5
From: Jason A. Donenfeld @ 2016-12-16 21:45 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Netdev, kernel-hardening@lists.openwall.com, LKML,
Linux Crypto Mailing List, David Laight, Ted Tso,
Hannes Frederic Sowa, Linus Torvalds, Eric Biggers, Tom Herbert,
George Spelvin, Vegard Nossum, Andi Kleen, David S. Miller,
Jean-Philippe Aumasson
Hi Andy,
On Fri, Dec 16, 2016 at 10:31 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> I think it would be nice to try to strengthen the PRNG construction.
> FWIW, I'm not an expert in PRNGs, and there's fairly extensive
> literature, but I can at least try.
In an effort to keep this patchset initially as uncontroversial as
possible, I kept the same construction as before and just swapped
out slow MD5 for fast Siphash. Additionally, the function
documentation says that it isn't cryptographically secure. But in the
end I certainly agree with you; we should most definitely improve
things, and seeing the eyeballs now on this series, I think we now
might have a mandate to do so.
> 1. A one-time leak of memory contents doesn't ruin security until
> reboot. This is especially valuable across suspend and/or hibernation.
Ted and I were discussing this in another thread, and it sounds like
he wants the same thing. I'll add re-generation of the secret every
once in a while. Perhaps time-based makes more sense than
counter-based for rekeying frequency?
> 2. An attack with a low work factor (2^64?) shouldn't break the scheme
> until reboot.
It won't. The key is 128-bits.
> This is effectively doing:
>
> output = H(prev_output, weak "entropy", per-boot secret);
>
> One unfortunate downside is that, if used in a context where an
> attacker can see a single output, the attacker learns the chaining
> value. If the attacker can guess the entropy, then, with 2^64 work,
> they learn the secret, and they can predict future outputs.
No, the secret is 128-bits, which isn't feasibly guessable. The secret
also isn't part of the hash, but rather is the generator of the hash
function. A keyed hash (a PRF) is a bit of a different construction
than just hashing a secret value into a hash function.
> Second, change the mode so that an attacker doesn't learn so much
> internal state. For example:
>
> output = H(old_chain, entropy, secret);
> new_chain = old_chain + entropy + output;
Happy to make this change, with making the chaining value additive
rather than a replacement.
However, I'm not sure adding entropy to the new_chain makes a
difference. That entropy is basically just the cycle count plus the
jiffies count. If an attacker can already guess them, then adding them
again to the chaining value doesn't really add much.
Jason
* Re: [PATCH v6 3/5] random: use SipHash in place of MD5
From: Andy Lutomirski @ 2016-12-16 21:31 UTC (permalink / raw)
To: Jason A. Donenfeld
Cc: Netdev, kernel-hardening@lists.openwall.com, LKML, linux-crypto,
David Laight, Ted Tso, Hannes Frederic Sowa, Linus Torvalds,
Eric Biggers, Tom Herbert, George Spelvin, Vegard Nossum,
Andi Kleen, David S. Miller, Jean-Philippe Aumasson
In-Reply-To: <20161216030328.11602-4-Jason@zx2c4.com>
On Thu, Dec 15, 2016 at 7:03 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> -static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash)
> - __aligned(sizeof(unsigned long));
> +static DEFINE_PER_CPU(u64, get_random_int_chaining);
>
[...]
> unsigned long get_random_long(void)
> {
> - __u32 *hash;
> unsigned long ret;
> + u64 *chaining;
>
> if (arch_get_random_long(&ret))
> return ret;
>
> - hash = get_cpu_var(get_random_int_hash);
> -
> - hash[0] += current->pid + jiffies + random_get_entropy();
> - md5_transform(hash, random_int_secret);
> - ret = *(unsigned long *)hash;
> - put_cpu_var(get_random_int_hash);
> -
> + chaining = &get_cpu_var(get_random_int_chaining);
> + ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() +
> + current->pid, random_int_secret);
> + put_cpu_var(get_random_int_chaining);
> return ret;
> }
I think it would be nice to try to strengthen the PRNG construction.
FWIW, I'm not an expert in PRNGs, and there's fairly extensive
literature, but I can at least try. Here are some properties I'd
like:
1. A one-time leak of memory contents doesn't ruin security until
reboot. This is especially valuable across suspend and/or hibernation.
2. An attack with a low work factor (2^64?) shouldn't break the scheme
until reboot.
This is effectively doing:
output = H(prev_output, weak "entropy", per-boot secret);
One unfortunate downside is that, if used in a context where an
attacker can see a single output, the attacker learns the chaining
value. If the attacker can guess the entropy, then, with 2^64 work,
they learn the secret, and they can predict future outputs.
I would advocate adding two types of improvements. First, re-seed it
every now and then (every 128 calls?) by just replacing both the
chaining value and the percpu secret with fresh CSPRNG output.
Second, change the mode so that an attacker doesn't learn so much
internal state. For example:
output = H(old_chain, entropy, secret);
new_chain = old_chain + entropy + output;
This increases the effort needed to brute-force the internal state
from 2^64 to 2^128 (barring any weaknesses in the scheme).
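As a rough userspace illustration of both suggestions together, here is a sketch under explicit assumptions: `toy_prf` is a non-cryptographic stand-in for H, the secret is shrunk to one 64-bit word for brevity, and `struct weak_rng`, `weak_rng_next`, `weak_rng_needs_reseed` and `weak_rng_reseed` are hypothetical names. The fresh values passed to the reseed function would come from the real CSPRNG.

```c
#include <stdint.h>
#include <stdbool.h>

#define RESEED_INTERVAL 128	/* "every 128 calls?" from the text above */

static inline uint64_t mix64(uint64_t x)
{
	x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
	x ^= x >> 27; x *= 0x94d049bb133111ebULL;
	x ^= x >> 31;
	return x;
}

/* Stand-in for H(old_chain, entropy, secret). NOT cryptographic. */
static inline uint64_t toy_prf(uint64_t secret, uint64_t chain, uint64_t entropy)
{
	return mix64(mix64(secret ^ chain) + entropy);
}

struct weak_rng {
	uint64_t chain;
	uint64_t secret;	/* assumption: one word here, 128 bits in the patch */
	unsigned int calls;	/* outputs produced since the last reseed */
};

static inline bool weak_rng_needs_reseed(const struct weak_rng *r)
{
	return r->calls >= RESEED_INTERVAL;
}

/* Replace both the chaining value and the secret with fresh material. */
static inline void weak_rng_reseed(struct weak_rng *r, uint64_t fresh_chain,
				   uint64_t fresh_secret)
{
	r->chain = fresh_chain;
	r->secret = fresh_secret;
	r->calls = 0;
}

static inline uint64_t weak_rng_next(struct weak_rng *r, uint64_t entropy)
{
	uint64_t output = toy_prf(r->secret, r->chain, entropy);

	/* Additive update: the output alone no longer reveals the chain. */
	r->chain = r->chain + entropy + output;
	r->calls++;
	return output;
}
```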
Also, can we not call this get_random_int()? get_random_int() sounds
too much like get_random_bytes(), and the latter is intended to be a
real CSPRNG. Can we call it get_weak_random_int() or similar?
--Andy
* Re: [kernel-hardening] Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jason A. Donenfeld @ 2016-12-16 21:31 UTC (permalink / raw)
To: kernel-hardening
Cc: George Spelvin, Andi Kleen, David Miller, David Laight,
Daniel J . Bernstein, Eric Biggers, Hannes Frederic Sowa,
Jean-Philippe Aumasson, Linux Crypto Mailing List, LKML,
Andy Lutomirski, Netdev, Tom Herbert, Linus Torvalds,
Theodore Ts'o, Vegard Nossum
In-Reply-To: <20161216212528.26003.qmail@ns.sciencehorizons.net>
Hi George,
On Fri, Dec 16, 2016 at 10:25 PM, George Spelvin
<linux@sciencehorizons.net> wrote:
> But yes, the sequence number is supposed to be (random base) + (timestamp).
> In the old days before Canter & Siegel when the internet was a nice place,
> people just used a counter that started at boot time.
>
> But then someone observed that I can start a connection to host X,
> see the sequence number it gives back to me, and thereby learn the
> sequence number it's using on its connections to host Y.
>
> And I can use that to inject forged data into an X-to-Y connection,
> without ever seeing a single byte of the traffic! (If I *can* observe
> the traffic, of course, none of this makes the slightest difference.)
>
> So the random base was made a keyed hash of the endpoint identifiers.
> (Practically only the hosts matter, but generally the ports are thrown
> in for good measure.) That way, the ISN that host X sends to me
> tells me nothing about the ISN it's using to talk to host Y. Now the
> only way to inject forged data into the X-to-Y connection is to
> send 2^32 bytes, which is a little less practical.
Oh, okay, that is exactly what I thought was going on. I just thought
you were implying that jiffies could be moved inside the hash, which
then confused my understanding of how things should be. In any case,
thanks for the explanation.
Jason
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: George Spelvin @ 2016-12-16 21:25 UTC (permalink / raw)
To: Jason, linux
Cc: ak, davem, David.Laight, djb, ebiggers3, hannes,
jeanphilippe.aumasson, kernel-hardening, linux-crypto,
linux-kernel, luto, netdev, tom, torvalds, tytso, vegard.nossum
In-Reply-To: <CAHmME9q0LaxQ3uinzWyD1mDCpyeLw_2TEAN33T6dDrTKCuHs7g@mail.gmail.com>
Jason A. Donenfeld wrote:
> I saw that jiffies addition in there and was wondering what it was all
> about. It's currently added _after_ the siphash input, not before, to
> keep with how the old algorithm worked. I'm not sure if this is
> correct or if there's something wrong with that, as I haven't studied
> how it works. If that jiffies should be part of the siphash input and
> not added to the result, please tell me. Otherwise I'll keep things
> how they are to avoid breaking something that seems to be working.
Oh, geez, I didn't realize you didn't understand this code.
Full details at
https://en.wikipedia.org/wiki/TCP_sequence_prediction_attack
But yes, the sequence number is supposed to be (random base) + (timestamp).
In the old days before Canter & Siegel when the internet was a nice place,
people just used a counter that started at boot time.
But then someone observed that I can start a connection to host X,
see the sequence number it gives back to me, and thereby learn the
sequence number it's using on its connections to host Y.
And I can use that to inject forged data into an X-to-Y connection,
without ever seeing a single byte of the traffic! (If I *can* observe
the traffic, of course, none of this makes the slightest difference.)
So the random base was made a keyed hash of the endpoint identifiers.
(Practically only the hosts matter, but generally the ports are thrown
in for good measure.) That way, the ISN that host X sends to me
tells me nothing about the ISN it's using to talk to host Y. Now the
only way to inject forged data into the X-to-Y connection is to
send 2^32 bytes, which is a little less practical.
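In code, the (random base) + (timestamp) structure can be sketched as below. This is an illustration only: `toy_prf` is a non-cryptographic stand-in for the keyed hash, and `struct endpoints`, `tcp_isn` and the `clock` parameter are hypothetical names rather than the kernel's.

```c
#include <stdint.h>

static inline uint64_t mix64(uint64_t x)
{
	x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
	x ^= x >> 27; x *= 0x94d049bb133111ebULL;
	x ^= x >> 31;
	return x;
}

/* Stand-in keyed hash. NOT cryptographic; placeholder for SipHash. */
static inline uint64_t toy_prf(uint64_t key, uint64_t a, uint64_t b)
{
	return mix64(mix64(key ^ a) + b);
}

struct endpoints {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

/*
 * The timestamp is added *after* hashing, so for a fixed connection tuple
 * the ISN advances exactly as fast as the clock does.
 */
static inline uint32_t tcp_isn(const struct endpoints *ep, uint64_t key,
			       uint32_t clock)
{
	uint64_t addrs = ((uint64_t)ep->saddr << 32) | ep->daddr;
	uint64_t ports = ((uint64_t)ep->sport << 16) | ep->dport;
	uint32_t base = (uint32_t)toy_prf(key, addrs, ports);

	return base + clock;
}
```

If the clock were fed into the hash instead, successive ISNs for the same tuple would jump around arbitrarily rather than increase steadily.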
* Re: Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Hannes Frederic Sowa @ 2016-12-16 21:15 UTC (permalink / raw)
To: Jason A. Donenfeld, kernel-hardening, Theodore Ts'o,
George Spelvin, Andi Kleen, David Miller, David Laight,
Daniel J . Bernstein, Eric Biggers, Jean-Philippe Aumasson,
Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
Tom Herbert, Linus Torvalds, Vegard Nossum
In-Reply-To: <CAHmME9pz=syTiLXUsbXFyGdbGK6pxbnU+TVLDkbYiTa-9+sckQ@mail.gmail.com>
On Fri, Dec 16, 2016, at 22:01, Jason A. Donenfeld wrote:
> Yes, on x86-64. But on i386 chacha20 incurs nearly the same kind of
> slowdown as siphash, so I expect the comparison to be more or less
> equal. There's another thing I really didn't like about your chacha20
> approach which is that it uses the /dev/urandom pool, which means
> various things need to kick in in the background to refill this.
> Additionally, having to refill the buffered chacha output every 32 or
> so longs isn't nice. These things together make for inconsistent and
> hard to understand general operating system performance, because
> get_random_long is called at every process startup for ASLR. So, in
> the end, I believe there's another reason for going with the siphash
> approach: deterministic performance.
*Cough*, so from where do you generate your key for siphash if called
early from ASLR?
Bye,
Hannes
* Re: [kernel-hardening] Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jason A. Donenfeld @ 2016-12-16 21:09 UTC (permalink / raw)
To: Daniel Micay
Cc: kernel-hardening, Jean-Philippe Aumasson, George Spelvin,
Andi Kleen, David Miller, David Laight, Eric Biggers,
Hannes Frederic Sowa, Linux Crypto Mailing List, LKML,
Andy Lutomirski, Netdev, Linus Torvalds, Theodore Ts'o,
Vegard Nossum, Daniel J . Bernstein
In-Reply-To: <1481921067.1054.6.camel@gmail.com>
Hi Daniel,
On Fri, Dec 16, 2016 at 9:44 PM, Daniel Micay <danielmicay@gmail.com> wrote:
> On Fri, 2016-12-16 at 11:47 -0800, Tom Herbert wrote:
>>
>> That's about 3x of jhash speed (7 nsecs). So that might be closer
>> to a more palatable replacement for jhash. Do we lose any security
>> advantages with halfsiphash?
>
> Have you tested a lower round SipHash? Probably best to stick with the
> usual construction for non-DoS mitigation, but why not try SipHash 1-3,
> 1-2, etc. for DoS mitigation?
>
> Rust and Swift both went with SipHash 1-3 for hash tables.
Maybe not a bad idea.
SipHash2-4 for MD5 replacement, as we've done so far. This is when we
actually want things to be secure (and fast).
And then HalfSipHash1-3 for certain jhash replacements. This is for
when we're talking only about DoS or sort of just joking about
security, and want things to be very competitive with jhash. (Of
course for 64-bit we'd use SipHash1-3 instead of HalfSipHash for the
speedup.)
I need to think on this a bit more, but preliminarily, I guess this
would be maybe okay...
George, JP - what do you think?
Jason
* Re: Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jason A. Donenfeld @ 2016-12-16 21:01 UTC (permalink / raw)
To: kernel-hardening, Theodore Ts'o, George Spelvin, Jason,
Andi Kleen, David Miller, David Laight, Daniel J . Bernstein,
Eric Biggers, Hannes Frederic Sowa, Jean-Philippe Aumasson,
Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
Tom Herbert, Linus Torvalds, Vegard Nossum
Hi Ted,
On Fri, Dec 16, 2016 at 9:43 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> What should we do with get_random_int() and get_random_long()? In
> some cases it's being used in performance sensitive areas, and where
> anti-DoS protection might be enough. In others, maybe not so much.
>
> If we rekeyed the secret used by get_random_int() and
> get_random_long() frequently (say, every minute or every 5 minutes),
> would that be sufficient for current and future users of these
> interfaces?
get_random_int() and get_random_long() should quite clearly use
SipHash with its secure 128-bit key and not HalfSipHash with its
64-bit key. HalfSipHash is absolutely insufficient for this use case.
Remember, we're already an order of magnitude or more faster than
md5...
With regard to periodic rekeying... since the secret is 128-bits, I
believe this is likely sufficient for _not_ rekeying. There's also the
chaining variable, to tie together invocations of the function. If
you'd prefer, instead of the chaining variable, we could use some
siphash output to mutate the original key, but I don't think this
approach is actually better and might introduce vulnerabilities. In my
opinion chaining+128bitkey is sufficient. On the other hand, rekeying
every X minutes is 3 or 4 lines of code. If you want (just say so),
I'll add this to my next revision.
You asked about the security requirements of these functions. The
comment says they're not cryptographically secure. And right now with
MD5 they're not. So the expectations are pretty low. Moving to siphash
adds some cryptographic security, certainly. Moving to siphash plus
rekeying adds a bit more. Of course, on recent x86, RDRAND is used
instead, so the cryptographic strength then depends on the thickness
of your tinfoil hat. So probably we shouldn't change what we advertise
these functions provide, even though we're certainly improving them
performance-wise and security-wise.
> P.S. I'll note that my performance figures when testing changes to
> get_random_int() were done on a 32-bit x86; Jason, I'm guessing your
> figures were using a 64-bit x86 system? I haven't tried 32-bit ARM
> or smaller CPUs (e.g., mips, et al.) that might be more likely to be
> used on IoT devices, but I'm worried about those too, of course.
Yes, on x86-64. But on i386 chacha20 incurs nearly the same kind of
slowdown as siphash, so I expect the comparison to be more or less
equal. There's another thing I really didn't like about your chacha20
approach which is that it uses the /dev/urandom pool, which means
various things need to kick in in the background to refill this.
Additionally, having to refill the buffered chacha output every 32 or
so longs isn't nice. These things together make for inconsistent and
hard to understand general operating system performance, because
get_random_long is called at every process startup for ASLR. So, in
the end, I believe there's another reason for going with the siphash
approach: deterministic performance.
Jason
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Tom Herbert @ 2016-12-16 20:57 UTC (permalink / raw)
To: George Spelvin
Cc: Jason A. Donenfeld, Andi Kleen, David S. Miller, David Laight,
Daniel J . Bernstein, Eric Biggers, Hannes Frederic Sowa,
Jean-Philippe Aumasson, kernel-hardening,
Linux Crypto Mailing List, LKML, Andy Lutomirski,
Linux Kernel Network Developers, Linus Torvalds,
Theodore Ts'o, vegard.nossum
In-Reply-To: <20161216204128.25034.qmail@ns.sciencehorizons.net>
On Fri, Dec 16, 2016 at 12:41 PM, George Spelvin
<linux@sciencehorizons.net> wrote:
> Tom Herbert wrote:
>> Tested this. Distribution and avalanche effect are still good. Speed
>> wise I see about a 33% improvement over siphash (20 nsecs/op versus 32
>> nsecs). That's about 3x of jhash speed (7 nsecs). So that might be closer
>> to a more palatable replacement for jhash. Do we lose any security
>> advantages with halfsiphash?
>
> What are you testing on? And what input size? And does "33% improvement"
> mean 4/3 the rate and 3/4 the time? Or 2/3 the time and 3/2 the rate?
>
Sorry, that is over an IPv4 tuple. Intel(R) Xeon(R) CPU E5-2660 0 @
2.20GHz. I recoded the function I was using to look more like the 64-bit
version, and yes, it is indeed slower.
> These are very odd results. On a 64-bit machine, SipHash should be the
> same speed per round, and faster because it hashes more data per round.
> (Unless you're hitting some unexpected cache/decode effect due to REX
> prefixes.)
>
> On a 32-bit machine (other than ARM, where your results might make sense,
> or maybe if you're hashing large amounts of data), the difference should
> be larger.
>
> And yes, there is a *significant* security loss. SipHash is 128 bits
> ("don't worry about it"). hsiphash is 64 bits, which is known breakable
> ("worry about it"), so we have to do a careful analysis of the cost of
> a successful attack.
>
> As mentioned in the e-mails that just flew by, hsiphash is intended
> *only* for 32-bit machines which bog down on full SipHash. On all 64-bit
> machines, it will be implemented as an alias for SipHash and the security
> concerns will Just Go Away.
>
> The place where hsiphash is expected to make a big difference is 32-bit
> x86. If you only see 33% difference with "gcc -m32", I'm going to be
> very confused.
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jason A. Donenfeld @ 2016-12-16 20:49 UTC (permalink / raw)
To: George Spelvin
Cc: Andi Kleen, David Miller, David Laight, Daniel J . Bernstein,
Eric Biggers, Hannes Frederic Sowa, Jean-Philippe Aumasson,
kernel-hardening, Linux Crypto Mailing List, LKML,
Andy Lutomirski, Netdev, Tom Herbert, Linus Torvalds,
Theodore Ts'o, Vegard Nossum
On Fri, Dec 16, 2016 at 9:17 PM, George Spelvin
<linux@sciencehorizons.net> wrote:
> My (speaking generally; I should walk through every hash table you've
> converted) opinion is that:
>
> - Hash tables, even network-facing ones, can all use hsiphash as long
> as an attacker can only see collisions, i.e. ((H(x) ^ H(y)) & bits) ==
> 0, and the consequence of a successful attack is only more collisions
> (timing). While the attack is only 2x the cost (two hashes rather than
> one to test a key), the knowledge of the collision is statistical,
> especially for network attackers, which raises the cost of guessing
> beyond an even more brute-force attack.
> - When the hash value is directly visible (e.g. included in a network
> packet), full SipHash should be the default.
> - Syncookies *could* use hsiphash, especially as there are
> two keys in there. Not sure if we need the performance.
> - For TCP ISNs, I'd prefer to use full SipHash. I know this is
> a very hot path, and if that's a performance bottleneck,
> we can work harder on it.
>
> In particular, TCP ISNs *used* to rotate the key periodically,
> limiting the time available to an attacker to perform an
> attack before the secret goes stale and is useless. commit
> 6e5714eaf77d79ae1c8b47e3e040ff5411b717ec upgraded to md5 and dropped
> the key rotation.
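
The collision condition quoted above, ((H(x) ^ H(y)) & bits) == 0, can be
sketched as follows. This is a minimal illustration, not kernel code: keyed
BLAKE2b from Python's hashlib stands in for hsiphash, and the helper names
(keyed_hash64, bucket, collides) are hypothetical.

```python
import hashlib


def keyed_hash64(key: bytes, data: bytes) -> int:
    """Stand-in keyed 64-bit hash (BLAKE2b here, NOT SipHash/hsiphash)."""
    digest = hashlib.blake2b(data, digest_size=8, key=key).digest()
    return int.from_bytes(digest, "little")


def bucket(key: bytes, data: bytes, table_bits: int) -> int:
    """Bucket index: the low table_bits bits of the keyed hash."""
    return keyed_hash64(key, data) & ((1 << table_bits) - 1)


def collides(key: bytes, x: bytes, y: bytes, table_bits: int) -> bool:
    """True when x and y land in the same bucket.

    This is all an attacker who only observes collisions can learn:
    the masked XOR of the two hashes is zero.
    """
    mask = (1 << table_bits) - 1
    return ((keyed_hash64(key, x) ^ keyed_hash64(key, y)) & mask) == 0


key = bytes(range(16))  # illustrative 128-bit key
inputs = [i.to_bytes(4, "little") for i in range(64)]
pairs = [(x, y) for x in inputs for y in inputs]
same_bucket = sum(collides(key, x, y, 4) for x, y in pairs)
print(f"{same_bucket} of {len(pairs)} pairs share a bucket in a 16-bucket table")
```

Testing a key guess against such an observation takes two hash evaluations
per pair, which is the 2x cost mentioned in the quoted text.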
While I agree with this analysis for the most part, I do think we
should use SipHash and not HalfSipHash for syncookies. Although the
security risk is lower than with sequence numbers, syncookies
previously used full MD5, which means performance is not generally a
bottleneck and we'll get massive speedups no matter what, whether
using SipHash or HalfSipHash. In addition, using SipHash means that
the 128-bit key gives a larger margin and can be safe long term.
So, I think we should err on the side of caution and stick with
SipHash in all cases in which we're upgrading from MD5.
In other words, only current jhash users should be potentially
eligible for hsiphash.
> Current code uses a 64 ns tick for the ISN, so it counts roughly 2^24 per
> second. (32 bits wraps every 4.6 minutes.) A 4-bit counter and 28-bit hash
> (or even 3+29) would work as long as the key is regenerated no more
> than once per minute. (Just using the 4.6-minute ISN wrap time is the
> obvious simple implementation.)
>
> (Of course, I defer to DaveM's judgement on all network-related issues.)
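
The tick arithmetic in the quoted text can be checked with a few lines.
This only works through the numbers given there (a 64 ns tick and a 32-bit
ISN); the constant names are made up for illustration.

```python
import math

TICK_NS = 64    # ISN clock tick, as stated in the quoted text
ISN_BITS = 32   # width of a TCP sequence number

# Ticks per second, and how close that is to a power of two.
ticks_per_second = 1_000_000_000 / TICK_NS
log2_rate = math.log2(ticks_per_second)

# Time for a 32-bit tick counter to wrap.
wrap_seconds = (1 << ISN_BITS) * TICK_NS / 1_000_000_000
wrap_minutes = wrap_seconds / 60

print(f"{ticks_per_second:,.0f} ticks/s (2^{log2_rate:.2f})")
print(f"32-bit wrap after {wrap_seconds:.1f} s = {wrap_minutes:.2f} min")
```

The rate comes out slightly under 2^24 per second, and the wrap time agrees
with the 4.6 minutes quoted.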
I saw that jiffies addition in there and was wondering what it was all
about. It's currently added _after_ the siphash input, not before, to
keep with how the old algorithm worked. I'm not sure if this is
correct or if there's something wrong with that, as I haven't studied
how it works. If that jiffies should be part of the siphash input and
not added to the result, please tell me. Otherwise I'll keep things
how they are to avoid breaking something that seems to be working.
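
The two arrangements in question can be contrasted in a small sketch. This
is not the kernel code: keyed BLAKE2b from Python's hashlib stands in for
SipHash, the clock is a plain nanosecond count shifted down to 64 ns ticks,
and all function names are hypothetical.

```python
import hashlib

MASK32 = 0xFFFFFFFF


def keyed_hash32(key: bytes, data: bytes) -> int:
    """Stand-in keyed 32-bit hash (BLAKE2b here, NOT SipHash)."""
    digest = hashlib.blake2b(data, digest_size=4, key=key).digest()
    return int.from_bytes(digest, "little")


def isn_clock_added(key: bytes, conn_tuple: bytes, now_ns: int) -> int:
    """Hash only the connection tuple, then add the tick count to the result."""
    return (keyed_hash32(key, conn_tuple) + (now_ns >> 6)) & MASK32


def isn_clock_hashed(key: bytes, conn_tuple: bytes, now_ns: int) -> int:
    """Alternative: make the tick count part of the hash input."""
    ticks = (now_ns >> 6) & MASK32
    return keyed_hash32(key, conn_tuple + ticks.to_bytes(4, "little"))


key = bytes(range(16))                     # illustrative key
conn = b"10.0.0.1:1234->10.0.0.2:80"       # illustrative tuple encoding
t0 = 1_000_000_000
t1 = t0 + 10 * 64                          # ten ticks later

delta_added = (isn_clock_added(key, conn, t1) - isn_clock_added(key, conn, t0)) & MASK32
print(f"clock added to result: ISN advanced by {delta_added} ticks")
print(f"clock hashed: {isn_clock_hashed(key, conn, t0):#010x} -> "
      f"{isn_clock_hashed(key, conn, t1):#010x}")
```

With the clock added to the result, the ISN for a fixed connection tuple
advances in step with the clock (modulo 2^32), which matches the form
ISN = M + F(...) given in RFC 6528. With the clock inside the hash input,
successive ISNs for the same tuple have no such ordering.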