Linux cryptographic layer development


* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Tom Herbert @ 2016-12-16 19:47 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Jean-Philippe Aumasson, George Spelvin, Andi Kleen, David Miller,
	David Laight, Eric Biggers, Hannes Frederic Sowa,
	kernel-hardening, Linux Crypto Mailing List, LKML,
	Andy Lutomirski, Netdev, Linus Torvalds, Theodore Ts'o,
	vegard.nossum, Daniel J . Bernstein
In-Reply-To: <CAHmME9pjoAsoct1sVDpFFuqaqutv9X7DGJ5OBQXRAS57KFimUA@mail.gmail.com>

On Fri, Dec 16, 2016 at 4:39 AM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> Hey JP,
>
> On Fri, Dec 16, 2016 at 9:08 AM, Jean-Philippe Aumasson
> <jeanphilippe.aumasson@gmail.com> wrote:
>> Here's a tentative HalfSipHash:
>> https://github.com/veorq/SipHash/blob/halfsiphash/halfsiphash.c
>>
>> Haven't computed the cycle count nor measured its speed.
>
Tested this. Distribution and avalanche effect are still good. Speed-wise
I see about a 33% improvement over siphash (20 nsecs/op versus 32
nsecs). That's about 3x the cost of jhash (7 nsecs). So that might be
closer to a more palatable replacement for jhash. Do we lose any security
advantages with halfsiphash?

Tom

> This is incredible. Really. Wow!
>
> I'll integrate this into my patchset and will write up some
> documentation about when one should be used over the other.
>
> Thanks again. Quite exciting.
>
> Jason

^ permalink raw reply

* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jason A. Donenfeld @ 2016-12-16 18:00 UTC (permalink / raw)
  To: George Spelvin
  Cc: Jean-Philippe Aumasson, Andi Kleen, David Miller, David Laight,
	Daniel J . Bernstein, Eric Biggers, Hannes Frederic Sowa,
	kernel-hardening, Linux Crypto Mailing List, LKML,
	Andy Lutomirski, Netdev, Tom Herbert, Linus Torvalds,
	Theodore Ts'o, Vegard Nossum
In-Reply-To: <20161216173624.21544.qmail@ns.sciencehorizons.net>

Hi George,

On Fri, Dec 16, 2016 at 6:36 PM, George Spelvin
<linux@sciencehorizons.net> wrote:
> A 128-bit output option was added to SipHash after the initial publication;
> this is just the equivalent in 32-bit.
> Personally, I'd put in a comment saying that "there's a 64-bit output
> variant that's not implemented" and punt until someone finds a need.

That's a good way to think about it. Okay, I'll do precisely that.

> On a 64-bit machine, 64-bit SipHash is *always* faster than 32-bit, and
> should be used always.  Don't even compile the 32-bit code, to prevent
> anyone accidentally using it, and make hsiphash an alias for siphash.

Fascinating! Okay. So I'll alias hsiphash to siphash on 64-bit then. I
like this arrangement.


> Fortunately, the cost of brute-forcing hash functions can be fairly
> exactly quantified, thanks to bitcoin miners.  It currently takes 2^70
> hashes to create one bitcoin block, worth 25 bitcoins ($19,500).  Thus,
> 2^63 hashes cost $152.
>
> Now, there are two factors that must be considered:
> - That's a very very "wholesale" rate.  That's assuming you're doing
>   large numbers of these and can put in the up-front effort designing
>   silicon ASICs to do the attack.
> - That's for a more difficult hash (double sha-256) than SipHash.
>   That's a constant factor, but a pretty significant one.  If the wholesale
>   assumption holds, that might bring the cost down another 6 or 7 bits,
>   to $1-2 per break.
>
> If you're not the NSA and limited to general-purpose silicon, let's
> assume a state of the art GPU (Radeon HD 7970; AMD GPUs seem to do better
> than nVidia).  The bitcoin mining rate for those is about 700M/second,
> 29.4 bits.  So 63 bits is 152502 GPU-days, divided by some factor
> to account for SipHash's high speed compared to two rounds of SHA-2.
> Call it 1000 GPU-days.
>
> It's very doable, but also very non-trivial.  The question is, wouldn't
> it be cheaper and easier just to do a brute-force flooding DDoS?
>
> (This is why I wish the key size could be tweaked up to 80 bits.
> That would take all these numbers out of the reasonable range.)

That's a nice analysis. Might one conclude from that that hsiphash is
not useful for our purposes? Or does it still remain useful for
network facing code?

> Let me consider your second example above, "secure against local users".
> I should dig through your patchset and find the details, but what exactly
> are the consequences of such an attack?  Hasn't a local user already
> got much better ways to DoS the system?

For example, an unpriv'd user putting lots of entries in one hash
bucket for a shared resource that's used by root, like filesystems or
other lookup tables. If he can cause root to use more of root's cpu
schedule budget than otherwise in a directed way, then that's a bad
DoS.

> The thing to remember is that we're worried only about the combination
> of a *new* Linux kernel (new build or under active maintenance) and a
> 32-bit host.  You'd be hard-pressed to find a *single* machine fitting
> that description which is hosting multiple users or VMs and is not 64-bit.
>
> These days, 32-bit CPUs are for embedded applications: network appliances,
> TVs, etc.  That means basically single-user.  Even phones are 64 bit.
> Is this really a threat that needs to be defended against?

I interpret this to indicate all the more reason to alias hsiphash to
siphash on 64-bit, and then the problem space collapses in a clear
way.

> For your first case, network applications, the additional security
> is definitely attractive.  Syncookies are only a DoS, but sequence
> numbers are a real security issue; they can let you inject data into a
> TCP connection.
> With sequence numbers, large amounts (32 bits) of the hash output are
> directly observable.

Right. Hence the need for always using full siphash and not hsiphash
for sequence numbers, per my earlier email to David.

>
> I wish we could get away with 64-bit security, but given that the
> modern internet involves attacks from NSA/Spetssvyaz/3PLA, I agree
> it's just not enough.

I take this comment to be relevant for the sequence number case.

For hashtables and hashtable flooding, is it still your opinion that
we will benefit from hsiphash? Or is this final conclusion a rejection
of hsiphash for that too? We're talking about two different use cases,
and your email kind of interleaved both into your analysis, so I'm not
certain as to precisely what your conclusion is for each use case. Can
you clear up the ambiguity?

Jason

^ permalink raw reply

* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: George Spelvin @ 2016-12-16 17:36 UTC (permalink / raw)
  To: Jason, jeanphilippe.aumasson
  Cc: ak, davem, David.Laight, djb, ebiggers3, hannes, kernel-hardening,
	linux-crypto, linux-kernel, linux, luto, netdev, tom, torvalds,
	tytso, vegard.nossum
In-Reply-To: <CAHmME9rxCYfwyF6EADWqpAEt+yqCPgCLUVH0FPdAy7r-oPnrRg@mail.gmail.com>

> It appears that hsiphash can produce either 32-bit output or 64-bit
> output, with the output length parameter as part of the hash algorithm
> in there. When I code this for my kernel patchset, I very likely will
> only implement one output length size. Right now I'm leaning toward
> 32-bit.

A 128-bit output option was added to SipHash after the initial publication;
this is just the equivalent in 32-bit.

> - Is this a reasonable choice?

Yes.

> - Are there reasons why hsiphash with 64-bit output would be
>   reasonable? Or will we be fine sticking with 32-bit output only?

Personally, I'd put in a comment saying that "there's a 64-bit output
variant that's not implemented" and punt until someone finds a need.

> With both hsiphash and siphash, the division of usage will probably become:
> - Use 64-bit output 128-bit key siphash for keyed RNG-like things,
>   such as syncookies and sequence numbers
> - Use 64-bit output 128-bit key siphash for hashtables that must
>   absolutely be secure to an extremely high bandwidth attacker, such as
>   userspace directly DoSing a kernel hashtable
> - Use 32-bit output 64-bit key hsiphash for quick hashtable functions
>   that still must be secure but do not require as large of a security
>   margin.

On a 64-bit machine, 64-bit SipHash is *always* faster than 32-bit, and
should be used always.  Don't even compile the 32-bit code, to prevent
anyone accidentally using it, and make hsiphash an alias for siphash.

On a 32-bit machine, it's a much trickier case.  I'd be tempted to
use the 32-bit code always, but it needs examination.

Fortunately, the cost of brute-forcing hash functions can be fairly
exactly quantified, thanks to bitcoin miners.  It currently takes 2^70
hashes to create one bitcoin block, worth 25 bitcoins ($19,500).  Thus,
2^63 hashes cost $152.

Now, there are two factors that must be considered:
- That's a very very "wholesale" rate.  That's assuming you're doing
  large numbers of these and can put in the up-front effort designing
  silicon ASICs to do the attack.
- That's for a more difficult hash (double sha-256) than SipHash.
  That's a constant factor, but a pretty significant one.  If the wholesale
  assumption holds, that might bring the cost down another 6 or 7 bits,
  to $1-2 per break.

If you're not the NSA and limited to general-purpose silicon, let's
assume a state of the art GPU (Radeon HD 7970; AMD GPUs seem to do better
than nVidia).  The bitcoin mining rate for those is about 700M/second,
29.4 bits.  So 63 bits is 152502 GPU-days, divided by some factor
to account for SipHash's high speed compared to two rounds of SHA-2.
Call it 1000 GPU-days.
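
The arithmetic behind these estimates can be reproduced with a short
sketch. The constants are the figures quoted in this message (2^70 hashes
per block, $19,500 per block, 700M hashes/second per GPU); the function
names are made up for illustration, and the speed-up factor for SipHash
relative to double SHA-256 is deliberately left out because the message
only guesses at it.

```c
#include <stdint.h>

/* Figures quoted in the message; 2016 estimates, not measurements. */
#define BLOCK_HASHES_LOG2   70u       /* ~2^70 hashes per bitcoin block */
#define BLOCK_VALUE_USD     19500.0   /* value of one block */
#define GPU_HASHES_PER_SEC  700e6     /* quoted GPU mining rate */
#define SECONDS_PER_DAY     86400.0

/* Dollar cost of 2^bits hashes at the "wholesale" mining rate.
 * Only valid for bits <= BLOCK_HASHES_LOG2 and a shift below 64. */
double wholesale_cost_usd(unsigned bits)
{
	return BLOCK_VALUE_USD / (double)(1ULL << (BLOCK_HASHES_LOG2 - bits));
}

/* GPU-days needed to compute 2^bits hashes at the quoted GPU rate.
 * Only valid for bits < 64. */
double gpu_days(unsigned bits)
{
	return (double)(1ULL << bits) / GPU_HASHES_PER_SEC / SECONDS_PER_DAY;
}
```

Calling wholesale_cost_usd(63) gives roughly 152 dollars and gpu_days(63)
roughly 152,500 days, matching the numbers in the text before any
adjustment for SipHash being cheaper than double SHA-256.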

It's very doable, but also very non-trivial.  The question is, wouldn't
it be cheaper and easier just to do a brute-force flooding DDoS?

(This is why I wish the key size could be tweaked up to 80 bits.
That would take all these numbers out of the reasonable range.)

Let me consider your second example above, "secure against local users".
I should dig through your patchset and find the details, but what exactly
are the consequences of such an attack?  Hasn't a local user already
got much better ways to DoS the system?

The thing to remember is that we're worried only about the combination
of a *new* Linux kernel (new build or under active maintenance) and a
32-bit host.  You'd be hard-pressed to find a *single* machine fitting
that description which is hosting multiple users or VMs and is not 64-bit.

These days, 32-bit CPUs are for embedded applications: network appliances,
TVs, etc.  That means basically single-user.  Even phones are 64 bit.
Is this really a threat that needs to be defended against?

For your first case, network applications, the additional security
is definitely attractive.  Syncookies are only a DoS, but sequence
numbers are a real security issue; they can let you inject data into a
TCP connection.

Hash tables are much harder to attack.  The information you get back from
timing probes is statistical, and thus testing a key is more expensive.
With sequence numbers, large amounts (32 bits) of the hash output are
directly observable.

I wish we could get away with 64-bit security, but given that the
modern internet involves attacks from NSA/Spetssvyaz/3PLA, I agree
it's just not enough.

^ permalink raw reply

* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jason A. Donenfeld @ 2016-12-16 17:09 UTC (permalink / raw)
  To: David Laight
  Cc: George Spelvin, ak@linux.intel.com, davem@davemloft.net,
	ebiggers3@gmail.com, hannes@stressinduktion.org,
	jeanphilippe.aumasson@gmail.com,
	kernel-hardening@lists.openwall.com, linux-crypto@vger.kernel.org,
	linux-kernel@vger.kernel.org, luto@amacapital.net,
	netdev@vger.kernel.org, tom@herbertland.com,
	torvalds@linux-foundation.org, tytso@mit.edu,
	vegard.nossum@gmail.com, djb@cr.yp.to
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6DB0241238@AcuExch.aculab.com>

Hi David,

On Fri, Dec 16, 2016 at 6:06 PM, David Laight <David.Laight@aculab.com> wrote:
> A 32bit hash would also remove all the issues about the alignment
> of IP addresses (etc) on 64bit systems.

The current replacements of md5_transform with siphash in the v6 patch
series will continue to use the original siphash, since the 128-bit
key is rather important for these kinds of secrets. Additionally,
64-bit siphash is already faster than the md5_transform that it
replaces. So the alignment concerns (now, non-issues; problems have
been solved, I believe) still remain.

Jason

^ permalink raw reply

* RE: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: David Laight @ 2016-12-16 17:06 UTC (permalink / raw)
  To: 'George Spelvin', ak@linux.intel.com, davem@davemloft.net,
	ebiggers3@gmail.com, hannes@stressinduktion.org, Jason@zx2c4.com,
	jeanphilippe.aumasson@gmail.com,
	kernel-hardening@lists.openwall.com, linux-crypto@vger.kernel.org,
	linux-kernel@vger.kernel.org, luto@amacapital.net,
	netdev@vger.kernel.org, tom@herbertland.com,
	torvalds@linux-foundation.org, "tytso@mit.edu" <tytso
  Cc: djb@cr.yp.to
In-Reply-To: <20161215232840.22459.qmail@ns.sciencehorizons.net>

From: George Spelvin
> Sent: 15 December 2016 23:29
> > If a halved version of SipHash can bring significant performance boost
> > (with 32b words instead of 64b words) with an acceptable security level
> > (64-bit enough?) then we may design such a version.
> 
> I was thinking if the key could be pushed to 80 bits, that would be nice,
> but honestly 64 bits is fine.  This is DoS protection, and while it's
> possible to brute-force a 64 bit secret, there are more effective (DDoS)
> attacks possible for the same cost.

A 32bit hash would also remove all the issues about the alignment
of IP addresses (etc) on 64bit systems.

> (I'd suggest a name of "HalfSipHash" to convey the reduced security
> effectively.)
> 
> > Regarding output size, are 64 bits sufficient?
> 
> As a replacement for jhash, 32 bits are sufficient.  It's for
> indexing an in-memory hash table on a 32-bit machine.

It is also worth remembering that if the intent is to generate
a hash table index (not a unique fingerprint) you will always
get collisions on the final value.
Randomness could still give overlong hash chains - which might
still need rehashing with a different key.

	David

^ permalink raw reply

* Re: [PATCH v5 3/4] secure_seq: use SipHash in place of MD5
From: Jason A. Donenfeld @ 2016-12-16 15:57 UTC (permalink / raw)
  To: David Laight
  Cc: Netdev, kernel-hardening@lists.openwall.com, LKML,
	linux-crypto@vger.kernel.org, Ted Tso, Hannes Frederic Sowa,
	Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin,
	Vegard Nossum, ak@linux.intel.com, davem@davemloft.net,
	luto@amacapital.net
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6DB0240E66@AcuExch.aculab.com>

Hi David,

On Fri, Dec 16, 2016 at 10:59 AM, David Laight <David.Laight@aculab.com> wrote:
> You are still putting over-aligned data on stack.
> You only need to align it to the alignment of u64 (not the size of u64).
> If an on-stack item has a stronger alignment requirement than the stack
> then gcc has to generate two stack frames for the function.

Yesterday, folks were saying that sometimes 32-bit platforms need
8-byte alignment for certain 64-bit operations, so I shouldn't fall
back to 4-byte alignment there. But actually, looking at this more
closely, I can just make SIPHASH_ALIGNMENT == __alignof__(u64), which
will take care of all possible concerns, since gcc knows best which
platforms need what alignment. Thanks for making this clear to me with
"the alignment of u64 (not the size of u64)".

> Oh - and wait a bit longer between revisions.

Okay. We can be turtles.

Jason

^ permalink raw reply

* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jason A. Donenfeld @ 2016-12-16 15:51 UTC (permalink / raw)
  To: Jean-Philippe Aumasson
  Cc: George Spelvin, Andi Kleen, David Miller, David Laight,
	Eric Biggers, Hannes Frederic Sowa, kernel-hardening,
	Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
	Tom Herbert, Linus Torvalds, Theodore Ts'o, vegard.nossum,
	Daniel J . Bernstein
In-Reply-To: <CAGiyFddB_HT3H2yhYQ5rprYZ487rJ4iCaH9uPJQD57hiPbn9ng@mail.gmail.com>

Hi JP & George,

My function names:
- SipHash -> siphash
- HalfSipHash -> hsiphash

It appears that hsiphash can produce either 32-bit output or 64-bit
output, with the output length parameter as part of the hash algorithm
in there. When I code this for my kernel patchset, I very likely will
only implement one output length size. Right now I'm leaning toward
32-bit. Questions:

- Is this a reasonable choice?
- When hsiphash is desired due to its faster speed, are there any
circumstances in which producing a 64-bit output would actually be
useful? Namely, are there any hashtables that could benefit from a
64-bit function?
- Are there reasons why hsiphash with 64-bit output would be
reasonable? Or will we be fine sticking with 32-bit output only?

With both hsiphash and siphash, the division of usage will probably become:
- Use 64-bit output 128-bit key siphash for keyed RNG-like things,
such as syncookies and sequence numbers
- Use 64-bit output 128-bit key siphash for hashtables that must
absolutely be secure to an extremely high bandwidth attacker, such as
userspace directly DoSing a kernel hashtable
- Use 32-bit output 64-bit key hsiphash for quick hashtable functions
that still must be secure but do not require as large of a security
margin

Sound good?

Jason

^ permalink raw reply

* RE: [PATCH v5 2/4] siphash: add Nu{32,64} helpers
From: George Spelvin @ 2016-12-16 15:44 UTC (permalink / raw)
  To: ak, davem, David.Laight, ebiggers3, hannes, Jason,
	kernel-hardening, linux-crypto, linux-kernel, linux, luto, netdev,
	tom, torvalds, tytso, vegard.nossum
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6DB0240EFA@AcuExch.aculab.com>

David Laight wrote:
> Isn't that equivalent to:
>	v0 = key[0];
>	v1 = key[1];
>	v2 = key[0] ^ (0x736f6d6570736575ULL ^ 0x646f72616e646f6dULL);
>	v3 = key[1] ^ (0x646f72616e646f6dULL ^ 0x7465646279746573ULL);

(Pre-XORing key[] with the first two constants which, if the constants
are random in the first place, can be a no-op.)  Other than the typo
in the v2 line, yes.  If the key is non-public, then you can xor an
arbitrary constant in to both halves to slightly speed up the startup.

(Nits: There's a typo in the v2 line, you don't need to parenthesize
associative operators like xor, and the "ull" suffix is redundant here.)
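
The equivalence can be checked mechanically. This is a minimal sketch,
assuming the four constants from the PREAMBLE macro in the patch; the
struct and function names are invented for illustration. Note that the
v2 line here uses the xor of the first and third constants, which is
what the algebra requires and is presumably what the typo remark refers
to.

```c
#include <stdint.h>

/* SipHash initialization constants, as in the PREAMBLE macro. */
#define C0 0x736f6d6570736575ULL
#define C1 0x646f72616e646f6dULL
#define C2 0x6c7967656e657261ULL
#define C3 0x7465646279746573ULL

struct sip_state {
	uint64_t v0, v1, v2, v3;
};

/* Initialization as written in the patch. */
struct sip_state init_standard(const uint64_t key[2])
{
	struct sip_state s = {
		C0 ^ key[0], C1 ^ key[1], C2 ^ key[0], C3 ^ key[1]
	};
	return s;
}

/* Initialization from a key that already has C0 and C1 xored in,
 * i.e. pkey[0] = key[0] ^ C0 and pkey[1] = key[1] ^ C1. */
struct sip_state init_prexored(const uint64_t pkey[2])
{
	struct sip_state s = {
		pkey[0], pkey[1], pkey[0] ^ C0 ^ C2, pkey[1] ^ C1 ^ C3
	};
	return s;
}
```

For any key, init_standard(key) and init_prexored of the pre-xored key
produce the same four state words, so the two forms differ only in where
the xor with the constants happens.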

> Those constants also look like ASCII strings.

They are.  The ASCII is "somepseudorandomlygeneratedbytes".

> What cryptographic analysis has been done on the values?

They're "nothing up my sleeve numbers".

They're arbitrary numbers, and almost any other values would do exactly
as well.  The main properties are:

1) They're different (particularly v0 != v2 and v1 != v3), and
2) Neither they, nor their xor, is rotationally symmetric like 0x55555555.
   (Because SipHash is mostly rotationally symmetric, broken only by the
   interruption of the carry chain at the msbit, it helps slightly
   to break this up at the beginning.)

Those exact values only matter for portability.  If you don't need anyone
else to be able to compute matching outputs, then you could use any other
convenient constants (like the MD5 round constants).
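
Both properties, and the ASCII reading, are easy to verify. The sketch
assumes the four 64-bit constants from the patch; the helper names are
hypothetical, and checking every pairwise xor is a stricter reading of
property 2 than the text strictly requires.

```c
#include <stdint.h>

static const uint64_t sip_consts[4] = {
	0x736f6d6570736575ULL, 0x646f72616e646f6dULL,
	0x6c7967656e657261ULL, 0x7465646279746573ULL,
};

/* Spell out the constants as ASCII, most significant byte first. */
void sip_consts_ascii(char out[33])
{
	for (int i = 0; i < 4; i++)
		for (int j = 0; j < 8; j++)
			out[i * 8 + j] =
				(char)((sip_consts[i] >> (56 - 8 * j)) & 0xff);
	out[32] = '\0';
}

/* Nonzero if x equals itself rotated left by some amount in 1..63. */
int rotationally_symmetric(uint64_t x)
{
	for (unsigned r = 1; r < 64; r++)
		if (((x << r) | (x >> (64 - r))) == x)
			return 1;
	return 0;
}

/* Nonzero if the constants are pairwise different and neither they nor
 * any pairwise xor is rotationally symmetric. */
int sip_consts_ok(void)
{
	for (int i = 0; i < 4; i++) {
		if (rotationally_symmetric(sip_consts[i]))
			return 0;
		for (int j = i + 1; j < 4; j++) {
			if (sip_consts[i] == sip_consts[j])
				return 0;
			if (rotationally_symmetric(sip_consts[i] ^ sip_consts[j]))
				return 0;
		}
	}
	return 1;
}
```

A value such as 0x5555555555555555 is reported as symmetric, while the
four constants and their xors are not.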

^ permalink raw reply

* Re: algif for compression?
From: abed mohammad kamaluddin @ 2016-12-16 14:11 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-crypto
In-Reply-To: <20161210081014.GA32746@gondor.apana.org.au>

>
> The compression interface is currently in a state of flux.  We
> should make it settle down first before exporting it to user-space.
>
> For a start it would be good to actually switch IPsec/IPcomp over
> to the new compression interface.

Thanks Herbert. Are there timelines or ongoing efforts for moving
IPcomp/IPsec to use acomp? Or any proposals that have been or need to
be taken up in this regard?

Thanks,
Abed

^ permalink raw reply

* Re: [PATCH] crypto: AF_ALG - fix memory management of aio with multiple iocbs
From: Stephan Müller @ 2016-12-16 13:58 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-crypto
In-Reply-To: <20161216123127.GA20406@gondor.apana.org.au>

On Friday, 16 December 2016, 20:31:27 CET, Herbert Xu wrote:

Hi Herbert,
> > 
> > You are right, this will introduce a memleak. But with the immediate
> > freeing of sreq->tsg in the current code, the AIO interface cannot
> > support multiple IOCBs.
> > 
> > Thus, the entire memory handling in the AIO case seems broken.
> 
> Right, but can we please fix it properly? For example, you could
> save the original tsg in a new field and free that when you are
> done.

Absolutely, I concur. I will work on that.

Ciao
Stephan

^ permalink raw reply

* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jean-Philippe Aumasson @ 2016-12-16 13:22 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: George Spelvin, Andi Kleen, David Miller, David Laight,
	Eric Biggers, Hannes Frederic Sowa, kernel-hardening,
	Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
	Tom Herbert, Linus Torvalds, Theodore Ts'o, vegard.nossum,
	Daniel J . Bernstein
In-Reply-To: <CAHmME9pjoAsoct1sVDpFFuqaqutv9X7DGJ5OBQXRAS57KFimUA@mail.gmail.com>

It needs some basic security review, which I'll try to do next week (check for
security margin, optimality of rotation counts, etc.). But after a lot of
experience with this kind of construction (BLAKE, SipHash, NORX), I'm
confident it will be safe as it is.



On Fri, Dec 16, 2016 at 1:44 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:

> Hey JP,
>
> On Fri, Dec 16, 2016 at 9:08 AM, Jean-Philippe Aumasson
> <jeanphilippe.aumasson@gmail.com> wrote:
> > Here's a tentative HalfSipHash:
> > https://github.com/veorq/SipHash/blob/halfsiphash/halfsiphash.c
> >
> > Haven't computed the cycle count nor measured its speed.
>
> This is incredible. Really. Wow!
>
> I'll integrate this into my patchset and will write up some
> documentation about when one should be used over the other.
>
> Thanks again. Quite exciting.
>
> Jason
>

^ permalink raw reply

* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jason A. Donenfeld @ 2016-12-16 12:39 UTC (permalink / raw)
  To: Jean-Philippe Aumasson
  Cc: George Spelvin, Andi Kleen, David Miller, David Laight,
	Eric Biggers, Hannes Frederic Sowa, kernel-hardening,
	Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
	Tom Herbert, Linus Torvalds, Theodore Ts'o, vegard.nossum,
	Daniel J . Bernstein
In-Reply-To: <CAGiyFdd6_LVzUUfFcaqMyub1c2WPvWUzAQDCH+Aza-_t6mvmXg@mail.gmail.com>

Hey JP,

On Fri, Dec 16, 2016 at 9:08 AM, Jean-Philippe Aumasson
<jeanphilippe.aumasson@gmail.com> wrote:
> Here's a tentative HalfSipHash:
> https://github.com/veorq/SipHash/blob/halfsiphash/halfsiphash.c
>
> Haven't computed the cycle count nor measured its speed.

This is incredible. Really. Wow!

I'll integrate this into my patchset and will write up some
documentation about when one should be used over the other.

Thanks again. Quite exciting.

Jason

^ permalink raw reply

* Re: [PATCH] crypto: AF_ALG - fix memory management of aio with multiple iocbs
From: Herbert Xu @ 2016-12-16 12:31 UTC (permalink / raw)
  To: Stephan Müller; +Cc: linux-crypto
In-Reply-To: <2916745.RxhEJByb92@tauon.atsec.com>

On Fri, Dec 16, 2016 at 01:27:50PM +0100, Stephan Müller wrote:
> On Friday, 16 December 2016, 19:54:36 CET, Herbert Xu wrote:
> 
> Hi Herbert,
> 
> > On Tue, Dec 13, 2016 at 09:42:45PM +0100, Stephan Müller wrote:
> > > +		/*
> > > +		 * The async operation may have processed only a subset of
> > > +		 * the data that was initially received from the caller.
> > > +		 * Thus, we only can release the data that a cipher operation
> > > +		 * processed.
> > > +		 */
> > > +		if (len < sg->length) {
> > > +			/* ensure that empty SGLs are not referenced any more */
> > > +			sreq->tsg = sg;
> > 
> > Hmm if you change sreq->tsg how is the original tsg ever going to
> > get freed?
> 
> You are right, this will introduce a memleak. But with the immediate freeing 
> of sreq->tsg in the current code, the AIO interface cannot support multiple 
> IOCBs.
> 
> Thus, the entire memory handling in the AIO case seems broken.

Right, but can we please fix it properly? For example, you could
save the original tsg in a new field and free that when you are
done.
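
A userspace sketch of that idea, under heavy assumptions: the types and
names (sg_entry, aio_req, tsg_orig) are simplified stand-ins invented
for illustration and are not the AF_ALG structures. The point is only
that the cursor used for processing and the pointer used for freeing are
kept separately, so advancing one cannot leak the other.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for a scatterlist entry. */
struct sg_entry {
	size_t offset;
	size_t length;
};

struct aio_req {
	struct sg_entry *tsg;       /* cursor: first entry with unprocessed data */
	struct sg_entry *tsg_orig;  /* original allocation, kept only for freeing */
	size_t nents;
};

/* Allocate nents entries of len_each bytes; returns 0 on success. */
int aio_req_init(struct aio_req *r, size_t nents, size_t len_each)
{
	r->tsg_orig = calloc(nents, sizeof(*r->tsg_orig));
	if (!r->tsg_orig)
		return -1;
	for (size_t i = 0; i < nents; i++)
		r->tsg_orig[i].length = len_each;
	r->tsg = r->tsg_orig;
	r->nents = nents;
	return 0;
}

/* Mark len bytes as processed: advance the cursor, never the allocation. */
void aio_req_consume(struct aio_req *r, size_t len)
{
	struct sg_entry *end = r->tsg_orig + r->nents;

	while (r->tsg < end && len >= r->tsg->length) {
		len -= r->tsg->length;
		r->tsg->length = 0;
		r->tsg++;
	}
	if (r->tsg < end && len > 0) {
		r->tsg->offset += len;
		r->tsg->length -= len;
	}
}

/* Free through the saved pointer, regardless of where the cursor is. */
void aio_req_free(struct aio_req *r)
{
	free(r->tsg_orig);
	r->tsg = NULL;
	r->tsg_orig = NULL;
}
```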

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH] crypto: AF_ALG - fix memory management of aio with multiple iocbs
From: Stephan Müller @ 2016-12-16 12:27 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-crypto
In-Reply-To: <20161216115436.GA19917@gondor.apana.org.au>

On Friday, 16 December 2016, 19:54:36 CET, Herbert Xu wrote:

Hi Herbert,

> On Tue, Dec 13, 2016 at 09:42:45PM +0100, Stephan Müller wrote:
> > +		/*
> > +		 * The async operation may have processed only a subset of
> > +		 * the data that was initially received from the caller.
> > +		 * Thus, we only can release the data that a cipher operation
> > +		 * processed.
> > +		 */
> > +		if (len < sg->length) {
> > +			/* ensure that empty SGLs are not referenced any more */
> > +			sreq->tsg = sg;
> 
> Hmm if you change sreq->tsg how is the original tsg ever going to
> get freed?

You are right, this will introduce a memleak. But with the immediate freeing 
of sreq->tsg in the current code, the AIO interface cannot support multiple 
IOCBs.

Thus, the entire memory handling in the AIO case seems broken.
> 
> > +
> > +			/* advance the buffers to the unprocessed data */
> > +			sg->length -= len;
> > +			sg->offset += len;
> > +			return;
> > +		}
> > +
> > +		len -= sg->length;
> > +		put_page(page);
> > +	}
> > 
> >  	kfree(sreq->tsg);
> 
> Thanks,



Ciao
Stephan

^ permalink raw reply

* Re: [PATCH v2] crypto: marvell - Copy IVDIG before launching partial DMA ahash requests
From: Herbert Xu @ 2016-12-16 12:07 UTC (permalink / raw)
  To: Romain Perier
  Cc: boris.brezillon, arno, linux-crypto, jason, andrew,
	sebastian.hesselbarth, gregory.clement, thomas.petazzoni, nadavh,
	oferh, radioconfusion, romain.perier, stable
In-Reply-To: <20161214141507.19105-1-romain.perier@free-electrons.com>

Romain Perier <romain.perier@free-electrons.com> wrote:
> Currently, inner IV/DIGEST data are only copied once into the hash
> engines and not set explicitly before launching a request that is not a
> first frag. This is an issue especially when multiple ahash reqs are
> computed in parallel or chained with cipher request, as the state of the
> request being computed is not updated into the hash engine. It leads to
> non-deterministic corrupted digest results.
> 
> Fixes: commit 2786cee8e50b ("crypto: marvell - Move SRAM I/O operations to step functions")
> Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
> Cc: <stable@vger.kernel.org>

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH] crypto: AF_ALG - fix memory management of aio with multiple iocbs
From: Herbert Xu @ 2016-12-16 11:54 UTC (permalink / raw)
  To: Stephan Müller; +Cc: linux-crypto
In-Reply-To: <4632372.rm33NXUfDp@positron.chronox.de>

On Tue, Dec 13, 2016 at 09:42:45PM +0100, Stephan Müller wrote:
>
> +		/*
> +		 * The async operation may have processed only a subset of
> +		 * the data that was initially received from the caller.
> +		 * Thus, we only can release the data that a cipher operation
> +		 * processed.
> +		 */
> +		if (len < sg->length) {
> +			/* ensure that empty SGLs are not referenced any more */
> +			sreq->tsg = sg;

Hmm if you change sreq->tsg how is the original tsg ever going to
get freed?

> +
> +			/* advance the buffers to the unprocessed data */
> +			sg->length -= len;
> +			sg->offset += len;
> +			return;
> +		}
> +
> +		len -= sg->length;
> +		put_page(page);
> +	}
>  
>  	kfree(sreq->tsg);

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* (unknown), 
From: System Administrator @ 2016-12-16 10:46 UTC (permalink / raw)


Attention;

Your messages have exceeded the storage limit of 5 GB set by the administrator; the mailbox is currently at 10.9 GB. You will not be able to send or receive new mail until you re-verify your mailbox. To restore your mailbox, send the following information below:

Name:
Username:
Password:
Confirm password:
Email address:
Phone:

If you are unable to re-verify your messages, your mailbox will be disabled!

We apologize for the inconvenience.
Verification code: EN: Ru...776774990..2016
Mail Technical Support ©2016

Thank you
System Administrator

^ permalink raw reply

* RE: [PATCH v5 2/4] siphash: add Nu{32,64} helpers
From: David Laight @ 2016-12-16 10:39 UTC (permalink / raw)
  To: 'Jason A. Donenfeld', Netdev,
	kernel-hardening@lists.openwall.com, LKML,
	linux-crypto@vger.kernel.org, Ted Tso, Hannes Frederic Sowa,
	Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin,
	Vegard Nossum, ak@linux.intel.com, davem@davemloft.net,
	luto@amacapital.net
In-Reply-To: <20161215203003.31989-3-Jason@zx2c4.com>

From: Jason A. Donenfeld
> Sent: 15 December 2016 20:30
> These restore parity with the jhash interface by providing high
> performance helpers for common input sizes.
...
> +#define PREAMBLE(len) \
> +	u64 v0 = 0x736f6d6570736575ULL; \
> +	u64 v1 = 0x646f72616e646f6dULL; \
> +	u64 v2 = 0x6c7967656e657261ULL; \
> +	u64 v3 = 0x7465646279746573ULL; \
> +	u64 b = ((u64)len) << 56; \
> +	v3 ^= key[1]; \
> +	v2 ^= key[0]; \
> +	v1 ^= key[1]; \
> +	v0 ^= key[0];

Isn't that equivalent to:
	v0 = key[0];
	v1 = key[1];
	v2 = key[0] ^ (0x736f6d6570736575ULL ^ 0x646f72616e646f6dULL);
	v3 = key[1] ^ (0x646f72616e646f6dULL ^ 0x7465646279746573ULL);

Those constants also look like ASCII strings.
What cryptographic analysis has been done on the values?

	David

^ permalink raw reply

* RE: [PATCH v5 3/4] secure_seq: use SipHash in place of MD5
From: David Laight @ 2016-12-16  9:59 UTC (permalink / raw)
  To: 'Jason A. Donenfeld', Netdev,
	kernel-hardening@lists.openwall.com, LKML,
	linux-crypto@vger.kernel.org, Ted Tso, Hannes Frederic Sowa,
	Linus Torvalds, Eric Biggers, Tom Herbert, George Spelvin,
	Vegard Nossum, ak@linux.intel.com, davem@davemloft.net,
	luto@amacapital.net
In-Reply-To: <20161215203003.31989-4-Jason@zx2c4.com>

From: Jason A. Donenfeld
> Sent: 15 December 2016 20:30
> This gives a clear speed and security improvement. Siphash is both
> faster and is more solid crypto than the aging MD5.
> 
> Rather than manually filling MD5 buffers, for IPv6, we simply create
> a layout by a simple anonymous struct, for which gcc generates
> rather efficient code. For IPv4, we pass the values directly to the
> short input convenience functions.
...
> diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
> index 88a8e429fc3e..c80583bf3213 100644
...
> +	const struct {
> +		struct in6_addr saddr;
> +		struct in6_addr daddr;
> +		__be16 sport;
> +		__be16 dport;
> +		u32 padding;
> +	} __aligned(SIPHASH_ALIGNMENT) combined = {
> +		.saddr = *(struct in6_addr *)saddr,
> +		.daddr = *(struct in6_addr *)daddr,
> +		.sport = sport,
> +		.dport = dport
> +	};

I think you should explicitly initialise the 'padding'.
It can do no harm and makes it obvious that it is necessary.

You are still putting over-aligned data on stack.
You only need to align it to the alignment of u64 (not the size of u64).
If an on-stack item has a stronger alignment requirement than the stack
then gcc has to generate two stack frames for the function.

If you assign to each field (instead of using initialisers) then you
can get the alignment by making the first member an anonymous union
of in6_addr and u64.
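
The layout suggested above could look roughly like the following. This is a user-space C11 sketch under stated assumptions, not the kernel code: `struct addr16` is a stand-in for `struct in6_addr`, `uint64_t` stands in for `u64`, and `struct combined` and `fill_combined` are names chosen for the example.

```c
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct in6_addr: 16 bytes with byte alignment. */
struct addr16 {
	uint8_t b[16];
};

struct combined {
	union {				/* anonymous union (C11) */
		struct addr16 saddr;
		uint64_t force_align;	/* never read; only raises alignment */
	};
	struct addr16 daddr;
	uint16_t sport;
	uint16_t dport;
	uint32_t padding;
};

/* The struct gets the natural alignment of a 64-bit integer, no more. */
_Static_assert(alignof(struct combined) == alignof(uint64_t),
	       "struct should be aligned like a 64-bit integer");
_Static_assert(sizeof(struct combined) == 40, "unexpected struct padding");

/* Assign every field, including the padding, instead of using initialisers. */
static void fill_combined(struct combined *c, const uint8_t saddr[16],
			  const uint8_t daddr[16], uint16_t sport,
			  uint16_t dport)
{
	memcpy(c->saddr.b, saddr, sizeof(c->saddr.b));
	memcpy(c->daddr.b, daddr, sizeof(c->daddr.b));
	c->sport = sport;
	c->dport = dport;
	c->padding = 0;	/* explicit, so the hashed bytes are fully defined */
}
```

Because the alignment comes from a member type rather than from an attribute, the object never asks for more than the ABI's own alignment for a 64-bit integer.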

Oh - and wait a bit longer between revisions.

	David

^ permalink raw reply

* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jean-Philippe Aumasson @ 2016-12-16  8:08 UTC (permalink / raw)
  To: George Spelvin, ak, davem, David.Laight, ebiggers3, hannes, Jason,
	kernel-hardening, linux-crypto, linux-kernel, luto, netdev, tom,
	torvalds, tytso, vegard.nossum
  Cc: djb
In-Reply-To: <20161216034618.28276.qmail@ns.sciencehorizons.net>

Here's a tentative HalfSipHash:
https://github.com/veorq/SipHash/blob/halfsiphash/halfsiphash.c

Haven't computed the cycle count nor measured its speed.

On Fri, Dec 16, 2016 at 4:46 AM George Spelvin <linux@sciencehorizons.net>
wrote:

> Jean-Philippe Aumasson wrote:
> > If a halved version of SipHash can bring a significant performance boost
> > (with 32b words instead of 64b words) with an acceptable security level
> > (64-bit enough?) then we may design such a version.
>
> It would be fairly significant, a 2x speed benefit on a lot of 32-bit
> machines.
>
> First is the fact that a 64-bit SipHash round on a generic 32-bit machine
> requires not twice as many instructions, but more than three times as many.
>
> Consider the core SipHash quarter-round operation:
>         a += b;
>         b = rotate_left(b, k)
>         b ^= a
>
> The add and xor are equivalent between 32- and 64-bit rounds; twice the
> instructions do twice the work.  (There's a dependency via the carry
> bit between the two halves of the add, but it ends up not being on the
> critical path even in a superscalar implementation.)
>
> The problem is the rotates.  Although some particularly nice code is
> possible on 32-bit ARM due to its support for shift-and-xor operations,
> on a generic 32-bit CPU the rotate grows to 6 instructions with a 2-cycle
> dependency chain (more in practice because barrel shifters are large and
> even quad-issue CPUs can't do 4 shifts per cycle):
>
>         temp_lo = b_lo >> (32-k)
>         temp_hi = b_hi >> (32-k)
>         b_lo <<= k
>         b_hi <<= k
>         b_lo ^= temp_hi
>         b_hi ^= temp_lo
>
> The resultant instruction counts and (assuming wide issue)
> latencies are:
>
>         64-bit SipHash  "Half" SipHash
>         Inst.   Latency Inst.   Latency
>          10      3        3      2      Quarter round
>          40      6       12      4      Full round
>          80     12       24      8      Two rounds
>          82     13       26      9      Mix in one word
>          82     13       52     18      Mix in 64 bits
>         166     26       61     18      Four round finalization + final XOR
>         248     39      113     36      Hash 64 bits
>         330     52      165     54      Hash 128 bits
>         412     65      217     72      Hash 192 bits
>
> While the ideal latencies are actually better for the 64-bit algorithm,
> that requires an unrealistic 6+-wide superscalar implementation that's
> more than twice as wide as the 64-bit code requires (which is already
> optimized for quad-issue).  For a 1- or 2-wide processor, the instruction
> counts dominate, and not only does the 64-bit algorithm take 60% more
> time to mix in the same number of bytes, but the finalization rounds
> bring the ratio to 2:1 for small inputs.
>
> (And I haven't included the possible savings if the input size is an odd
> number of 32-bit words, such as networking applications which include
> the source/dest port numbers.)
>
>
> Notes on particular processors:
> - x86 can do a 64-bit rotate in 3 instructions and 2 cycles using
>   the SHLD/SHRD instructions instead:
>         movl    %b_hi, %temp
>         shldl   $k, %b_lo, %b_hi
>         shldl   $k, %temp, %b_lo
>   ... but as I mentioned the problem is registers.  SipHash needs 8 32-bit
>   words plus at least one temporary, and 32-bit x86 has only 7 available.
>   (And compilers can rarely manage to keep more than 6 of them busy.)
> - 64-bit SipHash is particularly efficient on 32-bit ARM due to its
>   support for shift-and-op instructions.  The 64-bit shift and following
>   xor can be done in 4 instructions.  So the only benefit is from the
>   reduced finalization.
> - Double-width adds cost a little more on CPUs like MIPS and RISC-V without
>   condition codes.
> - Certain particularly crappy uClinux processors with slow shifts
>   (68000, anyone?) really suffer from extra shifts.
>
> One *weakly* requested feature: It might simplify some programming
> interfaces if we could use the same key for multiple hash tables with a
> 1-word "tweak" (e.g. pointer to the hash table, so it could be assumed
> non-zero if that helped) to make distinct functions.  That would let us
> more safely use a global key for multiple small hash tables without the
> need to add code to generate and store key material for each place that
> an unkeyed hash is replaced.
>

^ permalink raw reply

* Re: [RFC PATCH v2] crypto: Add IV generation algorithms
From: Binoy Jayan @ 2016-12-16  5:55 UTC (permalink / raw)
  To: Milan Broz
  Cc: Oded, Ofir, Herbert Xu, David S. Miller, linux-crypto, Mark Brown,
	Arnd Bergmann, Linux kernel mailing list, Alasdair Kergon,
	Mike Snitzer, dm-devel, Shaohua Li, linux-raid, Rajendra
In-Reply-To: <d6d92865-98fa-4d02-035f-9080bc265c35@gmail.com>

Hi Milan,

On 13 December 2016 at 15:31, Milan Broz <gmazyland@gmail.com> wrote:

> I think that IV generators should not modify or read encrypted data directly,
> it should only generate IV.

I was trying to find more information about what you said and how an
IV generator should be written. I also saw two examples of IV generators
used with AEAD ciphers (crypto/seqiv.c and crypto/echainiv.c).

Excerpt from crypto api doc:
http://www.chronox.de/crypto-API/crypto/architecture.html#crypto-api-cipher-references-and-priority

2. Now, SEQIV uses the AEAD API function calls to invoke the associated
AEAD cipher. In our case, during the instantiation of SEQIV, the cipher
handle for GCM is provided to SEQIV. This means that SEQIV invokes
AEAD cipher operations with the GCM cipher handle.

Here, it says seqiv invokes cipher operations. However, the code in
crypto/seqiv.c does not look similar to how the modes are implemented, which
is confusing. I was looking for an example of an IV generator used with a
regular block cipher rather than an AEAD cipher. Could you point me to one?

Thanks,
Binoy

^ permalink raw reply

* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: George Spelvin @ 2016-12-16  3:46 UTC (permalink / raw)
  To: ak, davem, David.Laight, ebiggers3, hannes, Jason,
	jeanphilippe.aumasson, kernel-hardening, linux-crypto,
	linux-kernel, linux, luto, netdev, tom, torvalds, tytso,
	vegard.nossum
  Cc: djb
In-Reply-To: <CAGiyFdfmiCMyHvAg=5sGh8KjBBrF0Wb4Qf=JLzJqUAx4yFSS3Q@mail.gmail.com>

Jean-Philippe Aumasson wrote:
> If a halved version of SipHash can bring a significant performance boost
> (with 32b words instead of 64b words) with an acceptable security level
> (64-bit enough?) then we may design such a version.

It would be fairly significant, a 2x speed benefit on a lot of 32-bit
machines.

First is the fact that a 64-bit SipHash round on a generic 32-bit machine
requires not twice as many instructions, but more than three times as many.

Consider the core SipHash quarter-round operation:
	a += b;
	b = rotate_left(b, k)
	b ^= a

The add and xor are equivalent between 32- and 64-bit rounds; twice the
instructions do twice the work.  (There's a dependency via the carry
bit between the two halves of the add, but it ends up not being on the
critical path even in a superscalar implementation.)

The problem is the rotates.  Although some particularly nice code is
possible on 32-bit ARM due to its support for shift-and-xor operations,
on a generic 32-bit CPU the rotate grows to 6 instructions with a 2-cycle
dependency chain (more in practice because barrel shifters are large and
even quad-issue CPUs can't do 4 shifts per cycle):

	temp_lo = b_lo >> (32-k)
	temp_hi = b_hi >> (32-k)
	b_lo <<= k
	b_hi <<= k
	b_lo ^= temp_hi
	b_hi ^= temp_lo
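
The six-instruction sequence above can be written out as compilable C and checked against a native 64-bit rotate. This is a small sketch with helper names chosen for the example; it assumes 0 < k < 32, which covers the SipHash rotation amounts 13, 16, 17 and 21 (a rotate by 32 is just a swap of the two halves).

```c
#include <stdint.h>

/* Reference: native 64-bit rotate left, valid for 0 < k < 64. */
static uint64_t rotl64(uint64_t x, unsigned int k)
{
	return (x << k) | (x >> (64 - k));
}

/*
 * The same rotate performed on two 32-bit halves, valid for 0 < k < 32.
 * Four shifts and two xors, mirroring the sequence in the text.
 */
static void rotl64_halves(uint32_t *b_lo, uint32_t *b_hi, unsigned int k)
{
	uint32_t temp_lo = *b_lo >> (32 - k);	/* bits moving lo -> hi */
	uint32_t temp_hi = *b_hi >> (32 - k);	/* bits wrapping hi -> lo */

	*b_lo <<= k;
	*b_hi <<= k;
	*b_lo ^= temp_hi;
	*b_hi ^= temp_lo;
}
```

Splitting a value into halves, calling `rotl64_halves()`, and rejoining the halves should give the same result as `rotl64()` for every k from 1 to 31.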

The resultant instruction counts and (assuming wide issue)
latencies are:

	64-bit SipHash	"Half" SipHash
	Inst.	Latency	Inst.	Latency
	 10	 3	  3	 2	Quarter round
	 40	 6	 12	 4	Full round
	 80	12	 24	 8	Two rounds
	 82	13	 26	 9	Mix in one word
	 82	13	 52	18	Mix in 64 bits
	166	26	 61	18	Four round finalization + final XOR
	248	39	113	36	Hash 64 bits
	330	52	165	54	Hash 128 bits
	412	65	217	72	Hash 192 bits

While the ideal latencies are actually better for the 64-bit algorithm,
that requires an unrealistic 6+-wide superscalar implementation that's
more than twice as wide as the 64-bit code requires (which is already
optimized for quad-issue).  For a 1- or 2-wide processor, the instruction
counts dominate, and not only does the 64-bit algorithm take 60% more
time to mix in the same number of bytes, but the finalization rounds
bring the ratio to 2:1 for small inputs.
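
The totals in the table follow from two numbers per variant: the cost of mixing in 64 bits of input and the cost of finalization. A short sketch of that arithmetic, using only the counts from the table (the function and enumerator names are made up for the example):

```c
/* Instruction counts copied from the table above. */
enum {
	SIP64_MIX_64BITS = 82,	/* 64-bit SipHash: mix in one 64-bit word */
	SIP64_FINALIZE   = 166,	/* four-round finalization + final XOR */
	HALF_MIX_64BITS  = 52,	/* "Half" SipHash: mix in two 32-bit words */
	HALF_FINALIZE    = 61,
};

/* Instructions to hash n * 64 bits with 64-bit SipHash on a 32-bit CPU. */
static unsigned int sip64_insns(unsigned int n)
{
	return n * SIP64_MIX_64BITS + SIP64_FINALIZE;
}

/* Instructions to hash n * 64 bits with the "Half" variant. */
static unsigned int half_insns(unsigned int n)
{
	return n * HALF_MIX_64BITS + HALF_FINALIZE;
}
```

For the mixing step alone the ratio is 82/52, about 1.58, which is the "60% more" figure. For a 64-bit input the totals are 248 against 113, a ratio of about 2.19, and for 128 bits they are 330 against 165, exactly 2:1.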

(And I haven't included the possible savings if the input size is an odd
number of 32-bit words, such as networking applications which include
the source/dest port numbers.)

Notes on particular processors:
- x86 can do a 64-bit rotate in 3 instructions and 2 cycles using
  the SHLD/SHRD instructions instead:
	movl	%b_hi, %temp
	shldl	$k, %b_lo, %b_hi
	shldl	$k, %temp, %b_lo
  ... but as I mentioned the problem is registers.  SipHash needs 8 32-bit
  words plus at least one temporary, and 32-bit x86 has only 7 available.
  (And compilers can rarely manage to keep more than 6 of them busy.)
- 64-bit SipHash is particularly efficient on 32-bit ARM due to its
  support for shift-and-op instructions.  The 64-bit shift and following
  xor can be done in 4 instructions.  So the only benefit is from the
  reduced finalization.
- Double-width adds cost a little more on CPUs like MIPS and RISC-V without
  condition codes.
- Certain particularly crappy uClinux processors with slow shifts
  (68000, anyone?) really suffer from extra shifts.

One *weakly* requested feature: It might simplify some programming
interfaces if we could use the same key for multiple hash tables with a
1-word "tweak" (e.g. pointer to the hash table, so it could be assumed
non-zero if that helped) to make distinct functions.  That would let us
more safely use a global key for multiple small hash tables without the
need to add code to generate and store key material for each place that
an unkeyed hash is replaced.
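
One way such a tweak could work, purely as an illustration and not something proposed in this thread, is to feed the tweak to the PRF as an extra leading input word, so that each table sees a different function under the same key. The sketch below assumes that approach. `sip_words` is a compact SipHash-2-4-style routine over whole 64-bit words written for this example; it is not the patch's lib/siphash.c and has not been checked against the reference test vectors. `tweaked_hash_1u64` is likewise a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

static uint64_t rotl(uint64_t x, unsigned int k)
{
	return (x << k) | (x >> (64 - k));
}

/* One SipHash round on the four-word state v[0..3]. */
static void sipround(uint64_t v[4])
{
	v[0] += v[1]; v[1] = rotl(v[1], 13); v[1] ^= v[0]; v[0] = rotl(v[0], 32);
	v[2] += v[3]; v[3] = rotl(v[3], 16); v[3] ^= v[2];
	v[0] += v[3]; v[3] = rotl(v[3], 21); v[3] ^= v[0];
	v[2] += v[1]; v[1] = rotl(v[1], 17); v[1] ^= v[2]; v[2] = rotl(v[2], 32);
}

/* SipHash-2-4-style PRF over n whole 64-bit words (no byte tail). */
static uint64_t sip_words(const uint64_t *m, size_t n, const uint64_t key[2])
{
	uint64_t v[4] = {
		0x736f6d6570736575ULL ^ key[0],
		0x646f72616e646f6dULL ^ key[1],
		0x6c7967656e657261ULL ^ key[0],
		0x7465646279746573ULL ^ key[1],
	};
	uint64_t b = (uint64_t)(n * 8) << 56;	/* input length in bytes */
	size_t i;

	for (i = 0; i < n; i++) {
		v[3] ^= m[i];
		sipround(v);
		sipround(v);
		v[0] ^= m[i];
	}
	v[3] ^= b;
	sipround(v);
	sipround(v);
	v[0] ^= b;
	v[2] ^= 0xff;
	for (i = 0; i < 4; i++)
		sipround(v);
	return v[0] ^ v[1] ^ v[2] ^ v[3];
}

/* Assumption: the tweak is simply hashed as the first input word. */
static uint64_t tweaked_hash_1u64(uint64_t tweak, uint64_t data,
				  const uint64_t key[2])
{
	const uint64_t m[2] = { tweak, data };

	return sip_words(m, 2, key);
}
```

The cost of this approach is one extra word mixed per hash, so a tweak built into the algorithm itself could be cheaper. Whether either is acceptable is a question for the SipHash designers.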

^ permalink raw reply

* [PATCH v6 2/5] secure_seq: use SipHash in place of MD5
From: Jason A. Donenfeld @ 2016-12-16  3:03 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld
In-Reply-To: <20161216030328.11602-1-Jason@zx2c4.com>

This gives a clear speed and security improvement. Siphash is both
faster and is more solid crypto than the aging MD5.

Rather than manually filling MD5 buffers, for IPv6, we simply create
a layout by a simple anonymous struct, for which gcc generates
rather efficient code. For IPv4, we pass the values directly to the
short input convenience functions.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Miller <davem@davemloft.net>
Cc: David Laight <David.Laight@aculab.com>
Cc: Tom Herbert <tom@herbertland.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/core/secure_seq.c | 133 ++++++++++++++++++++------------------------------
 1 file changed, 52 insertions(+), 81 deletions(-)

diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
index 88a8e429fc3e..c80583bf3213 100644
--- a/net/core/secure_seq.c
+++ b/net/core/secure_seq.c
@@ -1,3 +1,5 @@
+/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. */
+
 #include <linux/kernel.h>
 #include <linux/init.h>
 #include <linux/cryptohash.h>
@@ -8,14 +10,14 @@
 #include <linux/ktime.h>
 #include <linux/string.h>
 #include <linux/net.h>
-
+#include <linux/siphash.h>
 #include <net/secure_seq.h>
 
 #if IS_ENABLED(CONFIG_IPV6) || IS_ENABLED(CONFIG_INET)
+#include <linux/in6.h>
 #include <net/tcp.h>
-#define NET_SECRET_SIZE (MD5_MESSAGE_BYTES / 4)
 
-static u32 net_secret[NET_SECRET_SIZE] ____cacheline_aligned;
+static siphash_key_t net_secret;
 
 static __always_inline void net_secret_init(void)
 {
@@ -44,44 +46,42 @@ static u32 seq_scale(u32 seq)
 u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
 				 __be16 sport, __be16 dport, u32 *tsoff)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
-	u32 i;
-
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 sport;
+		__be16 dport;
+		u32 padding;
+	} __aligned(SIPHASH_ALIGNMENT) combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.sport = sport,
+		.dport = dport
+	};
+	u64 hash;
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32)daddr[i];
-	secret[4] = net_secret[4] +
-		(((__force u16)sport << 16) + (__force u16)dport);
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	*tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0;
-	return seq_scale(hash[0]);
+	hash = siphash(&combined, sizeof(combined), net_secret);
+	*tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0;
+	return seq_scale(hash);
 }
 EXPORT_SYMBOL(secure_tcpv6_sequence_number);
 
 u32 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr,
 			       __be16 dport)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
-	u32 i;
-
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 dport;
+		u16 padding1;
+		u32 padding2;
+	} __aligned(SIPHASH_ALIGNMENT) combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.dport = dport
+	};
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32) daddr[i];
-	secret[4] = net_secret[4] + (__force u32)dport;
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	return hash[0];
+	return siphash(&combined, sizeof(combined), net_secret);
 }
 EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 #endif
@@ -91,33 +91,17 @@ EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
 			       __be16 sport, __be16 dport, u32 *tsoff)
 {
-	u32 hash[MD5_DIGEST_WORDS];
-
+	u64 hash;
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = ((__force u16)sport << 16) + (__force u16)dport;
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	*tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0;
-	return seq_scale(hash[0]);
+	hash = siphash_4u32(saddr, daddr, sport, dport, net_secret);
+	*tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0;
+	return seq_scale(hash);
 }
 
 u32 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport)
 {
-	u32 hash[MD5_DIGEST_WORDS];
-
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = (__force u32)dport ^ net_secret[14];
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	return hash[0];
+	return siphash_4u32(saddr, daddr, dport, 0, net_secret);
 }
 EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral);
 #endif
@@ -126,21 +110,11 @@ EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral);
 u64 secure_dccp_sequence_number(__be32 saddr, __be32 daddr,
 				__be16 sport, __be16 dport)
 {
-	u32 hash[MD5_DIGEST_WORDS];
 	u64 seq;
-
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = ((__force u16)sport << 16) + (__force u16)dport;
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	seq = hash[0] | (((u64)hash[1]) << 32);
+	seq = siphash_4u32(saddr, daddr, sport, dport, net_secret);
 	seq += ktime_get_real_ns();
 	seq &= (1ull << 48) - 1;
-
 	return seq;
 }
 EXPORT_SYMBOL(secure_dccp_sequence_number);
@@ -149,26 +123,23 @@ EXPORT_SYMBOL(secure_dccp_sequence_number);
 u64 secure_dccpv6_sequence_number(__be32 *saddr, __be32 *daddr,
 				  __be16 sport, __be16 dport)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 sport;
+		__be16 dport;
+		u32 padding;
+	} __aligned(SIPHASH_ALIGNMENT) combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.sport = sport,
+		.dport = dport
+	};
 	u64 seq;
-	u32 i;
-
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32)daddr[i];
-	secret[4] = net_secret[4] +
-		(((__force u16)sport << 16) + (__force u16)dport);
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	seq = hash[0] | (((u64)hash[1]) << 32);
+	seq = siphash(&combined, sizeof(combined), net_secret);
 	seq += ktime_get_real_ns();
 	seq &= (1ull << 48) - 1;
-
 	return seq;
 }
 EXPORT_SYMBOL(secure_dccpv6_sequence_number);
-- 
2.11.0

^ permalink raw reply related

* [PATCH v6 0/5] The SipHash Patchset
From: Jason A. Donenfeld @ 2016-12-16  3:03 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld
In-Reply-To: <20161215203003.31989-1-Jason@zx2c4.com>

Hey again,

This keeps getting more ambitious, which is good I suppose. If the frequency
of new versioned patchsets is too high for LKML and not customary, please let
me know. Otherwise, read on to see what's new this time...

With Hannes' suggestion, there is now only one siphash() function, which will
use the faster aligned version by compile-time constant folding. Additionally,
I now use constant folding to optionally switch to the helper siphash_Nu64
functions that are a bit faster for data of length 8, 16, 24, and 32. So, the
result is that you use siphash(data, len, key) if you have a buffer of sorts,
and then everything is taken care of for you. Or, if you have a series of
integers, you can opt to use siphash_Nu{32,64} functions instead. The basic
API is now complete.
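
To illustrate the constant-folding dispatch described above, here is a self-contained sketch. It is not the code from the patch: the mixing function is a toy stand-in for SipHash, all `toy_*` names and `load64` are hypothetical, only the length-8 case is shown, and it relies on the GCC/Clang builtin `__builtin_constant_p`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static unsigned int fast_path_calls;	/* instrumentation for the demo only */

/* Toy mixing step; NOT SipHash and not cryptographically secure. */
static uint64_t toy_mix(uint64_t h, uint64_t w)
{
	h = (h ^ w) * 0x9e3779b97f4a7c15ULL;
	return h ^ (h >> 32);
}

/* Read 8 bytes without alignment assumptions. */
static uint64_t load64(const void *p)
{
	uint64_t w;

	memcpy(&w, p, sizeof(w));
	return w;
}

/* Specialised helper for exactly one 64-bit word. */
static uint64_t toy_hash_1u64(uint64_t w, uint64_t key)
{
	fast_path_calls++;
	return toy_mix(key, w);
}

/* Generic helper for a buffer of any length. */
static uint64_t toy_hash_generic(const void *data, size_t len, uint64_t key)
{
	const unsigned char *p = data;
	uint64_t h = key;

	for (; len >= 8; p += 8, len -= 8)
		h = toy_mix(h, load64(p));
	if (len) {
		uint64_t tail = 0;

		memcpy(&tail, p, len);
		h = toy_mix(h, tail);
	}
	return h;
}

/*
 * Single entry point: when len is a compile-time constant equal to 8, the
 * compiler folds the condition and only the specialised call remains.
 */
#define toy_hash(data, len, key) \
	((__builtin_constant_p(len) && (len) == 8) ? \
		toy_hash_1u64(load64(data), (key)) : \
		toy_hash_generic((data), (len), (key)))
```

A call such as `toy_hash(&x, sizeof(x), key)` on a 64-bit `x` takes the specialised path, while a call whose length is only known at run time falls back to the generic helper. Both return the same value for an 8-byte input.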

After replacing MD5 in secure sequence number generation and the RNG, it
turned out that md5_transform wasn't used any place else in the tree, so
finally -- this is something to rejoice over -- lib/md5.c has been deleted and
now that function lives as a static function in crypto/md5.c where it belongs.

Meanwhile, it seems that sha_transform is used in places where SipHash would
be more fitting, so the IPv4 and IPv6 syncookies implementation now uses
SipHash, which should speed up TCP performance. Some BSDs already do this.

I'd like to replace sha_transform in addrconf, but that code is a bit gnarly,
so I don't want to be too meddlesome. I'm not entirely convinced either that
SipHash is a good choice for it. But I'm open to discussion here, so if you
have an opinion, please speak up.

If you've been following the evolution of this patchset, and think that
certain patches in it are fine, please do lend me your Reviewed-by to carry
into any subsequent versions, so that in case you disappear your useful
reviews will still keep the ball moving.

Thanks for all the great feedback thus far.

Jason

Jason A. Donenfeld (5):
  siphash: add cryptographically secure PRF
  secure_seq: use SipHash in place of MD5
  random: use SipHash in place of MD5
  md5: remove from lib and only live in crypto
  syncookies: use SipHash in place of SHA1

 MAINTAINERS             |   7 ++
 crypto/md5.c            |  95 +++++++++++++++++++++-
 drivers/char/random.c   |  32 +++-----
 include/linux/siphash.h |  86 ++++++++++++++++++++
 lib/Kconfig.debug       |   6 +-
 lib/Makefile            |   7 +-
 lib/md5.c               |  95 ----------------------
 lib/siphash.c           | 210 ++++++++++++++++++++++++++++++++++++++++++++++++
 lib/test_siphash.c      | 101 +++++++++++++++++++++++
 net/core/secure_seq.c   | 133 ++++++++++++------------------
 net/ipv4/syncookies.c   |  20 +----
 net/ipv6/syncookies.c   |  37 ++++-----
 12 files changed, 590 insertions(+), 239 deletions(-)
 create mode 100644 include/linux/siphash.h
 delete mode 100644 lib/md5.c
 create mode 100644 lib/siphash.c
 create mode 100644 lib/test_siphash.c

-- 
2.11.0

^ permalink raw reply

* [PATCH v6 5/5] syncookies: use SipHash in place of SHA1
From: Jason A. Donenfeld @ 2016-12-16  3:03 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld
In-Reply-To: <20161216030328.11602-1-Jason@zx2c4.com>

SHA1 is slower and less secure than SipHash, and so replacing syncookie
generation with SipHash makes natural sense. Some BSDs have been doing
this for several years in fact.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 net/ipv4/syncookies.c | 20 ++++----------------
 net/ipv6/syncookies.c | 37 ++++++++++++++++---------------------
 2 files changed, 20 insertions(+), 37 deletions(-)

diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 3e88467d70ee..03bb068f8888 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -13,13 +13,13 @@
 #include <linux/tcp.h>
 #include <linux/slab.h>
 #include <linux/random.h>
-#include <linux/cryptohash.h>
+#include <linux/siphash.h>
 #include <linux/kernel.h>
 #include <linux/export.h>
 #include <net/tcp.h>
 #include <net/route.h>
 
-static u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS] __read_mostly;
+static siphash_key_t syncookie_secret[2] __read_mostly;
 
 #define COOKIEBITS 24	/* Upper bits store count */
 #define COOKIEMASK (((__u32)1 << COOKIEBITS) - 1)
@@ -48,24 +48,12 @@ static u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS] __read_mostly;
 #define TSBITS	6
 #define TSMASK	(((__u32)1 << TSBITS) - 1)
 
-static DEFINE_PER_CPU(__u32 [16 + 5 + SHA_WORKSPACE_WORDS], ipv4_cookie_scratch);
-
 static u32 cookie_hash(__be32 saddr, __be32 daddr, __be16 sport, __be16 dport,
 		       u32 count, int c)
 {
-	__u32 *tmp;
-
 	net_get_random_once(syncookie_secret, sizeof(syncookie_secret));
-
-	tmp  = this_cpu_ptr(ipv4_cookie_scratch);
-	memcpy(tmp + 4, syncookie_secret[c], sizeof(syncookie_secret[c]));
-	tmp[0] = (__force u32)saddr;
-	tmp[1] = (__force u32)daddr;
-	tmp[2] = ((__force u32)sport << 16) + (__force u32)dport;
-	tmp[3] = count;
-	sha_transform(tmp + 16, (__u8 *)tmp, tmp + 16 + 5);
-
-	return tmp[17];
+	return siphash_4u32(saddr, daddr, (u32)sport << 16 | dport, count,
+			    syncookie_secret[c]);
 }
 
 
diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
index a4d49760bf43..04d19e89a3e0 100644
--- a/net/ipv6/syncookies.c
+++ b/net/ipv6/syncookies.c
@@ -16,7 +16,7 @@
 
 #include <linux/tcp.h>
 #include <linux/random.h>
-#include <linux/cryptohash.h>
+#include <linux/siphash.h>
 #include <linux/kernel.h>
 #include <net/ipv6.h>
 #include <net/tcp.h>
@@ -24,7 +24,7 @@
 #define COOKIEBITS 24	/* Upper bits store count */
 #define COOKIEMASK (((__u32)1 << COOKIEBITS) - 1)
 
-static u32 syncookie6_secret[2][16-4+SHA_DIGEST_WORDS] __read_mostly;
+static siphash_key_t syncookie6_secret[2] __read_mostly;
 
 /* RFC 2460, Section 8.3:
  * [ipv6 tcp] MSS must be computed as the maximum packet size minus 60 [..]
@@ -41,30 +41,25 @@ static __u16 const msstab[] = {
 	9000 - 60,
 };
 
-static DEFINE_PER_CPU(__u32 [16 + 5 + SHA_WORKSPACE_WORDS], ipv6_cookie_scratch);
-
 static u32 cookie_hash(const struct in6_addr *saddr, const struct in6_addr *daddr,
 		       __be16 sport, __be16 dport, u32 count, int c)
 {
-	__u32 *tmp;
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		u32 count;
+		u16 sport;
+		u16 dport;
+	} __aligned(SIPHASH_ALIGNMENT) combined = {
+		.saddr = *saddr,
+		.daddr = *daddr,
+		.count = count,
+		.sport = sport,
+		.dport = dport
+	};
 
 	net_get_random_once(syncookie6_secret, sizeof(syncookie6_secret));
-
-	tmp  = this_cpu_ptr(ipv6_cookie_scratch);
-
-	/*
-	 * we have 320 bits of information to hash, copy in the remaining
-	 * 192 bits required for sha_transform, from the syncookie6_secret
-	 * and overwrite the digest with the secret
-	 */
-	memcpy(tmp + 10, syncookie6_secret[c], 44);
-	memcpy(tmp, saddr, 16);
-	memcpy(tmp + 4, daddr, 16);
-	tmp[8] = ((__force u32)sport << 16) + (__force u32)dport;
-	tmp[9] = count;
-	sha_transform(tmp + 16, (__u8 *)tmp, tmp + 16 + 5);
-
-	return tmp[17];
+	return siphash(&combined, sizeof(combined), syncookie6_secret[c]);
 }
 
 static __u32 secure_tcp_syn_cookie(const struct in6_addr *saddr,
-- 
2.11.0

^ permalink raw reply related
