public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Matt Mackall <mpm@selenic.com>
To: "Theodore Ts'o" <tytso@mit.edu>,
	Bernard Normier <bernard@zeroc.com>,
	linux-kernel@vger.kernel.org
Subject: Re: Concurrent access to /dev/urandom
Date: Wed, 8 Dec 2004 18:58:31 -0800	[thread overview]
Message-ID: <20041209025831.GK8876@waste.org> (raw)
In-Reply-To: <20041209015705.GB6978@thunk.org>

On Wed, Dec 08, 2004 at 08:57:05PM -0500, Theodore Ts'o wrote:
> On Wed, Dec 08, 2004 at 01:56:14PM -0800, Matt Mackall wrote:
> > 
> > Ted, I think this is a bit more straightforward than your patch, and
> > safer as it protects get_random_bytes() and internal extract_entropy()
> > users. And I'd be leery of your get_cpu() trick due to preempt
> > issues.
> 
> I'm concerned that turning off interrupts during even a single SHA-1
> transform will put us above the radar with respect to the preempt
> latency statistics again.  We could use a separate spinlock that only
> pretects the mix_ptr and mixing access to the pool, so we're at least
> not disabling interrupts, but we still are holding a spinlock across a
> cryptographic operation.

A big mixlock was my first thought, but it'd still have to be _irqsave
as we can reach extract_entropy from irq handlers.
 
> So I've come up with another trick which I think avoids needing to add
> additional locking altogether.  What we do is we diddle the initial
> HASH input values with the following values: initial the processor ID,
> the current task pointer, and preempt_count().  On an UP system with
> preemption, it won't matter if we get preempted, since on a UP system
> access to the pool is by definition serialized :-).  On a SMP system
> with preemption, while we could theoretically get preempted away and
> then scheduled on another CPU, just in time for another process to
> call extract_entropy(), the task identifier is enough to guarantee a
> unique starting point.  The reason for adding preempt_count() is so we
> can deal with the case where a process gets interrupted, and the
> bottom half handler calls get_random_bytes(), and at the same time
> said process gets preempted away to another CPU().  I think this
> covers all of the cases.....
> 
> Yeah, it would be simper to reason about things if we were to just put
> it under the spinlock, but everyone seems tp be on a reduce latency at
> all costs kick as of late.  :-)

I'd like to combine this with my approach of fiddling with the mixing
offset in a similar manner. In the duplicate case, we were basically
returning SHA(x[y]) twice and now we're returning SHA(x[y]^knowns).
This makes me a bit uneasy. I'd rather do SHA(x[knowns%sizeof(x)]^knowns2),
then we've at least got some different _unknowns_ in the hash from the
attacker's perspective, yes?

Something like:

Signed-off-by: Matt Mackall <mpm@selenic.com>

Index: random/drivers/char/random.c
===================================================================
--- random.orig/drivers/char/random.c	2004-12-08 18:17:21.000000000 -0800
+++ random/drivers/char/random.c	2004-12-08 18:47:17.914493794 -0800
@@ -1343,7 +1343,7 @@
 {
 	ssize_t ret, i;
 	__u32 tmp[TMP_BUF_SIZE];
-	__u32 x;
+	__u32 x, offset, wrap;
 	unsigned long cpuflags;
 
 
@@ -1402,14 +1402,35 @@
 				  sec_random_state->entropy_count);
 		}
 
-		/* Hash the pool to get the output */
-		tmp[0] = 0x67452301;
+		/*
+		 * Hash the pool to get the output.
+		 *
+		 * We diddle the initial inputs so that if two
+		 * processors are executing extract_entropy
+		 * concurrently, they will get different results. Even
+		 * if we get preempted and moved to another CPU, the
+		 * combination of initial CPU, task pointer, and
+		 * preempt count is good enough to avoid duplication.
+		 * We could instead use more locking here, but the
+		 * resulting latency is painful.
+		 */
+		tmp[0] = 0x67452301 ^ smp_processor_id();
 		tmp[1] = 0xefcdab89;
-		tmp[2] = 0x98badcfe;
+		tmp[2] = 0x98badcfe ^ preempt_count();
 		tmp[3] = 0x10325476;
 #ifdef USE_SHA
 		tmp[4] = 0xc3d2e1f0;
 #endif
+
+		/*
+		 * Generate an offset for mixing (multiple of 16) so
+		 * that we have different unknowns in the mix in the
+		 * concurrent case as well.
+		 */
+
+		wrap = r->poolinfo.poolwords;
+		offset = ((__u32)current * 8675309 % wrap) & ~15;
+
 		/*
 		 * As we hash the pool, we mix intermediate values of
 		 * the hash back into the pool.  This eliminates
@@ -1419,10 +1440,10 @@
 		 * function can be inverted.
 		 */
 		for (i = 0, x = 0; i < r->poolinfo.poolwords; i += 16, x+=2) {
-			HASH_TRANSFORM(tmp, r->pool+i);
+			HASH_TRANSFORM(tmp, r->pool + (i + offset) % wrap);
 			add_entropy_words(r, &tmp[x%HASH_BUFFER_SIZE], 1);
 		}
-		
+
 		/*
 		 * In case the hash function has some recognizable
 		 * output pattern, we fold it in half.


-- 
Mathematics is the supreme nostalgia of our time.

  parent reply	other threads:[~2004-12-09  2:59 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-11-27 20:45 Concurrent access to /dev/urandom Bernard Normier
2004-11-27 20:56 ` Jan Engelhardt
2004-11-27 21:15   ` Bernard Normier
2004-11-27 21:22     ` Jan Engelhardt
2004-11-28 20:58       ` Bernard Normier
2004-12-07 23:41         ` Bernard Normier
2004-12-08  1:28           ` Theodore Ts'o
2004-12-08  1:56             ` Bernard Normier
2004-12-08 19:21               ` Theodore Ts'o
2004-12-08 20:15                 ` Bernard Normier
2004-12-08 21:56                 ` Matt Mackall
2004-12-09  1:57                   ` Theodore Ts'o
2004-12-09  2:46                     ` andyliu
2004-12-09  4:55                       ` Matt Mackall
2004-12-09  2:58                     ` Matt Mackall [this message]
2004-12-09 21:29                     ` Matt Mackall
2004-12-10  4:47                       ` Matt Mackall
2004-12-10 16:35                         ` Theodore Ts'o
2004-12-10 18:28                           ` Matt Mackall
2004-12-10 21:28                             ` Theodore Ts'o
2004-12-10 22:23                               ` Matt Mackall
2004-12-11  0:22                                 ` Adam Heath
2004-12-11  1:10                                   ` Matt Mackall
2004-12-11 17:33                                   ` Theodore Ts'o
2004-12-11 19:58                                     ` Adam Heath
2004-12-11 20:40                                       ` Matt Mackall
2004-12-12 16:19                                     ` Pavel Machek
2004-12-11  0:19                               ` Adam Heath
2004-12-09  3:10               ` David Lang
2004-12-09  4:52                 ` Matt Mackall
2004-12-09  6:36                 ` Theodore Ts'o
2004-11-29 22:47 ` Jon Masters
2004-11-29 23:14   ` Bernard Normier
2004-11-29 23:43     ` Sven-Haegar Koch
2004-11-30  2:31       ` David Schwartz
2004-11-30  4:14         ` Kyle Moffett
2004-11-30  8:23           ` Jan Engelhardt
2004-11-30 18:50             ` David Schwartz
2004-11-29 23:42   ` David Wagner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20041209025831.GK8876@waste.org \
    --to=mpm@selenic.com \
    --cc=bernard@zeroc.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox