public inbox for linux-kernel@vger.kernel.org
* [PATCH] get_random_bytes returns the same on every boot
@ 2004-07-22 22:52 Balint Marton
  2004-07-22 23:28 ` Patrick McHardy
  2004-08-02 22:42 ` David Wagner
From: Balint Marton @ 2004-07-22 22:52 UTC
  To: linux-kernel

Hi, 

At boot time, get_random_bytes always returns the same random data, as if
there were a constant random seed. For example, if I use the kernel-level
IP autoconfiguration with DHCP, the kernel always creates a DHCP request
packet with the same transaction ID. (If you have more than one
computer, and they boot at the same time, this is a big
problem.)

This happens because only the primary entropy pool is initialized with
the system time, in the function rand_initialize. The secondary pool is
only cleared. At this early stage of booting, there is usually no user
interaction and there are no usable disk interrupts, so the kernel can't
add any real random bytes to the primary pool. And although the system
time is in the primary pool, the kernel does not consider it real random
data, so you can't read from the primary pool until at least part of it
has been filled with some real randomness (interrupt timing).
Therefore all random data comes from the secondary pool, and the
kernel cannot reseed the secondary pool, because there is no real
randomness in the primary one.
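
The behavior described above can be reproduced in a small userspace toy
model. This is only a sketch under stated assumptions: the structure
toy_pool, the helper names, and the FNV-style hash are all hypothetical and
are not the data structures or the mixing function of
drivers/char/random.c. It only illustrates why a cleared secondary pool
that cannot be reseeded hands out the same first value on every boot, and
why mixing the time into it changes that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for an entropy pool; NOT the layout used in random.c. */
struct toy_pool {
	uint32_t words[4];
	unsigned int pos;          /* next word to mix into */
	unsigned int entropy_bits; /* credited entropy, 0 after clearing */
};

/* Clearing a pool zeroes all of its state. */
static void toy_clear(struct toy_pool *p)
{
	assert(p != NULL);
	memset(p, 0, sizeof(*p));
}

/* Mix a value in without crediting any entropy (rotate and xor). */
static void toy_mix(struct toy_pool *p, uint32_t v)
{
	uint32_t w;

	assert(p != NULL);
	w = p->words[p->pos];
	w = (w << 7) | (w >> 25);
	p->words[p->pos] = w ^ v;
	p->pos = (p->pos + 1) % 4;
}

/* Hash the pool into one output word, then stir the pool. */
static uint32_t toy_extract(struct toy_pool *p)
{
	uint32_t h = 2166136261u;
	size_t i;

	assert(p != NULL);
	for (i = 0; i < 4; i++) {
		h ^= p->words[i];
		h *= 16777619u;
	}
	toy_mix(p, h);
	return h;
}

/* Reseed dst from src only if src has credited entropy; 1 on success. */
static int toy_reseed(struct toy_pool *dst, struct toy_pool *src)
{
	assert(dst != NULL && src != NULL);
	if (src->entropy_bits == 0)
		return 0;
	toy_mix(dst, toy_extract(src));
	src->entropy_bits = 0;
	return 1;
}

/* First value handed out after a simulated boot at boot_time. */
static uint32_t toy_first_output(int seed_secondary, uint32_t boot_time)
{
	struct toy_pool primary, secondary;

	toy_clear(&primary);
	toy_clear(&secondary);
	toy_mix(&primary, boot_time);           /* time goes in, no credit */
	if (seed_secondary)
		toy_mix(&secondary, boot_time); /* the proposed extra seeding */
	toy_reseed(&secondary, &primary);       /* fails: nothing credited */
	return toy_extract(&secondary);
}
```

In this model toy_first_output(0, t) is the same for every t, while
toy_first_output(1, t) varies with t. Two machines that boot with the same
t still get the same value.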

The solution is simple: initialize not just the primary, but also the
secondary pool with the system time. My patch worked for me with
2.6.8-rc2, but it has not been tested for long.

--- linux-2.6.8-rc2.orig/drivers/char/random.c	2004-06-16 07:18:57.000000000 +0200
+++ linux-2.6.8-rc2/drivers/char/random.c	2004-07-22 21:06:28.000000000 +0200
@@ -1537,6 +1537,7 @@
 	clear_entropy_store(random_state);
 	clear_entropy_store(sec_random_state);
 	init_std_data(random_state);
+	init_std_data(sec_random_state);
 #ifdef CONFIG_SYSCTL
 	sysctl_init_random(random_state);
 #endif

bye, 
	Cus

* Re: [PATCH] get_random_bytes returns the same on every boot
  2004-07-22 22:52 [PATCH] get_random_bytes returns the same on every boot Balint Marton
@ 2004-07-22 23:28 ` Patrick McHardy
  2004-08-02 22:42 ` David Wagner
From: Patrick McHardy @ 2004-07-22 23:28 UTC
  To: Balint Marton; +Cc: linux-kernel, netdev

Balint Marton wrote:
> Hi, 
> 
> At boot time, get_random_bytes always returns the same random data, as if
> there were a constant random seed. For example, if I use the kernel-level
> IP autoconfiguration with DHCP, the kernel always creates a DHCP request
> packet with the same transaction ID. (If you have more than one
> computer, and they boot at the same time, this is a big
> problem.)
> 
> This happens because only the primary entropy pool is initialized with
> the system time, in the function rand_initialize. The secondary pool is
> only cleared. At this early stage of booting, there is usually no user
> interaction and there are no usable disk interrupts, so the kernel can't
> add any real random bytes to the primary pool. And although the system
> time is in the primary pool, the kernel does not consider it real random
> data, so you can't read from the primary pool until at least part of it
> has been filled with some real randomness (interrupt timing).
> Therefore all random data comes from the secondary pool, and the
> kernel cannot reseed the secondary pool, because there is no real
> randomness in the primary one.
> 
> The solution is simple: initialize not just the primary, but also the
> secondary pool with the system time. My patch worked for me with
> 2.6.8-rc2, but it has not been tested for long.

Many network hashes use get_random_bytes() to initialize a secret
value on first use, to avoid attacks on the hash function.
I assume that if DHCP can get bad random data, they can too. Is this
patch enough to prevent get_random_bytes() from returning predictable
data at boot time?

Regards
Patrick

* RE: [PATCH] get_random_bytes returns the same on every boot
@ 2004-07-26 13:57 Eble, Dan
  2004-07-26 19:31 ` Balint Marton
  2004-07-27 18:01 ` Balint Marton
From: Eble, Dan @ 2004-07-26 13:57 UTC
  To: Balint Marton; +Cc: linux-kernel, netdev

Balint Marton wrote:
> At boot time, get_random_bytes always returns the same 
> random data, as if there were a constant random seed.
> [...] packet with always the same transaction ID. (If you have
> more than one computer, and they are booting at the
> same time, then this is a big problem)

If many systems are booting at the same time, is seeding with the system
time really an appropriate solution?  Shouldn't some system-specific
value also contribute to the randomization?

* RE: [PATCH] get_random_bytes returns the same on every boot
  2004-07-26 13:57 Eble, Dan
@ 2004-07-26 19:31 ` Balint Marton
  2004-07-27 18:01 ` Balint Marton
From: Balint Marton @ 2004-07-26 19:31 UTC
  To: Eble, Dan; +Cc: linux-kernel, netdev

On Mon, 26 Jul 2004, Eble, Dan wrote:
> If many systems are booting at the same time, is seeding with the system
> time really an appropriate solution?  Shouldn't some system-specific
> value also contribute to the randomization?

Yes, I agree, it would be nicer if we could also use some
system-specific data for the seeding, but I don't know whether any
such data is available during the initialization of the random module.
For example, we could use the MAC address of a network device, but unless
I am mistaken, the initialization of network devices takes place after
the random driver init.

By the way, I ran a small test with 40 computers. They had identical
hardware, and all of them had a synchronized system
clock. I turned them on by Wake-on-LAN at exactly the same time. All of
them used the kernel-level IP autoconfig, all of them got the right IP
address, and I didn't find a single DHCPNAK line in the dhcpd logfile.

Conclusion: Although using some system-specific data and the clock would 
be nicer, the system time alone also does the right thing dependably.

bye,
Cus

* Re: [PATCH] get_random_bytes returns the same on every boot
       [not found] <2kUHO-6hJ-15@gated-at.bofh.it>
@ 2004-07-27 17:43 ` Andi Kleen
  2004-07-27 19:25   ` Balint Marton
From: Andi Kleen @ 2004-07-27 17:43 UTC
  To: Balint Marton; +Cc: linux-kernel

Balint Marton <cus@fazekas.hu> writes:
> Therefore all random data will come from the secondary pool, and the
> kernel cannot reseed the secondary pool, because there is no real 
> randomness in the primary one.
>
> The solution is simple: Initialize not just the primary, but also the 
> secondary pool with the system time. My patch worked for me with 
> 2.6.8-rc2, but it was not tested too long. 

That is still an easily predictable value and may not even be
unique when lots of systems are powered up at the same time
(e.g. after a power failure).

It would be better to use the hardware random generators that
are available in some southbridges and some CPUs now. I did a patch
a long time ago to automatically seed random from the intel/amd
random driver. Maybe that would be a better solution here? 

Also BTW your problem presents a strong case why compiling in
DHCP probes is bad and such stuff should run from initrd/initramfs.

-Andi


* RE: [PATCH] get_random_bytes returns the same on every boot
  2004-07-26 13:57 Eble, Dan
  2004-07-26 19:31 ` Balint Marton
@ 2004-07-27 18:01 ` Balint Marton
From: Balint Marton @ 2004-07-27 18:01 UTC
  To: Eble, Dan; +Cc: linux-kernel, netdev

Hi, 

In my previous email, I wrote about a test with 40 computers.
Today I repeated the test, and although every computer got the right IP
address, there were at least 7 DHCPNAK lines in the dhcpd logfile.
So the system time alone is not as good as it looked.

Cus

* Re: [PATCH] get_random_bytes returns the same on every boot
  2004-07-27 17:43 ` Andi Kleen
@ 2004-07-27 19:25   ` Balint Marton
From: Balint Marton @ 2004-07-27 19:25 UTC
  To: Andi Kleen; +Cc: linux-kernel

On Tue, 27 Jul 2004, Andi Kleen wrote:
> That is still an easily predictable value and may not even be
> unique when lots of systems are powered up at the same time
> (e.g. after a power failure) 
Yes, my patch is not an ultimate solution, rather a step in the right
direction :)
 
> Also BTW your problem presents a strong case why compiling in
> DHCP probes is bad and such stuff should run from initrd/initramfs.
I wouldn't say it's bad; it is just not supported yet under all
circumstances. But DHCP support could be improved, for example, by adding
the MAC address as entropy bytes to the secondary pool. Since we don't
add bytes to the primary pool, we don't harm things that really require
secure random data. Any opinions about this workaround?
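
A rough illustration of that idea, as a userspace sketch rather than kernel
code: hash the boot time together with a MAC address to get a per-machine
seed. boot_seed and fnv1a_update are hypothetical names, and FNV-1a is used
only because it is short; it is not what random.c uses to mix input. A MAC
address is not secret, so this only separates machines from each other; it
does not make the output unguessable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a over a byte buffer, continuing from hash state h. */
static uint32_t fnv1a_update(uint32_t h, const unsigned char *data, size_t len)
{
	size_t i;

	assert(data != NULL || len == 0);
	for (i = 0; i < len; i++) {
		h ^= data[i];
		h *= 16777619u;   /* FNV prime */
	}
	return h;
}

/* Hypothetical helper: derive a seed from the clock and a 6-byte MAC. */
static uint32_t boot_seed(uint32_t boot_time, const unsigned char mac[6])
{
	unsigned char t[4];
	uint32_t h = 2166136261u; /* FNV offset basis */

	assert(mac != NULL);
	/* Serialize the time byte by byte so the result does not depend
	 * on host endianness. */
	t[0] = (unsigned char)(boot_time & 0xff);
	t[1] = (unsigned char)((boot_time >> 8) & 0xff);
	t[2] = (unsigned char)((boot_time >> 16) & 0xff);
	t[3] = (unsigned char)((boot_time >> 24) & 0xff);

	h = fnv1a_update(h, t, sizeof(t));
	h = fnv1a_update(h, mac, 6);
	return h;
}
```

Two machines that boot in the same second but have different MAC addresses
then start from different seeds, which is all the DHCP transaction ID case
needs.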

Cus

* Re: [PATCH] get_random_bytes returns the same on every boot
  2004-07-22 22:52 [PATCH] get_random_bytes returns the same on every boot Balint Marton
  2004-07-22 23:28 ` Patrick McHardy
@ 2004-08-02 22:42 ` David Wagner
  2004-08-03 17:47   ` Jack Lloyd
From: David Wagner @ 2004-08-02 22:42 UTC
  To: linux-kernel

Balint Marton  wrote:
>At boot time, get_random_bytes always returns the same random data, as if
>there were a constant random seed.  [This is because no entropy is
>available yet.]

Are there any consequences of this for security?  A number of network
functions call get_random_bytes() to get unguessable numbers; if those
numbers are guessable, security might be compromised.  Note that most init
scripts save randomness state from the last reboot and fill it into the
entropy pool after boot, but before then any callers to get_random_bytes()
might be vulnerable.  Has anyone ever audited all places that call
get_random_bytes() to see if any of them might pose a security exposure
during the window of time between boot and execution of init scripts?
For instance, are TCP sequence numbers, SYN cookies, etc. vulnerable?

(Needless to say, seeding the pool with just the time of day and the
system hostname is not enough to defend against such attacks.)
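
To make the point concrete, here is a hypothetical sketch of the attack,
assuming a toy generator whose first output depends only on a time-of-day
seed (the hostname is treated as already known to the attacker).
xid_from_seed and recover_seed are made-up names, and the xorshift step
merely stands in for whatever the real generator does. Given one observed
output and a rough idea of when the machine booted, trying every second in
the window recovers the seed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy generator: the first output is a fixed function of the seed. */
static uint32_t xid_from_seed(uint32_t seed)
{
	uint32_t x = seed ^ 0x9e3779b9u;

	if (x == 0)
		x = 1;            /* xorshift must not start from zero */
	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	return x;
}

/*
 * Try every candidate boot time in [earliest, latest]. Returns 1 and
 * stores the matching seed if one reproduces the observed output,
 * otherwise returns 0.
 */
static int recover_seed(uint32_t observed, uint32_t earliest,
			uint32_t latest, uint32_t *seed_out)
{
	uint32_t t;

	assert(seed_out != NULL);
	assert(earliest <= latest);
	for (t = earliest; ; t++) {
		if (xid_from_seed(t) == observed) {
			*seed_out = t;
			return 1;
		}
		if (t == latest)  /* checked here to avoid wrap-around */
			break;
	}
	return 0;
}
```

A window of one hour on either side of the boot is only 7201 candidates,
so the search is trivial.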

* Re: [PATCH] get_random_bytes returns the same on every boot
  2004-08-02 22:42 ` David Wagner
@ 2004-08-03 17:47   ` Jack Lloyd
  2004-08-03 20:53     ` Jesper Juhl
From: Jack Lloyd @ 2004-08-03 17:47 UTC
  To: linux-kernel

On Mon, Aug 02, 2004 at 10:42:17PM +0000, David Wagner wrote:
> Balint Marton  wrote:
> >At boot time, get_random_bytes always returns the same random data, as if
> >there were a constant random seed.  [This is because no entropy is
> >available yet.]
> 
> Are there any consequences of this for security?  A number of network
> functions call get_random_bytes() to get unguessable numbers; if those
> numbers are guessable, security might be compromised.  Note that most init
> scripts save randomness state from the last reboot and fill it into the
> entropy pool after boot, but before then any callers to get_random_bytes()
> might be vulnerable.  Has anyone ever audited all places that call
> get_random_bytes() to see if any of them might pose a security exposure
> during the window of time between boot and execution of init scripts?
> For instance, are TCP sequence numbers, SYN cookies, etc. vulnerable?

If the init scripts haven't run, then most likely your machine doesn't have an
IP address configured anyway. On some distros the network is configured before
the saved entropy is added to the pool, but most servers don't get started
until afterward.

> (Needless to say, seeding the pool with just the time of day and the
> system hostname is not enough to defend against such attacks.)

I can't think of much else the machine could be adding at the point before init
is created. The TSC isn't going to be very unpredictable, since the machine just
booted, but it might have a few bits of entropy. Hardware serial numbers?
Fixed, and largely easy to get hold of. I'm out of ideas.

Hmmm, it just occurred to me that you could include process execution details
(owner, pathname, pid/ppid, timestamp) in the entropy pool, sort of like a
Cryptlib generator but in kernel space. But again, that isn't of much use
before the kernel creates init.

-Jack

* Re: [PATCH] get_random_bytes returns the same on every boot
  2004-08-03 17:47   ` Jack Lloyd
@ 2004-08-03 20:53     ` Jesper Juhl
From: Jesper Juhl @ 2004-08-03 20:53 UTC
  To: Jack Lloyd; +Cc: linux-kernel

On Tue, 3 Aug 2004, Jack Lloyd wrote:

> On Mon, Aug 02, 2004 at 10:42:17PM +0000, David Wagner wrote:
> > Balint Marton  wrote:
> > >At boot time, get_random_bytes always returns the same random data, as if
> > >there were a constant random seed.  [This is because no entropy is
> > >available yet.]
> > 
> > Are there any consequences of this for security?  A number of network
> > functions call get_random_bytes() to get unguessable numbers; if those
> > numbers are guessable, security might be compromised.  Note that most init
> > scripts save randomness state from the last reboot and fill it into the
> > entropy pool after boot, but before then any callers to get_random_bytes()
> > might be vulnerable.  Has anyone ever audited all places that call
> > get_random_bytes() to see if any of them might pose a security exposure
> > during the window of time between boot and execution of init scripts?
> > For instance, are TCP sequence numbers, SYN cookies, etc. vulnerable?
> 
> If the init scripts haven't run, then most likely your machine doesn't have an
> IP address configured anyway. On some distros the network is configured before
> the saved entropy is added to the pool, but most servers don't get started
> until afterward.
> 
> > (Needless to say, seeding the pool with just the time of day and the
> > system hostname is not enough to defend against such attacks.)
> 
> I can't think of much else the machine could be adding at the point before init
> is created. The TSC isn't going to be very unpredictable, since the machine just
> booted, but it might have a few bits of entropy. Hardware serial numbers?
> Fixed, and largely easy to get ahold of. I'm out of ideas.
> 
> Hmmm, it just occurred to me that you could include process execution details
> (owner, pathname, pid/ppid, timestamp) into the entropy pool, sort of like a
> Cryptlib generator but in kernel space. But again, that isn't of much use
> before the kernel creates init.
> 
First of all, please excuse me if I don't make any sense at all - I really 
don't know anything about generating good random numbers and good sources 
of entropy, but I had a few thoughts while reading this thread and just 
thought I'd mention them in case there was actually something useful in 
them.

How about using some of the following as early sources of entropy (keeping 
in mind that this early on we just want something a little less 
predictable since completely unpredictable is probably impossible) : 

As you yourself mentioned: hardware serial numbers.

The clock of course, although the predictability of that was what started 
this thread, it's still a source.

The time the first (or first few) interrupts happen (any interrupt)? I'm 
guessing there must be *some* interrupts that the hardware generates for 
all boxes (and even if there's not for a few boxes, then that in itself 
is data), and if the hardware differs from box to box, then this would 
also differ between boxes.

The RAM size could provide a number that, although easily discovered,
would often differ from box to box. The same goes for CPU clock frequency
(which I've often observed differs slightly even amongst supposedly
identical CPUs).

How about the BIOS version? BIOS manufacturer string or similar info that 
could potentially differ between boxes?

What about the value of some semi-randomly picked memory locations? Or is 
all memory initialized to a known value? (could it be read before being 
initialized?).

How about the build time/build number of the kernel? Again, easily
discoverable, but if the main problem is boxes booting at the same time
getting the same initial random numbers, then this could differentiate
them at least a bit if they had different kernels.


If a few (or all) of the above were used to provide a few bits of
entropy each, wouldn't that combined give enough difference between boxes
and provide enough sources to make the final result (at least a bit) less
easily guessed?


There is of course a big possibility that I'm completely wrong, but I just 
wanted to mention my thoughts in case they would be of use.


--
Jesper Juhl <juhl-lkml@dif.dk>
