* [uml-devel] Couldn't obtain random bytes in sshd - problem in RAND_poll?
@ 2008-08-06 9:08 Stanislav Meduna
2008-08-06 11:11 ` Tomas Mraz
0 siblings, 1 reply; 5+ messages in thread
From: Stanislav Meduna @ 2008-08-06 9:08 UTC
To: openssl-dev; +Cc: user-mode-linux-devel
Hi,
A few other users and I are seeing sshd fail with
Couldn't obtain random bytes (error 604389476)
and other SSL-related applications fail randomly
in User Mode Linux guests. I suspect a problem
in OpenSSL that got triggered by some change in UML.
I reviewed the RAND_poll function in rand_unix.c
(statically, no time for building a debug version now)
and have the following suspicions:
===
For Linux:
int r; ... this has random bytes from stack
...
if (poll(&pset, 1, usec / 1000) < 0)
    usec = 0;
else
    try_read = (pset.revents & POLLIN) != 0;
... Let's say that the poll timed out (i.e. returned 0)
try_read remains 0, r still has garbage
while ((r > 0 || (errno == EINTR || errno == EAGAIN)) &&
... Let's say that the garbage was negative. We are out of
the loop and errno has bogus data (a successful or timed-out
poll did not set anything)
=== For other Unices there's an additional problem:
If select() succeeds immediately, it can leave the timeout
argument unchanged, because no time was spent sleeping
(which is IMHO fully legal if it finds the bytes immediately).
If the read then does not get all the needed bytes, the code
if (usec == 10*1000)
    usec = 0;
kicks in and we are out of the loop again.
Suggested changes:
- add
r = -1;
inside the do loop after the int try_read = 0;
- change
if (usec == 10*1000)
into
if (r < 0 && usec == 10*1000)
Regards
--
Stano
-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
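To make the first suggested change in the message above concrete, here is a minimal sketch of a poll-then-read loop in which r has a defined value on every iteration and errno is consulted only after a call has actually failed. This is not the OpenSSL code: read_with_timeout and its parameters are hypothetical names chosen for illustration.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper, not OpenSSL code: read up to `want` bytes from `fd`,
 * giving up once poll() times out after `timeout_ms` milliseconds.
 * Returns the number of bytes read (possibly 0), or -1 on a hard error.
 */
static long read_with_timeout(int fd, unsigned char *buf, size_t want,
                              int timeout_ms)
{
    size_t got = 0;

    while (got < want) {
        struct pollfd pset;
        ssize_t r = -1;        /* defined value on every iteration */
        int ready;

        pset.fd = fd;
        pset.events = POLLIN;
        pset.revents = 0;

        ready = poll(&pset, 1, timeout_ms);
        if (ready < 0) {
            if (errno == EINTR)
                continue;      /* interrupted: poll again */
            return -1;
        }
        if (ready == 0)
            break;             /* timed out: errno is not consulted here */

        r = read(fd, buf + got, want - got);
        if (r < 0) {
            if (errno == EINTR || errno == EAGAIN)
                continue;      /* transient: errno is valid after a failure */
            return -1;
        }
        if (r == 0)
            break;             /* end of file */
        got += (size_t)r;
    }
    return (long)got;
}
```

Because the loop decides what to do from the return values of poll() and read(), a stale errno left over from an earlier call cannot keep it spinning or end it early. Note that restarting poll() after EINTR also restarts the timeout, which is a simplification of this sketch.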
* Re: [uml-devel] Couldn't obtain random bytes in sshd - problem in RAND_poll?
2008-08-06 9:08 [uml-devel] Couldn't obtain random bytes in sshd - problem in RAND_poll? Stanislav Meduna
@ 2008-08-06 11:11 ` Tomas Mraz
2008-08-06 12:11 ` Stanislav Meduna
0 siblings, 1 reply; 5+ messages in thread
From: Tomas Mraz @ 2008-08-06 11:11 UTC
To: openssl-dev; +Cc: user-mode-linux-devel

On Wed, 2008-08-06 at 11:08 +0200, Stanislav Meduna wrote:
> Hi,
>
> I and a few other users are seeing sshd failing with
> Couldn't obtain random bytes (error 604389476)
> and other ssl-related application failing randomly
> in user mode linux guests and I suspect a problem
> in openssl that got triggered by some change in UML.
>
> I reviewed the RAND_poll function in rand_unix.c
> (statically, no time for building a debug version now)
> and have following suspicions:
>
> ===
> For Linux:
>
> int r; ... this has random bytes from stack
> ...
>
> if (poll(&pset, 1, usec / 1000) < 0)
>     usec = 0;
> else
>     try_read = (pset.revents & POLLIN) != 0;
>
> ... Let's say that the poll timed out (i.e. returned 0)
> try_read remains 0, r still has garbage

r is set to -1 in the else branch of the if (try_read) statement.

> while ((r > 0 || (errno == EINTR || errno == EAGAIN)) &&
>
> ... Let's say that the garbage was negative. We are out of
> the loop and errno has bogus data (successfull/timed out
> poll did not set anything)

errno has a garbage value - this should be fixed by initializing errno to
0 before the poll/select calls. But in the worst case it means the code
will wait until it is able to read some data from the random device
(which was not the intent).

>
> === For other Unices there's additional problem:
>
> If the select select's successfully and immediately, it can
> leave the time not slept unchanged in the time argument
> (which is IMHO fully legal, if it finds the bytes immediately).
> If the read then does not get all the needed bytes, the code
> if (usec == 10*1000)
>     usec = 0;
> kicks in and we are out of the loop again.

The problem is not in RAND_poll() timing out - this is fully
intentional, the function should time out after 10ms if the random device
blocks the read.

The default is to try /dev/urandom, /dev/random, /dev/srandom in this
order. So if you, for example, do not have /dev/urandom and have just the
blocking /dev/random, it is perfectly possible that RAND_poll returns
an error. The other possibility is that /dev/urandom is broken in UML
and blocks if not enough entropy is available.

--
Tomas Mraz
No matter how far down the wrong road you've gone, turn back.
Turkish proverb
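The device fallback order described in the reply above can be sketched roughly as follows. This only illustrates the "try each device in turn" idea and is not taken from rand_unix.c: open_first_device, random_devices and the open() flags are assumptions made for the example.

```c
#include <fcntl.h>
#include <stddef.h>

/* Device order as described in the reply above. */
static const char *const random_devices[] = {
    "/dev/urandom", "/dev/random", "/dev/srandom"
};

/*
 * Hypothetical helper: open the first entry of `paths` that can be opened
 * non-blocking. Stores its index in *which (if non-NULL) and returns the
 * file descriptor, or -1 if none of the paths could be opened.
 */
static int open_first_device(const char *const *paths, size_t n, size_t *which)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int fd = open(paths[i], O_RDONLY | O_NONBLOCK | O_NOCTTY);
        if (fd >= 0) {
            if (which != NULL)
                *which = i;
            return fd;
        }
    }
    return -1;
}
```

A caller would pass random_devices and 3. If only the blocking /dev/random can be opened, a subsequent read bounded by a 10ms timeout may well return no data, which is the error case the reply describes.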
* Re: [uml-devel] Couldn't obtain random bytes in sshd - problem in RAND_poll?
2008-08-06 11:11 ` Tomas Mraz
@ 2008-08-06 12:11 ` Stanislav Meduna
2008-08-06 12:55 ` David Schwartz
2008-08-07 8:51 ` Damien Miller
0 siblings, 2 replies; 5+ messages in thread
From: Stanislav Meduna @ 2008-08-06 12:11 UTC
To: openssl-dev; +Cc: user-mode-linux-devel

Tomas Mraz wrote:
> errno has garbage value - this should be fixed by initializing errno to
> 0 before the poll/select calls.

Actually, it would have to be reset after poll returns with a timeout -
a successful syscall is free to set errno to whatever value it wants,
the value only has to be meaningful after an error (I did have
this problem a few times).

> The problem is not in the RAND_poll() timeouting - this is fully
> intentional, the function should timeout after 10ms if the random device
> blocks read.

Ah, ok.. So what should the applications calling openssl actually
do if this happens? Now ssh/apache/... simply exit,
which is bad (it left me without access to a remote
box...). I assume they are not calling the function directly,
instead they are using some of openssl's other functions.

In the current situation anyone who actually wants to
block until the entropy is available is simply out of luck :(

> try /dev/urandom, /dev/random, /dev/srandom in this order. So if you for
> example do not have /dev/urandom and have just the blocking /dev/random,
> it is perfectly possible that the RAND_poll returns error.

Both the UML guest and the host have /dev/urandom. I straced an ssh,
it opens /dev/urandom first, so this should be OK too.

> The other possibility is that the /dev/urandom is broken
> in UML and blocks if not enough entropy is available.

Good.. let's try it:

===
#include <unistd.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>

int main(void)
{
    /* Non-blocking open, so a stalled device shows up in poll/read */
    int fd = open("/dev/urandom", O_RDONLY|O_NONBLOCK|O_NOCTTY);
    int i;
    int errpoll=0, blocked=0, rdbytes=0, errread=0, nullread=0;

    if (fd < 0)
    {
        perror("open /dev/urandom");
        return 1;
    }

    for (i=0; i < 1000000; ++i)
    {
        struct pollfd pset;
        int r;
        char tmp[32];

        pset.fd = fd;
        pset.events = POLLIN;
        pset.revents = 0;

        /* Wait at most 10 ms for the device to become readable */
        r = poll(&pset, 1, 10);
        if (r > 0)
        {
            if ((pset.revents & POLLIN) != 0)
            {
                r = read(fd, tmp, sizeof(tmp));
                if (r < 0)
                    errread++;
                else if (r==0)
                    nullread++;
                else
                    rdbytes += r;
            }
            else
            {
                printf("poll returned %d, but POLLIN is false (%x)\n", r, pset.revents);
            }
        }
        else if (r == 0)
            blocked++;    /* poll timed out, i.e. the device blocked */
        else
            errpoll++;
    }

    printf("got %d bytes of entropy, poll err %d, blocked %d times, err read: %d, null read: %d\n",
           rdbytes, errpoll, blocked, errread, nullread);
    close(fd);
    return 0;
}
===

got 3200000 bytes of entropy, poll err 0, blocked 0 times, err read: 0, null read: 0

Tried many many times, even two running at the same time
or poll timeout set to zero, not one instance of blocking
even with
od -x /dev/urandom
and
od -x /dev/random
running simultaneously (the second one blocks, of course).

Hmmmm.. what the #$%# is happening here.. more ideas?

--
Stano
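The point about errno made in the message above (it is only meaningful after a call has failed) can be illustrated with a small wrapper that classifies the outcome of poll() from its return value alone. This is a sketch under stated assumptions, not a proposed patch: wait_readable and the wait_result enum are hypothetical names.

```c
#include <errno.h>
#include <poll.h>

enum wait_result { WAIT_READY, WAIT_TIMEOUT, WAIT_ERROR };

/*
 * Hypothetical wrapper: wait until `fd` is readable or `timeout_ms` expires.
 * errno is copied to *err only when poll() actually failed; on success or
 * timeout *err is set to 0, whatever errno happens to contain.
 */
static enum wait_result wait_readable(int fd, int timeout_ms, int *err)
{
    struct pollfd pset;
    int r;

    pset.fd = fd;
    pset.events = POLLIN;
    pset.revents = 0;

    do {
        r = poll(&pset, 1, timeout_ms);
    } while (r < 0 && errno == EINTR);   /* retry if interrupted by a signal */

    if (r < 0) {
        *err = errno;                    /* valid: the call just failed */
        return WAIT_ERROR;
    }
    *err = 0;                            /* ignore any stale errno value */
    return r == 0 ? WAIT_TIMEOUT : WAIT_READY;
}
```

With this shape the caller never tests errno after a timeout, so neither clearing errno before the call nor clearing it afterwards is needed.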
* Re: [uml-devel] Couldn't obtain random bytes in sshd - problem in RAND_poll?
2008-08-06 12:11 ` Stanislav Meduna
@ 2008-08-06 12:55 ` David Schwartz
2008-08-07 8:51 ` Damien Miller
1 sibling, 0 replies; 5+ messages in thread
From: David Schwartz @ 2008-08-06 12:55 UTC
To: openssl-dev; +Cc: user-mode-linux-devel

> Tried many many times, even two running at the same time
> or poll timeout set to zero, not one instance of blocking
> even with
> od -x /dev/urandom
> and
> od -x /dev/random
> running simultaneously (the second one blocks, of course).
>
>
> Hmmmm.. what the #$%# is happening here.. more ideas?
>
> --
> Stano

My bet is that '/dev/urandom' only blocks if it doesn't have enough entropy.
Early in the startup process, '/dev/urandom' doesn't have enough entropy,
and your application times out on it. Later on, when the system has had lots
of network activity, you log in and test '/dev/urandom'. At this point, the
system is well-seeded from the network activity. So it works great for you.

Try launching your test program automatically on boot up at the same time
you launch ssh or whatever application is failing. I bet '/dev/urandom' will
fail then.

If you have a network, one solution might be to do a few 'ping's or
'nslookup's to seed the entropy pool. You can also keep an entropy pool on
disk, saving it on shutdown and loading it on startup.

Or I could be completely wrong.

DS
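The "entropy pool on disk" idea from the message above amounts to copying a block of random bytes to a file at shutdown and writing it back into the random device at startup. A minimal sketch, assuming a hypothetical helper copy_bytes and a seed file location that the system would have to choose itself:

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy up to `n` bytes from file `src` to file `dst`.
 * Returns the number of bytes copied, or -1 on error.
 */
static long copy_bytes(const char *src, const char *dst, size_t n)
{
    unsigned char buf[512];
    long total = 0;
    int in, out;

    in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0) {
        close(in);
        return -1;
    }
    while (n > 0) {
        size_t chunk = n < sizeof buf ? n : sizeof buf;
        ssize_t r = read(in, buf, chunk);

        if (r < 0) {
            total = -1;        /* read error */
            break;
        }
        if (r == 0)
            break;             /* source exhausted */
        if (write(out, buf, (size_t)r) != r) {
            total = -1;        /* write error or short write */
            break;
        }
        total += r;
        n -= (size_t)r;
    }
    close(in);
    close(out);
    return total;
}
```

At shutdown one would call something like copy_bytes("/dev/urandom", seed_path, 512), and at startup copy_bytes(seed_path, "/dev/urandom", 512), where seed_path is a placeholder. On Linux, data written to /dev/urandom is mixed into the pool but is not credited as entropy, so this helps the quality of early output rather than the kernel's entropy estimate.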
* Re: [uml-devel] Couldn't obtain random bytes in sshd - problem in RAND_poll?
2008-08-06 12:11 ` Stanislav Meduna
2008-08-06 12:55 ` David Schwartz
@ 2008-08-07 8:51 ` Damien Miller
1 sibling, 0 replies; 5+ messages in thread
From: Damien Miller @ 2008-08-07 8:51 UTC
To: openssl-dev; +Cc: user-mode-linux-devel

On Wed, 6 Aug 2008, Stanislav Meduna wrote:
> So what should the applications calling openssl actually
> do if this happens? Now the ssh/apache/... simply exit,
> which is bad (it left me without an access to a remote
> box...).

Exiting is the best behaviour - continuing without a good source of
randomness may compromise cryptographic protocols and even long-term
private keys (e.g. if DSA is used).

-d
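The "exit rather than continue" behaviour argued for in the message above corresponds to a fail-closed helper: either the full number of random bytes is delivered or the caller gets an error and must stop. The sketch below is illustrative only and is not how sshd or OpenSSL implement it; fill_random is a hypothetical name.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: fill `buf` with exactly `n` bytes read from `path`.
 * Returns 0 on success, -1 if the full amount could not be obtained.
 */
static int fill_random(const char *path, unsigned char *buf, size_t n)
{
    size_t got = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    while (got < n) {
        ssize_t r = read(fd, buf + got, n - got);

        if (r < 0 && errno == EINTR)
            continue;          /* interrupted: retry the read */
        if (r <= 0)
            break;             /* error or EOF: a short read is a failure */
        got += (size_t)r;
    }
    close(fd);
    return got == n ? 0 : -1;
}
```

A caller generating key material would treat a -1 return as fatal, for example by logging the problem and calling exit(1), instead of proceeding with a partially filled buffer.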
end of thread

Thread overview: 5+ messages
2008-08-06 9:08 [uml-devel] Couldn't obtain random bytes in sshd - problem in RAND_poll? Stanislav Meduna
2008-08-06 11:11 ` Tomas Mraz
2008-08-06 12:11   ` Stanislav Meduna
2008-08-06 12:55     ` David Schwartz
2008-08-07 8:51     ` Damien Miller