* [uml-devel] Couldn't obtain random bytes in sshd - problem in RAND_poll?
From: Stanislav Meduna @ 2008-08-06  9:08 UTC
  To: openssl-dev; +Cc: user-mode-linux-devel

Hi,

I and a few other users are seeing sshd failing with
   Couldn't obtain random bytes (error 604389476)
and other ssl-related applications failing randomly
in user mode linux guests and I suspect a problem
in openssl that got triggered by some change in UML.

I reviewed the RAND_poll function in rand_unix.c
(statically, no time for building a debug version now)
and have the following suspicions:

===
For Linux:

int r;   ... this has random bytes from stack
...

if (poll(&pset, 1, usec / 1000) < 0)
   usec = 0;
else
   try_read = (pset.revents & POLLIN) != 0;

... Let's say that the poll timed out (i.e. returned 0)
     try_read remains 0, r still has garbage

while ((r > 0 || (errno == EINTR || errno == EAGAIN)) &&

... Let's say that the garbage was negative. We are out of
     the loop and errno has bogus data (a successful or timed-out
     poll did not set anything)


=== For other Unices there's additional problem:

If select() succeeds immediately, it can leave the remaining
time in the timeout argument unchanged at the full 10 ms
(which is IMHO fully legal if it finds the bytes immediately).
If the read then does not get all the needed bytes, the code
   if (usec == 10*1000)
     usec = 0;
kicks in and we are out of the loop again.


Suggested changes:

- add
     r = -1;
   inside the do loop after the int try_read = 0;

- change
     if (usec == 10*1000)
   into
     if (r < 0 && usec == 10*1000)


Regards
-- 
                                     Stano

-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel


* Re: [uml-devel] Couldn't obtain random bytes in sshd - problem in RAND_poll?
From: Tomas Mraz @ 2008-08-06 11:11 UTC
  To: openssl-dev; +Cc: user-mode-linux-devel

On Wed, 2008-08-06 at 11:08 +0200, Stanislav Meduna wrote:
> Hi,
> 
> I and a few other users are seeing sshd failing with
>    Couldn't obtain random bytes (error 604389476)
> and other ssl-related applications failing randomly
> in user mode linux guests and I suspect a problem
> in openssl that got triggered by some change in UML.
> 
> I reviewed the RAND_poll function in rand_unix.c
> (statically, no time for building a debug version now)
> and have the following suspicions:
> 
> ===
> For Linux:
> 
> int r;   ... this has random bytes from stack
> ...
> 
> if (poll(&pset, 1, usec / 1000) < 0)
>    usec = 0;
> else
>    try_read = (pset.revents & POLLIN) != 0;
> 
> ... Let's say that the poll timed out (i.e. returned 0)
>      try_read remains 0, r still has garbage

r is set to -1 in the else branch of the if (try_read) statement.

> while ((r > 0 || (errno == EINTR || errno == EAGAIN)) &&
> 
> ... Let's say that the garbage was negative. We are out of
>      the loop and errno has bogus data (a successful or timed-out
>      poll did not set anything)

errno has a garbage value - this should be fixed by initializing errno to
0 before the poll/select calls. But in the worst case it means the code
will wait until it is able to read some data from the random device
(which was not the intent).

> 
> === For other Unices there's additional problem:
> 
> If select() succeeds immediately, it can leave the remaining
> time in the timeout argument unchanged at the full 10 ms
> (which is IMHO fully legal if it finds the bytes immediately).
> If the read then does not get all the needed bytes, the code
>    if (usec == 10*1000)
>      usec = 0;
> kicks in and we are out of the loop again.

The problem is not RAND_poll() timing out - this is fully
intentional, the function should time out after 10ms if the random device
blocks the read. The default is to
try /dev/urandom, /dev/random, /dev/srandom in this order. So if you for
example do not have /dev/urandom and have just the blocking /dev/random,
it is perfectly possible that RAND_poll returns an error. The other
possibility is that /dev/urandom is broken in UML and blocks if not
enough entropy is available.
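
The fallback order described above can be sketched roughly as follows.
This is an illustration only, not the OpenSSL implementation: the
function seed_from_first_device and its parameters are hypothetical, and
for simplicity the timeout is applied to each poll() call rather than to
the device as a whole.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: try each path in order and return the index of the
 * first device that delivers `want` bytes, or -1 if none of them does. */
static int seed_from_first_device(const char *const *paths, size_t npaths,
                                  unsigned char *buf, size_t want,
                                  int timeout_ms)
{
    size_t i;

    assert(paths != NULL && buf != NULL);
    for (i = 0; i < npaths; i++) {
        size_t got = 0;
        int fd = open(paths[i], O_RDONLY | O_NONBLOCK | O_NOCTTY);

        if (fd < 0)
            continue;              /* device missing: try the next one */

        while (got < want) {
            struct pollfd pset;
            ssize_t r;

            pset.fd = fd;
            pset.events = POLLIN;
            pset.revents = 0;

            /* Give up on this device on timeout, error or no POLLIN. */
            if (poll(&pset, 1, timeout_ms) <= 0 ||
                (pset.revents & POLLIN) == 0)
                break;

            r = read(fd, buf + got, want - got);
            if (r <= 0)
                break;
            got += (size_t)r;
        }
        close(fd);

        if (got == want)
            return (int)i;         /* this device supplied everything */
    }
    return -1;                     /* caller must treat this as an error */
}
```

With a path list of /dev/urandom, /dev/random, /dev/srandom and a 10 ms
timeout, a return value of -1 corresponds to the case where no device
could supply the bytes in time.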

-- 
Tomas Mraz
No matter how far down the wrong road you've gone, turn back.
                                              Turkish proverb



* Re: [uml-devel] Couldn't obtain random bytes in sshd - problem in RAND_poll?
From: Stanislav Meduna @ 2008-08-06 12:11 UTC
  To: openssl-dev; +Cc: user-mode-linux-devel

Tomas Mraz wrote:

> errno has a garbage value - this should be fixed by initializing errno to
> 0 before the poll/select calls.

Actually it has to be done after the call returns with a timeout - a
successful syscall is free to set errno to whatever value it wants;
it is only after an error that the value has to be meaningful
(I have had this problem a few times).

> The problem is not RAND_poll() timing out - this is fully
> intentional, the function should time out after 10ms if the random device
> blocks the read.

Ah, ok..

So what should the applications calling openssl actually
do if this happens? Now ssh/apache/... simply exit,
which is bad (it left me without access to a remote
box...).

I assume they are not calling the method directly, instead
they are using some of the openssl's methods. In the current
situation anyone who actually wants to block until the entropy
is available is simply out of luck :(

> try /dev/urandom, /dev/random, /dev/srandom in this order. So if you for
> example do not have /dev/urandom and have just the blocking /dev/random,
> it is perfectly possible that RAND_poll returns an error.

Both UML guest and host have /dev/urandom. I straced
a ssh, it opens /dev/urandom first, so this should
be OK too.

> The other possibility is that /dev/urandom is broken
> in UML and blocks if not enough entropy is available.

Good.. let's try it:

===
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
   /* Open non-blocking so that a read cannot hang the test */
   int fd = open("/dev/urandom", O_RDONLY|O_NONBLOCK|O_NOCTTY);
   int i;
   int errpoll=0, blocked=0, rdbytes=0, errread=0, nullread=0;

   if (fd < 0)
   {
     perror("open /dev/urandom");
     return 1;
   }

   for (i=0; i < 1000000; ++i)
   {
     struct pollfd pset;
     int r;
     char tmp[32];

     pset.fd = fd;
     pset.events = POLLIN;
     pset.revents = 0;

     /* Wait up to 10 ms for the device to become readable */
     r = poll(&pset, 1, 10);
     if (r > 0)
     {
       if ((pset.revents & POLLIN) != 0)
       {
         r = (int)read(fd, tmp, sizeof(tmp));
         if (r < 0)
           errread++;
         else if (r==0)
           nullread++;
         else
           rdbytes += r;
       }
       else
       {
         printf("poll returned %d, but POLLIN is false (%x)\n", r, pset.revents);
       }
     }
     else if (r == 0)
       blocked++;    /* poll timed out */
     else
       errpoll++;
   }

   printf("got %d bytes of entropy, poll err %d, blocked %d times, err read: %d, null read: %d\n",
          rdbytes, errpoll, blocked, errread, nullread);
   close(fd);
   return 0;
}
===

got 3200000 bytes of entropy, poll err 0, blocked 0 times, err read: 0, null read: 0


Tried many many times, even two running at the same time
or poll timeout set to zero, not one instance of blocking
even with
   od -x /dev/urandom
and
   od -x /dev/random
running simultaneously (the second one blocks, of course).


Hmmmm.. what the #$%# is happening here.. more ideas?

-- 
                                    Stano


* Re: [uml-devel] Couldn't obtain random bytes in sshd - problem in RAND_poll?
From: David Schwartz @ 2008-08-06 12:55 UTC
  To: openssl-dev; +Cc: user-mode-linux-devel



> Tried many many times, even two running at the same time
> or poll timeout set to zero, not one instance of blocking
> even with
>    od -x /dev/urandom
> and
>    od -x /dev/random
> running simultaneously (the second one blocks, of course).
>
>
> Hmmmm.. what the #$%# is happening here.. more ideas?
>
> --
>                                     Stano

My bet is that '/dev/urandom' only blocks if it doesn't have enough entropy.
Early in the startup process, '/dev/urandom' doesn't have enough entropy,
and your application times out on it.

Later on, when the system has had lots of network activity, you log in and
test '/dev/urandom'. At this point, the system is well-seeded from the
network activity. So it works great for you.

Try launching your test program automatically on boot up at the same time
you launch ssh or whatever application is failing. I bet '/dev/urandom' will
fail then.

If you have a network, one solution might be to do a few 'ping's or
'nslookup's to seed the entropy pool. You can also keep an entropy pool on
disk, saving it on shutdown and loading it on startup.
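
As a rough illustration of the save/load idea, here is a small sketch.
The helper copy_bytes and the seed file path in the note below are
hypothetical, not an existing tool. Also note that on Linux, writing to
'/dev/urandom' mixes the data into the pool but does not credit any
entropy, so this helps against identical startup state more than it
raises the kernel's entropy estimate.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: copy up to n bytes from src to dst.
 * Returns the number of bytes copied, or -1 on open/write failure. */
static long copy_bytes(const char *src, const char *dst, size_t n)
{
    unsigned char buf[512];
    long total = 0;
    int in, out;

    assert(src != NULL && dst != NULL);
    in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0) {
        close(in);
        return -1;
    }

    while ((size_t)total < n) {
        size_t chunk = n - (size_t)total;
        ssize_t r;

        if (chunk > sizeof(buf))
            chunk = sizeof(buf);
        r = read(in, buf, chunk);
        if (r <= 0)
            break;                 /* end of input or read error */
        if (write(out, buf, (size_t)r) != r) {
            total = -1;            /* short or failed write */
            break;
        }
        total += r;
    }
    close(in);
    close(out);
    return total;
}
```

On startup this would be called as something like
copy_bytes("/var/lib/random-seed", "/dev/urandom", 512), and on shutdown
with the two paths swapped so that the next boot starts from a fresh seed.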

Or I could be completely wrong.

DS




* Re: [uml-devel] Couldn't obtain random bytes in sshd - problem in RAND_poll?
From: Damien Miller @ 2008-08-07  8:51 UTC
  To: openssl-dev; +Cc: user-mode-linux-devel

On Wed, 6 Aug 2008, Stanislav Meduna wrote:

> So what should the applications calling openssl actually
> do if this happens? Now ssh/apache/... simply exit,
> which is bad (it left me without access to a remote
> box...).

Exiting is the best behaviour - continuing without a good source
of randomness may compromise cryptographic protocols and even
long-term private keys (e.g. if DSA is used).

-d
